Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

is it possible to let siri monitor phone calls, and notify me when a certain trigger happens?
the specific context is that i would like to build an agent that monitors my phone call (with a customer support for example), and simiply identify whether or not im still put on hold, and notify me when im not. currently after reading the doc, i dont think its possible yet, but im so annoyed by the customer support calls that im willing to go the distance and see if theres any way.
0
0
167
Jun ’25
Detection of balls about 6-10ft Away not detecting
I used Yolo5-11 and while performing great detecting balls lets say 5-10ft away in 1920 resolution and even in 640 it really is taking toll on my app performance. When I use Create ML it outputs all in 415x which is probably the reason why it does not detect objects from far. What can I do to preserve some energy ? My model is used with about 1K pictures 200 each test and validate, and from close up and far.
0
2
247
Apr ’25
Full documentation of annotations file for Create ML
The documentation for the Create ML tool ("Building an object detector data source") mentions that there are options for using normalized values instead of pixels and also different anchor point origins ("MLBoundingBoxCoordinatesOrigin") instead of always using "center". However, the JSON format for these does not appear in any examples. Does anyone know the format for these options?
0
1
264
May ’25
MLX/Ollama Benchmarking Suite - Open Source and Free
Hi all, I spent the last few months developing an MLX/Ollama local AI Benchmarking suite for Apple Silicon, written in pure Swift and signed with an Apple Developer Certificate, open source, GPL, and free. I would love some feedback to continue development. It is the only benchmarking suite I know of that supports live power metrics and MLX natively, as well as quick exports for benchmark results, and an arena mode, Model A vs B with history. I really want this project to succeed, and have widespread use, so getting 75 stars on the github repo makes it eligible for Homebrew/Cask distribution. Github Repo
0
0
170
Feb ’26
How to Ensure Controlled and Contextual Responses Using Foundation Models ?
Hi everyone, I’m currently exploring the use of Foundation models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModel APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly: Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.) Preventing hallucinations or unrelated outputs Constraining responses based on app-specific rules, structured data, or recent interactions I’ve experimented with prompt, systemMessage, and few-shot examples to steer outputs, but even with carefully generated prompts, the model occasionally produces incorrect or out-of-scope responses. Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved? Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
0
0
136
Jul ’25
CoreML Model Conversion Help
I’m trying to follow Apple’s “WWDC24: Bring your machine learning and AI models to Apple Silicon” session to convert the Mistral-7B-Instruct-v0.2 model into a Core ML package, but I’ve run into a roadblock that I can’t seem to overcome. I’ve uploaded my full conversion script here for reference: https://pastebin.com/T7Zchzfc When I run the script, it progresses through tracing and MIL conversion but then fails at the backend_mlprogram stage with this error: https://pastebin.com/fUdEzzKM The core of the error is: ValueError: Op "keyCache_tmp" (op_type: identity) Input x="keyCache" expects list, tensor, or scalar but got state[tensor[1,32,8,2048,128,fp16]] I’ve registered my KV-cache buffers in a StatefulMistralWrapper subclass of nn.Module, matching the keyCache and valueCache state names in my ct.StateType definitions, but Core ML’s backend pass reports the state tensor as an invalid input. I’m using Core ML Tools 8.3.0 on Python 3.9.6, targeting iOS18, and forcing CPU conversion (MPS wasn’t available). Any pointers on how to satisfy the handle_unused_inputs pass or properly declare/cache state for GQA models in Core ML would be greatly appreciated! Thanks in advance for your help, Usman Khan
0
0
301
May ’25
ML contraints & Timeout clarificaitions for Message Filtering Extension
Hello everyone, I’m currently working with the Message Filtering Extension and would really appreciate some clarification around its performance and operational constraints. While the extension is extremely powerful and useful, I’ve found that some important details are either unclear or not well covered in the available documentation. There are two main areas I’m trying to understand better: Machine learning model constraints within the extension In our case, we already have an existing ML model that classifies messages (and are not dependant on Apple's built-in models). We’re evaluating whether and how it can be used inside the extension. Specifically, I’m trying to understand: Are there documented limits on the size of an ML model (e.g., maximum bundle size or model file size in MB)? What are the memory constraints for a model once loaded into memory by the extension? Under what conditions would the system terminate or “kick out” the extension due to memory or performance pressure? Message processing timeouts and execution constraints What is the timeout for processing a single received message? At what point will the OS stop waiting for the extension’s response and allow the message by default (for example, if the extension does not respond in time)? Any guidance, official references, or practical experience from Apple engineers or other developers would be greatly appreciated. Thanks in advance for your help,
0
0
262
Jan ’26
Apple OCR framework seems to be holding on to allocations every time it is called.
Environment: macOS 26.2 (Tahoe) Xcode 16.3 Apple Silicon (M4) Sandboxed Mac App Store app Description: Repeated use of VNRecognizeTextRequest causes permanent memory growth in the host process. The physical footprint increases by approximately 3-15 MB per OCR call and never returns to baseline, even after all references to the request, handler, observations, and image are released. ` private func selectAndProcessImage() { let panel = NSOpenPanel() panel.allowedContentTypes = [.image] panel.allowsMultipleSelection = false panel.canChooseDirectories = false panel.message = "Select an image for OCR processing" guard panel.runModal() == .OK, let url = panel.url else { return } selectedImageURL = url isProcessing = true recognizedText = "Processing..." // Run OCR on a background thread to keep UI responsive let workItem = DispatchWorkItem { let result = performOCR(on: url) DispatchQueue.main.async { recognizedText = result isProcessing = false } } DispatchQueue.global(qos: .userInitiated).async(execute: workItem) } private func performOCR(on url: URL) -> String { // Wrap EVERYTHING in autoreleasepool so all ObjC objects are drained immediately let resultText: String = autoreleasepool { // Load image and convert to CVPixelBuffer for explicit memory control guard let imageData = try? Data(contentsOf: url) else { return "Error: Could not read image file." } guard let nsImage = NSImage(data: imageData) else { return "Error: Could not create image from file data." } guard let cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil) else { return "Error: Could not create CGImage." } let width = cgImage.width let height = cgImage.height // Create a CVPixelBuffer from the CGImage var pixelBuffer: CVPixelBuffer? let attrs: [String: Any] = [ kCVPixelBufferCGImageCompatibilityKey as String: true, kCVPixelBufferCGBitmapContextCompatibilityKey as String: true ] let status = CVPixelBufferCreate( kCFAllocatorDefault, width, height, kCVPixelFormatType_32ARGB, attrs as CFDictionary, &pixelBuffer ) guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return "Error: Could not create CVPixelBuffer (status: \(status))." } // Draw the CGImage into the pixel buffer CVPixelBufferLockBaseAddress(buffer, []) guard let context = CGContext( data: CVPixelBufferGetBaseAddress(buffer), width: width, height: height, bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(buffer), space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue ) else { CVPixelBufferUnlockBaseAddress(buffer, []) return "Error: Could not create CGContext for pixel buffer." } context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height)) CVPixelBufferUnlockBaseAddress(buffer, []) // Run OCR let requestHandler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:]) let request = VNRecognizeTextRequest() request.recognitionLevel = .accurate request.usesLanguageCorrection = true do { try requestHandler.perform([request]) } catch { return "Error during OCR: \(error.localizedDescription)" } guard let observations = request.results, !observations.isEmpty else { return "No text found in image." } let lines = observations.compactMap { observation in observation.topCandidates(1).first?.string } // Explicitly nil out the pixel buffer before the pool drains pixelBuffer = nil return lines.joined(separator: "\n") } // Everything — Data, NSImage, CGImage, CVPixelBuffer, VN objects — released here return resultText } `
0
0
166
Feb ’26
Is Jax for Apple Silicon is still supported
Hi From https://aninterestingwebsite.com/metal/jax/ I checked all active workflows on https://github.com/jax-ml/jax and any open issues with tags Metal and seems in DEC 2025 the Jax maintainers have closed all issues citing No active development on Jax-metal and the project seems dead. We need to know how can we leverage Apple silicon for accelerated projects using popular academia library and tools . Is the JAX project still going to be supported or Apple has plans to bring something of tis own that might be platform agnostic . Thanks
0
0
160
Feb ’26
Hardware Support for Low Precision Data Types?
Hi all, I'm trying to find out if/when we can expect mxfp8/mxfp4 support on Apple Silicon. I've noticed that mlx now has casting data types, but all computation is still done in bf16. Would be great to reduce power consumption with support for these lower precision data types since edge inference is already typically done at a lower precision! Thanks in advance.
0
0
316
Nov ’25
Request: Official One-Click Local LLM Deployment for 2019 Mac Pro (7,1) Dual W6900X
I am a professional user of the 2019 Mac Pro (7,1) with dual AMD Radeon Pro W6900X MPX modules (32GB VRAM each). This hardware is designed for high-performance compute, but it is currently crippled for modern local LLM/AI workloads under Linux due to Apple's EFI/PCIe routing restrictions. Core Issue: rocminfo reports "No HIP GPUs available" when attempting to use ROCm/amdgpu on Linux Apple's custom EFI firmware blocks full initialization of professional GPU compute assets The dual W6900X GPUs have 64GB combined VRAM and high-bandwidth Infinity Fabric Link, but cannot be fully utilized for local AI inference/training My Specific Request: Apple should provide an official, one-click deployable application that enables full utilization of dual W6900X GPUs for local large language model (LLM) inference and training under Linux. This application must: Fully initialize both W6900X GPUs via HIP/ROCm, establishing valid compute contexts Bypass artificial EFI/PCIe routing restrictions that block access to professional GPU resources Provide a stable, user-friendly one-click deployment experience (similar to NVIDIA's AI Enterprise or AMD's ROCm Hub) Why This Matters: The 2019 Mac Pro is Apple's flagship professional workstation, marketed for compute-intensive workloads. Its high-cost W6900X GPUs should not be locked down for modern AI/LLM use cases. An official one-click deployment solution would demonstrate Apple's commitment to professional AI and unlock significant value for professional users. I look forward to Apple's response and a clear roadmap for enabling this critical capability. #MacPro #Linux #ROCm #LocalLLM #W6900X #CoreML
0
0
92
1w
Core-ml-on-device-llama Converting fails
I followed below url for converting Llama-3.1-8B-Instruct model but always fails even i have 64GB of free space after downloading model from huggingface. https://machinelearning.apple.com/research/core-ml-on-device-llama Also tried with other models Llama-3.1-1B-Instruct & Llama-3.1-3B-Instruct models those are converted but while doing performance test in xcode fails for all compunits. Is there any source code to run llama models in ios app.
0
0
233
Apr ’25
Various On-Device Frameworks API & ChatGPT
Posting a follow up question after the WWDC 2025 Machine Learning AI & Frameworks Group Lab on June 12. In regards to the on-device API of any of the AI frameworks (foundation model, vision framework, ect.), is there a response condition or path where the API outsources it's input to ChatGPT if the user has allowed this like Siri does? Ignore this if it's a no: is this handled behind the scenes or by the developer?
0
0
324
Jun ’25
recent JAX versions fail on Metal
Hi, I'm not sure whether this is the appropriate forum for this topic. I just followed a link from the JAX Metal plugin page https://aninterestingwebsite.com/metal/jax/ I'm writing a Python app with JAX, and recent JAX versions fail on Metal. E.g. v0.8.2 I have to downgrade JAX pretty hard to make it work: pip install jax==0.4.35 jaxlib==0.4.35 jax-metal==0.1.1 Can we get an updated release of jax-metal that would fix this issue? Here is the error I get with JAX v0.8.2: WARNING:2025-12-26 09:55:28,117:jax._src.xla_bridge:881: Platform 'METAL' is experimental and not all JAX functionality may be correctly supported! WARNING: All log messages before absl::InitializeLog() is called are written to STDERR W0000 00:00:1766771728.118004 207582 mps_client.cc:510] WARNING: JAX Apple GPU support is experimental and not all JAX functionality is correctly supported! Metal device set to: Apple M3 Max systemMemory: 36.00 GB maxCacheSize: 13.50 GB I0000 00:00:1766771728.129886 207582 service.cc:145] XLA service 0x600001fad300 initialized for platform METAL (this does not guarantee that XLA will be used). Devices: I0000 00:00:1766771728.129893 207582 service.cc:153] StreamExecutor device (0): Metal, <undefined> I0000 00:00:1766771728.130856 207582 mps_client.cc:406] Using Simple allocator. I0000 00:00:1766771728.130864 207582 mps_client.cc:384] XLA backend will use up to 28990554112 bytes on device 0 for SimpleAllocator. Traceback (most recent call last): File "<string>", line 1, in <module> import jax; print(jax.numpy.arange(10)) ~~~~~~~~~~~~~~~~^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/numpy/lax_numpy.py", line 5951, in arange return _arange(start, stop=stop, step=step, dtype=dtype, out_sharding=sharding) File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/numpy/lax_numpy.py", line 6012, in _arange return lax.broadcasted_iota(dtype, (size,), 0, out_sharding=out_sharding) ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/lax/lax.py", line 3415, in broadcasted_iota return iota_p.bind(dtype=dtype, shape=shape, ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^ dimension=dimension, sharding=out_sharding) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 633, in bind return self._true_bind(*args, **params) ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 649, in _true_bind return self.bind_with_trace(prev_trace, args, params) ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 661, in bind_with_trace return trace.process_primitive(self, args, params) ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 1210, in process_primitive return primitive.impl(*args, **params) ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/dispatch.py", line 91, in apply_primitive outs = fun(*args) jax.errors.JaxRuntimeError: UNKNOWN: -:0:0: error: unknown attribute code: 22 -:0:0: note: in bytecode version 6 produced by: StableHLO_v1.13.0 -------------------- For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these. I0000 00:00:1766771728.149951 207582 mps_client.h:209] MetalClient destroyed.
0
0
582
Dec ’25
ANE Performance for on-device Foundation model
I'm running MacOs 26 Beta 5. I noticed that I can no longer achieve 100% usage on the ANE as I could before with Apple Foundations on-device model. Has Apple activated some kind of throttling or power limiting of the ANE? I cannot get above 3w or 40% usage now since upgrading. I'm on the high power energy mode. I there an API rate limit being applied? I kave a M4 Pro mini with 64 GB of memory.
0
0
343
Aug ’25
Gazetteer encryption?
I have an app that uses a couple of mlmodels (word tagger and gazetteer) and I’m trying to encrypt them before publishing. The models are part of a package. I understand that Xcode can’t automatically handle the encryption for a model in a package the way it can within a traditional app structure. Given that, I’ve generated the Apple MLModel encryption key from Xcode and am encrypting via the command line with: xcrun coremlcompiler compile Gazetteer.mlmodel GazetteerENC.mlmodelc --encrypt Gazetteerkey.mlmodelkey In the package manifest, I’ve listed the encrypted models as .copy resources for my target and have verified the URL to that file is good. When I try to load the encrypted .mlmodelc file (on a physical device) with the line:
 gazetteer = try NLGazetteer(contentsOf: gazetteerURL!) I get the error: Failed to open file: /…/Scanner.bundle/GazetteerENC.mlmodelc/coremldata.bin. It is not a valid .mlmodelc file. So my questions are: Does the NLGazetteer class support encrypted MLModel files? Given that my models are in a package, do I have the right general approach? Thanks for any help or thoughts.
0
0
162
May ’25
SoundAnalysis built-in classifier fails in background (SNErrorCode.operationFailed)
I’m seeing consistent failures using SoundAnalysis live classification when my app moves to the background. Setup iOS 17.x AVAudioEngine mic capture SNAudioStreamAnalyzer SNClassifySoundRequest(classifierIdentifier: .version1) UIBackgroundModes = audio AVAudioSession .record / .playAndRecord, active Audio capture + level metering continue working in background (mic indicator stays on) Issue As soon as the app enters background / screen locks: SoundAnalysis starts failing every second with domain:com.apple.SoundAnalysis, code:2(SNErrorCode.operationFailed) Audio capture itself continues normally When the app returns to foreground, classification immediately resumes without restarting the engine/analyzer Question Is live background sound classification with the built-in SoundAnalysis classifier officially unsupported or known to fail in background? If so, is a custom Core ML model the only supported approach for background detection? Or is there a required configuration I’m missing to keep SNClassifySoundRequest(.version1) running in background? Thanks for any clarification.
0
1
223
Dec ’25
How to test for VisualIntelligence available on device?
I'm adding Visual Intelligence support to my app, and now want to add a Tip using TipKit to guide users to this feature from within my app. I want to add a Rule to my Tip which will only show this Tip on devices where Visual Intelligence is supported (ex. not iPhone 14 Pro Max). What is the best way for me to determine availability to set this TipKit rule? Here's the documentation I'm following for Visual Intelligence: https://aninterestingwebsite.com/documentation/visualintelligence/integrating-your-app-with-visual-intelligence
0
0
739
Sep ’25
How does ARKit achieve low-latency and stable head tracking using only RGB camera ?
Hi, I’m working on a real-time head/face tracking pipeline using a standard 2D RGB camera, and I’m trying to better understand how ARKit achieves such stable and responsive results in comparable conditions. To clarify upfront: I’m specifically interested in RGB-only tracking and the underlying vision/ML pipeline. I’m not using TrueDepth or any depth/IR-based sensors, and I’d like to understand how similar stability and responsiveness can be achieved under those constraints. In my current setup, I estimate head pose from RGB frames (facial landmarks + PnP) and apply temporal filtering (e.g., exponential smoothing and Kalman filtering). This significantly reduces jitter, but introduces noticeable latency, especially during faster head movements. What stands out in ARKit is that it appears to maintain both: Very low jitter Very low perceived latency even when operating with camera input alone. I’m trying to understand what techniques might contribute to this behavior. In particular: Does ARKit use predictive tracking (e.g., velocity or acceleration-based pose extrapolation) to compensate for camera and processing delays in RGB-only scenarios? Are there recommended strategies for balancing temporal smoothing and responsiveness without introducing visible lag in camera-based pose estimation pipelines? Is the tracking pipeline internally decoupled from rendering (e.g., asynchronous processing with prediction applied at render time)? Are there general best practices for minimizing end-to-end latency in vision-based head tracking systems beyond standard filtering approaches? I understand that implementation details may not be public, but any high-level insights or pointers would be greatly appreciated. Thanks!
0
0
190
1w
is it possible to let siri monitor phone calls, and notify me when a certain trigger happens?
the specific context is that i would like to build an agent that monitors my phone call (with a customer support for example), and simiply identify whether or not im still put on hold, and notify me when im not. currently after reading the doc, i dont think its possible yet, but im so annoyed by the customer support calls that im willing to go the distance and see if theres any way.
Replies
0
Boosts
0
Views
167
Activity
Jun ’25
Detection of balls about 6-10ft Away not detecting
I used Yolo5-11 and while performing great detecting balls lets say 5-10ft away in 1920 resolution and even in 640 it really is taking toll on my app performance. When I use Create ML it outputs all in 415x which is probably the reason why it does not detect objects from far. What can I do to preserve some energy ? My model is used with about 1K pictures 200 each test and validate, and from close up and far.
Replies
0
Boosts
2
Views
247
Activity
Apr ’25
Full documentation of annotations file for Create ML
The documentation for the Create ML tool ("Building an object detector data source") mentions that there are options for using normalized values instead of pixels and also different anchor point origins ("MLBoundingBoxCoordinatesOrigin") instead of always using "center". However, the JSON format for these does not appear in any examples. Does anyone know the format for these options?
Replies
0
Boosts
1
Views
264
Activity
May ’25
MLX/Ollama Benchmarking Suite - Open Source and Free
Hi all, I spent the last few months developing an MLX/Ollama local AI Benchmarking suite for Apple Silicon, written in pure Swift and signed with an Apple Developer Certificate, open source, GPL, and free. I would love some feedback to continue development. It is the only benchmarking suite I know of that supports live power metrics and MLX natively, as well as quick exports for benchmark results, and an arena mode, Model A vs B with history. I really want this project to succeed, and have widespread use, so getting 75 stars on the github repo makes it eligible for Homebrew/Cask distribution. Github Repo
Replies
0
Boosts
0
Views
170
Activity
Feb ’26
How to Ensure Controlled and Contextual Responses Using Foundation Models ?
Hi everyone, I’m currently exploring the use of Foundation models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModel APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly: Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.) Preventing hallucinations or unrelated outputs Constraining responses based on app-specific rules, structured data, or recent interactions I’ve experimented with prompt, systemMessage, and few-shot examples to steer outputs, but even with carefully generated prompts, the model occasionally produces incorrect or out-of-scope responses. Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved? Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
Replies
0
Boosts
0
Views
136
Activity
Jul ’25
CoreML Model Conversion Help
I’m trying to follow Apple’s “WWDC24: Bring your machine learning and AI models to Apple Silicon” session to convert the Mistral-7B-Instruct-v0.2 model into a Core ML package, but I’ve run into a roadblock that I can’t seem to overcome. I’ve uploaded my full conversion script here for reference: https://pastebin.com/T7Zchzfc When I run the script, it progresses through tracing and MIL conversion but then fails at the backend_mlprogram stage with this error: https://pastebin.com/fUdEzzKM The core of the error is: ValueError: Op "keyCache_tmp" (op_type: identity) Input x="keyCache" expects list, tensor, or scalar but got state[tensor[1,32,8,2048,128,fp16]] I’ve registered my KV-cache buffers in a StatefulMistralWrapper subclass of nn.Module, matching the keyCache and valueCache state names in my ct.StateType definitions, but Core ML’s backend pass reports the state tensor as an invalid input. I’m using Core ML Tools 8.3.0 on Python 3.9.6, targeting iOS18, and forcing CPU conversion (MPS wasn’t available). Any pointers on how to satisfy the handle_unused_inputs pass or properly declare/cache state for GQA models in Core ML would be greatly appreciated! Thanks in advance for your help, Usman Khan
Replies
0
Boosts
0
Views
301
Activity
May ’25
RecognizeDocumentsRequest not detecting paragraphs
I'm trying the new RecognizeDocumentsRequest supposed to detect paragraphs (among other things) in a document. I tried many source images, and I don't see the slightest difference compared to the old API (VN)RecognizedTextRequest Is it supposed to not work or is it in beta?
Replies
0
Boosts
0
Views
330
Activity
Jan ’26
ML contraints & Timeout clarificaitions for Message Filtering Extension
Hello everyone, I’m currently working with the Message Filtering Extension and would really appreciate some clarification around its performance and operational constraints. While the extension is extremely powerful and useful, I’ve found that some important details are either unclear or not well covered in the available documentation. There are two main areas I’m trying to understand better: Machine learning model constraints within the extension In our case, we already have an existing ML model that classifies messages (and are not dependant on Apple's built-in models). We’re evaluating whether and how it can be used inside the extension. Specifically, I’m trying to understand: Are there documented limits on the size of an ML model (e.g., maximum bundle size or model file size in MB)? What are the memory constraints for a model once loaded into memory by the extension? Under what conditions would the system terminate or “kick out” the extension due to memory or performance pressure? Message processing timeouts and execution constraints What is the timeout for processing a single received message? At what point will the OS stop waiting for the extension’s response and allow the message by default (for example, if the extension does not respond in time)? Any guidance, official references, or practical experience from Apple engineers or other developers would be greatly appreciated. Thanks in advance for your help,
Replies
0
Boosts
0
Views
262
Activity
Jan ’26
Apple OCR framework seems to be holding on to allocations every time it is called.
Environment: macOS 26.2 (Tahoe) Xcode 16.3 Apple Silicon (M4) Sandboxed Mac App Store app Description: Repeated use of VNRecognizeTextRequest causes permanent memory growth in the host process. The physical footprint increases by approximately 3-15 MB per OCR call and never returns to baseline, even after all references to the request, handler, observations, and image are released. ` private func selectAndProcessImage() { let panel = NSOpenPanel() panel.allowedContentTypes = [.image] panel.allowsMultipleSelection = false panel.canChooseDirectories = false panel.message = "Select an image for OCR processing" guard panel.runModal() == .OK, let url = panel.url else { return } selectedImageURL = url isProcessing = true recognizedText = "Processing..." // Run OCR on a background thread to keep UI responsive let workItem = DispatchWorkItem { let result = performOCR(on: url) DispatchQueue.main.async { recognizedText = result isProcessing = false } } DispatchQueue.global(qos: .userInitiated).async(execute: workItem) } private func performOCR(on url: URL) -> String { // Wrap EVERYTHING in autoreleasepool so all ObjC objects are drained immediately let resultText: String = autoreleasepool { // Load image and convert to CVPixelBuffer for explicit memory control guard let imageData = try? Data(contentsOf: url) else { return "Error: Could not read image file." } guard let nsImage = NSImage(data: imageData) else { return "Error: Could not create image from file data." } guard let cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil) else { return "Error: Could not create CGImage." } let width = cgImage.width let height = cgImage.height // Create a CVPixelBuffer from the CGImage var pixelBuffer: CVPixelBuffer? let attrs: [String: Any] = [ kCVPixelBufferCGImageCompatibilityKey as String: true, kCVPixelBufferCGBitmapContextCompatibilityKey as String: true ] let status = CVPixelBufferCreate( kCFAllocatorDefault, width, height, kCVPixelFormatType_32ARGB, attrs as CFDictionary, &pixelBuffer ) guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return "Error: Could not create CVPixelBuffer (status: \(status))." } // Draw the CGImage into the pixel buffer CVPixelBufferLockBaseAddress(buffer, []) guard let context = CGContext( data: CVPixelBufferGetBaseAddress(buffer), width: width, height: height, bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(buffer), space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue ) else { CVPixelBufferUnlockBaseAddress(buffer, []) return "Error: Could not create CGContext for pixel buffer." } context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height)) CVPixelBufferUnlockBaseAddress(buffer, []) // Run OCR let requestHandler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:]) let request = VNRecognizeTextRequest() request.recognitionLevel = .accurate request.usesLanguageCorrection = true do { try requestHandler.perform([request]) } catch { return "Error during OCR: \(error.localizedDescription)" } guard let observations = request.results, !observations.isEmpty else { return "No text found in image." } let lines = observations.compactMap { observation in observation.topCandidates(1).first?.string } // Explicitly nil out the pixel buffer before the pool drains pixelBuffer = nil return lines.joined(separator: "\n") } // Everything — Data, NSImage, CGImage, CVPixelBuffer, VN objects — released here return resultText } `
Replies
0
Boosts
0
Views
166
Activity
Feb ’26
Is Jax for Apple Silicon is still supported
Hi From https://aninterestingwebsite.com/metal/jax/ I checked all active workflows on https://github.com/jax-ml/jax and any open issues with tags Metal and seems in DEC 2025 the Jax maintainers have closed all issues citing No active development on Jax-metal and the project seems dead. We need to know how can we leverage Apple silicon for accelerated projects using popular academia library and tools . Is the JAX project still going to be supported or Apple has plans to bring something of tis own that might be platform agnostic . Thanks
Replies
0
Boosts
0
Views
160
Activity
Feb ’26
Hardware Support for Low Precision Data Types?
Hi all, I'm trying to find out if/when we can expect mxfp8/mxfp4 support on Apple Silicon. I've noticed that mlx now has casting data types, but all computation is still done in bf16. Would be great to reduce power consumption with support for these lower precision data types since edge inference is already typically done at a lower precision! Thanks in advance.
Replies
0
Boosts
0
Views
316
Activity
Nov ’25
Request: Official One-Click Local LLM Deployment for 2019 Mac Pro (7,1) Dual W6900X
I am a professional user of the 2019 Mac Pro (7,1) with dual AMD Radeon Pro W6900X MPX modules (32GB VRAM each). This hardware is designed for high-performance compute, but it is currently crippled for modern local LLM/AI workloads under Linux due to Apple's EFI/PCIe routing restrictions. Core Issue: rocminfo reports "No HIP GPUs available" when attempting to use ROCm/amdgpu on Linux Apple's custom EFI firmware blocks full initialization of professional GPU compute assets The dual W6900X GPUs have 64GB combined VRAM and high-bandwidth Infinity Fabric Link, but cannot be fully utilized for local AI inference/training My Specific Request: Apple should provide an official, one-click deployable application that enables full utilization of dual W6900X GPUs for local large language model (LLM) inference and training under Linux. This application must: Fully initialize both W6900X GPUs via HIP/ROCm, establishing valid compute contexts Bypass artificial EFI/PCIe routing restrictions that block access to professional GPU resources Provide a stable, user-friendly one-click deployment experience (similar to NVIDIA's AI Enterprise or AMD's ROCm Hub) Why This Matters: The 2019 Mac Pro is Apple's flagship professional workstation, marketed for compute-intensive workloads. Its high-cost W6900X GPUs should not be locked down for modern AI/LLM use cases. An official one-click deployment solution would demonstrate Apple's commitment to professional AI and unlock significant value for professional users. I look forward to Apple's response and a clear roadmap for enabling this critical capability. #MacPro #Linux #ROCm #LocalLLM #W6900X #CoreML
Replies
0
Boosts
0
Views
92
Activity
1w
Core-ml-on-device-llama Converting fails
I followed below url for converting Llama-3.1-8B-Instruct model but always fails even i have 64GB of free space after downloading model from huggingface. https://machinelearning.apple.com/research/core-ml-on-device-llama Also tried with other models Llama-3.1-1B-Instruct & Llama-3.1-3B-Instruct models those are converted but while doing performance test in xcode fails for all compunits. Is there any source code to run llama models in ios app.
Replies
0
Boosts
0
Views
233
Activity
Apr ’25
Various On-Device Frameworks API & ChatGPT
Posting a follow up question after the WWDC 2025 Machine Learning AI & Frameworks Group Lab on June 12. In regards to the on-device API of any of the AI frameworks (foundation model, vision framework, ect.), is there a response condition or path where the API outsources it's input to ChatGPT if the user has allowed this like Siri does? Ignore this if it's a no: is this handled behind the scenes or by the developer?
Replies
0
Boosts
0
Views
324
Activity
Jun ’25
recent JAX versions fail on Metal
Hi, I'm not sure whether this is the appropriate forum for this topic. I just followed a link from the JAX Metal plugin page https://aninterestingwebsite.com/metal/jax/ I'm writing a Python app with JAX, and recent JAX versions fail on Metal. E.g. v0.8.2 I have to downgrade JAX pretty hard to make it work: pip install jax==0.4.35 jaxlib==0.4.35 jax-metal==0.1.1 Can we get an updated release of jax-metal that would fix this issue? Here is the error I get with JAX v0.8.2: WARNING:2025-12-26 09:55:28,117:jax._src.xla_bridge:881: Platform 'METAL' is experimental and not all JAX functionality may be correctly supported! WARNING: All log messages before absl::InitializeLog() is called are written to STDERR W0000 00:00:1766771728.118004 207582 mps_client.cc:510] WARNING: JAX Apple GPU support is experimental and not all JAX functionality is correctly supported! Metal device set to: Apple M3 Max systemMemory: 36.00 GB maxCacheSize: 13.50 GB I0000 00:00:1766771728.129886 207582 service.cc:145] XLA service 0x600001fad300 initialized for platform METAL (this does not guarantee that XLA will be used). Devices: I0000 00:00:1766771728.129893 207582 service.cc:153] StreamExecutor device (0): Metal, <undefined> I0000 00:00:1766771728.130856 207582 mps_client.cc:406] Using Simple allocator. I0000 00:00:1766771728.130864 207582 mps_client.cc:384] XLA backend will use up to 28990554112 bytes on device 0 for SimpleAllocator. Traceback (most recent call last): File "<string>", line 1, in <module> import jax; print(jax.numpy.arange(10)) ~~~~~~~~~~~~~~~~^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/numpy/lax_numpy.py", line 5951, in arange return _arange(start, stop=stop, step=step, dtype=dtype, out_sharding=sharding) File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/numpy/lax_numpy.py", line 6012, in _arange return lax.broadcasted_iota(dtype, (size,), 0, out_sharding=out_sharding) ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/lax/lax.py", line 3415, in broadcasted_iota return iota_p.bind(dtype=dtype, shape=shape, ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^ dimension=dimension, sharding=out_sharding) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 633, in bind return self._true_bind(*args, **params) ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 649, in _true_bind return self.bind_with_trace(prev_trace, args, params) ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 661, in bind_with_trace return trace.process_primitive(self, args, params) ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/core.py", line 1210, in process_primitive return primitive.impl(*args, **params) ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^ File "/Users/florin/git/FlorinAndrei/star-cluster-simulator/.venv/lib/python3.13/site-packages/jax/_src/dispatch.py", line 91, in apply_primitive outs = fun(*args) jax.errors.JaxRuntimeError: UNKNOWN: -:0:0: error: unknown attribute code: 22 -:0:0: note: in bytecode version 6 produced by: StableHLO_v1.13.0 -------------------- For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these. I0000 00:00:1766771728.149951 207582 mps_client.h:209] MetalClient destroyed.
Replies
0
Boosts
0
Views
582
Activity
Dec ’25
ANE Performance for on-device Foundation model
I'm running MacOs 26 Beta 5. I noticed that I can no longer achieve 100% usage on the ANE as I could before with Apple Foundations on-device model. Has Apple activated some kind of throttling or power limiting of the ANE? I cannot get above 3w or 40% usage now since upgrading. I'm on the high power energy mode. I there an API rate limit being applied? I kave a M4 Pro mini with 64 GB of memory.
Replies
0
Boosts
0
Views
343
Activity
Aug ’25
Gazetteer encryption?
I have an app that uses a couple of mlmodels (word tagger and gazetteer) and I’m trying to encrypt them before publishing. The models are part of a package. I understand that Xcode can’t automatically handle the encryption for a model in a package the way it can within a traditional app structure. Given that, I’ve generated the Apple MLModel encryption key from Xcode and am encrypting via the command line with: xcrun coremlcompiler compile Gazetteer.mlmodel GazetteerENC.mlmodelc --encrypt Gazetteerkey.mlmodelkey In the package manifest, I’ve listed the encrypted models as .copy resources for my target and have verified the URL to that file is good. When I try to load the encrypted .mlmodelc file (on a physical device) with the line:
 gazetteer = try NLGazetteer(contentsOf: gazetteerURL!) I get the error: Failed to open file: /…/Scanner.bundle/GazetteerENC.mlmodelc/coremldata.bin. It is not a valid .mlmodelc file. So my questions are: Does the NLGazetteer class support encrypted MLModel files? Given that my models are in a package, do I have the right general approach? Thanks for any help or thoughts.
Replies
0
Boosts
0
Views
162
Activity
May ’25
SoundAnalysis built-in classifier fails in background (SNErrorCode.operationFailed)
I’m seeing consistent failures using SoundAnalysis live classification when my app moves to the background. Setup iOS 17.x AVAudioEngine mic capture SNAudioStreamAnalyzer SNClassifySoundRequest(classifierIdentifier: .version1) UIBackgroundModes = audio AVAudioSession .record / .playAndRecord, active Audio capture + level metering continue working in background (mic indicator stays on) Issue As soon as the app enters background / screen locks: SoundAnalysis starts failing every second with domain:com.apple.SoundAnalysis, code:2(SNErrorCode.operationFailed) Audio capture itself continues normally When the app returns to foreground, classification immediately resumes without restarting the engine/analyzer Question Is live background sound classification with the built-in SoundAnalysis classifier officially unsupported or known to fail in background? If so, is a custom Core ML model the only supported approach for background detection? Or is there a required configuration I’m missing to keep SNClassifySoundRequest(.version1) running in background? Thanks for any clarification.
Replies
0
Boosts
1
Views
223
Activity
Dec ’25
How to test for VisualIntelligence available on device?
I'm adding Visual Intelligence support to my app, and now want to add a Tip using TipKit to guide users to this feature from within my app. I want to add a Rule to my Tip which will only show this Tip on devices where Visual Intelligence is supported (ex. not iPhone 14 Pro Max). What is the best way for me to determine availability to set this TipKit rule? Here's the documentation I'm following for Visual Intelligence: https://aninterestingwebsite.com/documentation/visualintelligence/integrating-your-app-with-visual-intelligence
Replies
0
Boosts
0
Views
739
Activity
Sep ’25
How does ARKit achieve low-latency and stable head tracking using only RGB camera ?
Hi, I’m working on a real-time head/face tracking pipeline using a standard 2D RGB camera, and I’m trying to better understand how ARKit achieves such stable and responsive results in comparable conditions. To clarify upfront: I’m specifically interested in RGB-only tracking and the underlying vision/ML pipeline. I’m not using TrueDepth or any depth/IR-based sensors, and I’d like to understand how similar stability and responsiveness can be achieved under those constraints. In my current setup, I estimate head pose from RGB frames (facial landmarks + PnP) and apply temporal filtering (e.g., exponential smoothing and Kalman filtering). This significantly reduces jitter, but introduces noticeable latency, especially during faster head movements. What stands out in ARKit is that it appears to maintain both: Very low jitter Very low perceived latency even when operating with camera input alone. I’m trying to understand what techniques might contribute to this behavior. In particular: Does ARKit use predictive tracking (e.g., velocity or acceleration-based pose extrapolation) to compensate for camera and processing delays in RGB-only scenarios? Are there recommended strategies for balancing temporal smoothing and responsiveness without introducing visible lag in camera-based pose estimation pipelines? Is the tracking pipeline internally decoupled from rendering (e.g., asynchronous processing with prediction applied at render time)? Are there general best practices for minimizing end-to-end latency in vision-based head tracking systems beyond standard filtering approaches? I understand that implementation details may not be public, but any high-level insights or pointers would be greatly appreciated. Thanks!
Replies
0
Boosts
0
Views
190
Activity
1w