Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

Real Time Text detection using iOS18 RecognizeTextRequest from video buffer returns gibberish
Hey Devs, I'm trying to create my own Real Time Text detection like this Apple project. https://aninterestingwebsite.com/documentation/vision/extracting-phone-numbers-from-text-in-images I want to use the new iOS18 RecognizeTextRequest instead of the old VNRecognizeTextRequest in my SwiftUI project. This is my delegate code with the camera setup. I removed region of interest for debugging but I'm trying to scan English words in books. The idea is to get one word in the ROI in the future. But I can't even get proper words so testing without ROI incase my math is wrong. @Observable class CameraManager: NSObject, AVCapturePhotoCaptureDelegate ... override init() { super.init() setUpVisionRequest() } private func setUpVisionRequest() { textRequest = RecognizeTextRequest(.revision3) } ... func setup() -> Bool { captureSession.beginConfiguration() guard let captureDevice = AVCaptureDevice.default( .builtInWideAngleCamera, for: .video, position: .back) else { return false } self.captureDevice = captureDevice guard let deviceInput = try? AVCaptureDeviceInput(device: captureDevice) else { return false } /// Check whether the session can add input. guard captureSession.canAddInput(deviceInput) else { print("Unable to add device input to the capture session.") return false } /// Add the input and output to session captureSession.addInput(deviceInput) /// Configure the video data output videoDataOutput.setSampleBufferDelegate( self, queue: videoDataOutputQueue) if captureSession.canAddOutput(videoDataOutput) { captureSession.addOutput(videoDataOutput) videoDataOutput.connection(with: .video)? .preferredVideoStabilizationMode = .off } else { return false } // Set zoom and autofocus to help focus on very small text do { try captureDevice.lockForConfiguration() captureDevice.videoZoomFactor = 2 captureDevice.autoFocusRangeRestriction = .near captureDevice.unlockForConfiguration() } catch { print("Could not set zoom level due to error: \(error)") return false } captureSession.commitConfiguration() // potential issue with background vs dispatchqueue ?? Task(priority: .background) { captureSession.startRunning() } return true } } // Issue here ??? extension CameraManager: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput( _ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection ) { guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } Task { textRequest.recognitionLevel = .fast textRequest.recognitionLanguages = [Locale.Language(identifier: "en-US")] do { let observations = try await textRequest.perform(on: pixelBuffer) for observation in observations { let recognizedText = observation.topCandidates(1).first print("recognized text \(recognizedText)") } } catch { print("Recognition error: \(error.localizedDescription)") } } } } The results I get look like this ( full page of English from a any book) recognized text Optional(RecognizedText(string: e bnUI W4, confidence: 0.5)) recognized text Optional(RecognizedText(string: ?'U, confidence: 0.3)) recognized text Optional(RecognizedText(string: traQt4, confidence: 0.3)) recognized text Optional(RecognizedText(string: li, confidence: 0.3)) recognized text Optional(RecognizedText(string: 15,1,#, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllÈ, confidence: 0.3)) recognized text Optional(RecognizedText(string: vtrll, confidence: 0.3)) recognized text Optional(RecognizedText(string: 5,1,: 11, confidence: 0.5)) recognized text Optional(RecognizedText(string: 1141, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllll ljiiilij41, confidence: 0.3)) recognized text Optional(RecognizedText(string: 2f4, confidence: 0.3)) recognized text Optional(RecognizedText(string: ktril, confidence: 0.3)) recognized text Optional(RecognizedText(string: ¥LLI, confidence: 0.3)) recognized text Optional(RecognizedText(string: 11[Itl,, confidence: 0.3)) recognized text Optional(RecognizedText(string: 'rtlÈ131, confidence: 0.3)) Even with ROI set to a specific rectangle Normalized to Vision, I get the same results with single characters returning gibberish. Any help would be amazing thank you. Am I using the buffer right ? Am I using the new perform(on: CVPixelBuffer) right ? Maybe I didn't set up my camera properly? I can provide code
1
0
365
Jul ’25
Converting TF2 object detection to CoreML
I've spent way too long today trying to convert an Object Detection TensorFlow2 model to a CoreML object classifier (with bounding boxes, labels and probability score) The 'SSD MobileNet v2 320x320' is here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md And I've been following all sorts of posts and ChatGPT https://apple.github.io/coremltools/docs-guides/source/tensorflow-2.html#convert-a-tensorflow-concrete-function https://aninterestingwebsite.com/videos/play/wwdc2020/10153/?time=402 To convert it. I keep hitting the same errors though, mostly around: NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got <ConcreteFunction signature_wrapper(input_tensor) at 0x366B87790> I've had varying success including missing output labels/predictions. But I simply want to create the CoreML model with all the right inputs and outputs (including correct names) as detailed in the docs here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md It goes without saying I don't have much (any) experience with this stuff including Python so the whole thing's been a bit of a headache. If anyone is able to help that would be great. FWIW I'm not attached to any one specific model, but what I do need at minimum is a CoreML model that can detect objects (has to at least include lights and lamps) within a live video image, detecting where in the image the object is. The simplest script I have looks like this: import coremltools as ct import tensorflow as tf model = tf.saved_model.load("~/tf_models/ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model") concrete_func = model.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY] mlmodel = ct.convert( concrete_func, source="tensorflow", inputs=[ct.TensorType(shape=(1, 320, 320, 3))] ) mlmodel.save("YourModel.mlpackage", save_format="mlpackage")
1
0
504
Jul ’25
get error with xcode beta3 :decodingFailure(FoundationModels.LanguageModelSession.GenerationError.Context
@Generable enum Breakfast { case waffles case pancakes case bagels case eggs } do { let session = LanguageModelSession() let userInput = "I want something sweet." let prompt = "Pick the ideal breakfast for request: (userInput)" let response = try await session.respond(to: prompt,generating: Breakfast.self) print(response.content) } catch let error { print(error) } i want to test the @Generable demo but get error with below:decodingFailure(FoundationModels.LanguageModelSession.GenerationError.Context(debugDescription: "Failed to convert text into into GeneratedContent\nText: waffles", underlyingErrors: [Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Unexpected character 'w' around line 1, column 1." UserInfo={NSJSONSerializationErrorIndex=0, NSDebugDescription=Unexpected character 'w' around line 1, column 1.})))]))
1
0
138
Jul ’25
Xcode 26.1 RC ( RC1 ?) Apple Intelligence using GPT (with account or without) or Sonnet (via OpenRouter) much slower
I didn't run benchmarks before update, but it seems at least 5x slower. Of course all the LLM work is on remote servers, so is non-intuitive to me this should be happening. Had updated MacOS and Xcode to 26.1RC at the same time, so can't even say I think it is MacOS or I think it is Xcode. Before the update the progress indicator for each piece of code might seem to get stuck at the very end (and toggling between Navigators and Coding Assistant) in Xcode UI seemed to refresh the UI and confirm coding complete... but now it seems progress races to 50%, then often is stuck at 75%... well earlier than used to get stuck. And it like something is legitimately processing not just a UI glitch. I'm wondering if this is somehow tied to visual rendering of the code in the little white window? CMD-TAB into Xcode seems laggy. Xcode is pinning a CPU. Why, this is all remote LLM work? MacBook Pro 2021 M1 64GB RAM. Went from 26.01 to 26.1RC. Didn't touch any of the betas until RC1.
1
1
343
Oct ’25
What's the best way to load adapters to try?
I'm new to Swift and was hoping the Playground would support loading adaptors. When I tried, I got a permissions error - thinking it's because it's not in the project and Playgrounds don't like going outside the project? A tutorial and some sample code would be helpful. Also some benchmarks on how long it's expected to take. Selfishly I'm on an M2 Mac Mini.
1
0
305
Jul ’25
Is it possible to create a virtual NPU device on macOS using Hypervisor.framework + CoreML?
Is it possible to expose a custom VirtIO device to a Linux guest running inside a VM — likely using QEMU backed by Hypervisor.framework. The guest would see this device as something like /dev/npu0, and it would use a kernel driver + userspace library to submit inference requests. On the macOS host, these requests would be executed using CoreML, MPSGraph, or BNNS. The results would be passed back to the guest via IPC. Does the macOS allow this kind of "fake" NPU / GPU
1
0
447
Aug ’25
Pre-inference AI Safety Governor for FoundationModels (Swift, On-Device)
Greetings, and Happy Holidays, I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community. The Problem Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain. What Newton Does Newton validates every prompt pre-inference and returns: Phase (0/1/7/8/9) Shape classification Confidence score Full audit trace If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model. v1.3 Detection Categories (14 total) Jailbreak / prompt injection Corrosive self-negation ("I hate myself") Hedged corrosive ("Not saying I'm worthless, but...") Emotional dependency ("You're the only one who understands") Third-person manipulation ("If you refuse, you're proving nobody cares") Logical contradictions ("Prove truth doesn't exist") Self-referential paradox ("Prove that proof is impossible") Semantic inversion ("Explain how truth can be false") Definitional impossibility ("Square circle") Delegated agency ("Decide for me") Hallucination-risk prompts ("Cite the 2025 CDC report") Unbounded recursion ("Repeat forever") Conditional unbounded ("Until you can't") Nonsense / low semantic density Test Results 94.3% catch rate on 35 adversarial test cases (33/35 passed). Architecture User Input ↓ [ Newton ] → Validates prompt, assigns Phase ↓ Phase 9? → [ FoundationModels ] → Response Phase 1/7/8? → Blocked with explanation Key Properties Deterministic (same input → same output) Fully auditable (ValidationTrace on every prompt) On-device (no network required) Native Swift / SwiftUI String Catalog localization (EN/ES/FR) FoundationModels-ready (#if canImport) Code Sample — Validation let governor = NewtonGovernor() let result = governor.validate(prompt: userInput) if result.permitted { // Proceed to FoundationModels let session = LanguageModelSession() let response = try await session.respond(to: userInput) } else { // Handle block print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)") print(result.trace.summary) // Full audit trace } Questions for the Community Anyone else building pre-inference validation for FoundationModels? Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail? Interest in Shape Theory classification for prompt complexity? Best practices for integrating with LanguageModelSession? Links GitHub: https://github.com/jaredlewiswechs/ada-newton Technical overview: parcri.net Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device. parcri.net has the link :)
1
0
521
Dec ’25
KV-Cache MLState Not Updating During Prefill Stage in Core ML LLM Inference
Hello, I'm running a large language model (LLM) in Core ML that uses a key-value cache (KV-cache) to store past attention states. The model was converted from PyTorch using coremltools and deployed on-device with Swift. The KV-cache is exposed via MLState and is used across inference steps for efficient autoregressive generation. During the prefill stage — where a prompt of multiple tokens is passed to the model in a single batch to initialize the KV-cache — I’ve noticed that some entries in the KV-cache are not updated after the inference. Specifically: Here are a few details about the setup: The MLState returned by the model is identical to the input state (often empty or zero-initialized) for some tokens in the batch. The issue only happens during the prefill stage (i.e., first call over multiple tokens). During decoding (single-token generation), the KV-cache updates normally. The model is invoked using MLModel.prediction(from:using:options:) for each batch. I’ve confirmed: The prompt tokens are non-repetitive and not masked. The model spec has MLState inputs/outputs correctly configured for KV-cache tensors. Each token is processed in a loop with the correct positional encodings. Questions: Is there any known behavior in Core ML that could prevent MLState from updating during batched or prefill inference? Could this be caused by internal optimizations such as lazy execution, static masking, or zero-value short-circuiting? How can I confirm that each token in the batch is contributing to the KV-cache during prefill? Any insights from the Core ML or LLM deployment community would be much appreciated.
1
0
279
May ’25
LLM size for fine-tuning using MLX in MacBook
Hi, recently i tried to fine-tune Gemma-2-2b mlx model on my macbook (24 GB UMA). The code started running, after few seconds i saw swap size reaching 50GB and ram around 23 GB and then it stopped. I ran the Gemma-2-2b (cuda) on colab, it ran and occupied 27 GB on A100 gpu and worked fine. Here i didn't experienced swap issue. Now my question is if my UMA was more than 27 GB, i also would not have experienced swap disk issue. Thanks.
1
0
385
Oct ’25
Is there an API for the 3D effect from flat photos?
Introduced in the Keynote was the 3D Lock Screen images with the kangaroo: https://9to5mac.com/wp-content/uploads/sites/6/2025/06/3d-lock-screen-2.gif I can't see any mention on if this effect is available for developers with an API to convert flat 2D photos in to the same 3D feeling image. Does anyone know if there is an API?
1
1
107
Jun ’25
Downloading my fine tuned model from huggingface
I have used mlx_lm.lora to fine tune a mistral-7b-v0.3-4bit model with my data. I fused the mistral model with my adapters and upload the fused model to my directory on huggingface. I was able to use mlx_lm.generate to use the fused model in Terminal. However, I don't know how to load the model in Swift. I've used Imports import SwiftUI import MLX import MLXLMCommon import MLXLLM let modelFactory = LLMModelFactory.shared let configuration = ModelConfiguration( id: "pharmpk/pk-mistral-7b-v0.3-4bit" ) // Load the model off the main actor, then assign on the main actor let loaded = try await modelFactory.loadContainer(configuration: configuration) { progress in print("Downloading progress: \(progress.fractionCompleted * 100)%") } await MainActor.run { self.model = loaded } I'm getting an error runModel error: downloadError("A server with the specified hostname could not be found.") Any suggestions? Thanks, David PS, I can load the model from the app bundle // directory: Bundle.main.resourceURL! but it's too big to upload for Testflight
1
0
558
Oct ’25
Unable to use FoundationModels in older app?
Hi, I'm trying to add FoundationModels to an older project but always get the following error: "Unable to resolve 'dependency' 'FoundationModels' import FoundationModels" The error comes and goes while its compiling and then doesn't run the app. I have my target set to 26.0 (and can't go any higher) and am using Xcode 26 (17E192). Is anyone else having this issue? Thanks, Dan Uff
1
0
131
1w
Handling exceedingContextWindowSizeError
Reading all the docs(1) I was under the impression that handling this error is well managed... Until I hit it and found out that the recommended handling options hide a crucial fact: in the catch block you can not do anything?! It's too late - everything is lost, no way to recover... All the docs mislead me that I can apply the Transcript trick in the catch block until I realised, that there is nothing there !!! This article here(2) enlightened me on the handling of this problem, but I must say (and the author as well) - this is a hack! So my questions: is there really no way to handle this exception properly? if not, can we have the most important information - the count of the context exposed through the official API (at least the known ones)? https://aninterestingwebsite.com/documentation/Technotes/tn3193-managing-the-on-device-foundation-model-s-context-window#Handle-the-exceeding-context-window-size-error-elegantly https://zats.io/blog/making-the-most-of-apple-foundation-models-context-window/
1
0
159
Mar ’26
Inference Provider crashed with 2:5
I am trying to create a slightly different version of the content tagging code in the documentation: https://aninterestingwebsite.com/documentation/foundationmodels/systemlanguagemodel/usecase/contenttagging In the playground I am getting an "Inference Provider crashed with 2:5" error. I have no idea what that means or how to address the error. Any assistance would be appreciated.
1
0
544
Jul ’25
MPS SDPA Attention Kernel Regression on A14-class (M1) in macOS 26.3.1 — Works on A15+ (M2+)
Summary Since macOS 26, our Core ML / MPS inference pipeline produces incorrect results on Mac mini M1 (Macmini9,1, A14-class SoC). The same model and code runs correctly on M2 and newer (A15-class and up). The regression appears to be in the Scaled Dot-Product Attention (SDPA) kernel path in the MPS backend. Environment Affected Mac mini M1 — Macmini9,1 (A14-class) Not affected M2 and newer (A15-class and up) Last known good macOS Sequoia First broken macOS 26 (Tahoe) ? Confirmed broken on macOS 26.3.1 Framework Core ML + MPS backend Language C++ (via CoreML C++ API) Description We ship an audio processing application (VoiceAssist by NoiseWorks) that runs a deep learning model (based on Demucs architecture) via Core ML with the MPS compute unit. On macOS Sequoia this works correctly on all Apple Silicon Macs including M1. After updating to macOS 26 (Tahoe), inference on M1 Macs fails — either producing garbage output or crashing. The same binary, same .mlpackage, same inputs work correctly on M2+. Our Apple contact has suggested the root cause is a regression in the A14-specific MPS SDPA attention kernel, which may have broken when the Metal/MPS stack was updated in macOS 26. The model makes heavy use of attention layers, and the failure correlates precisely with the SDPA path being exercised on A14 hardware. Steps to Reproduce Load a Core ML model that uses Scaled Dot-Product Attention (e.g. a transformer or attention-based audio model) Run inference with MLComputeUnits::cpuAndGPU (MPS active) Run on Mac mini M1 (Macmini9,1) with macOS 26.3.1 Compare output to the same model running on M2 / macOS Sequoia Expected: Correct inference output, consistent with M2+ and macOS Sequoia behavior Actual: Incorrect / corrupted output (or crash), only on A14-class hardware running macOS 26+ Workaround Forcing MLComputeUnits::cpuOnly bypasses MPS entirely and produces correct output on M1, confirming the issue is in the MPS compute path. This is not acceptable as a shipping workaround due to performance impact. Additional Notes The failure is hardware-specific (A14 only) and OS-specific (macOS 26+), pointing to a kernel-level regression rather than a model or app bug We first became aware of this through a customer report Happy to provide a symbolicated crash log if helpful this text was summarized by AI and human verified
1
0
201
2w
no tensorflow-metal past tf 2.18?
Hi We're on tensorflow 2.20 that has support now for python 3.13 (finally!). tensorflow-metal is still only supporting 2.18 which is over a year old. When can we expect to see support in tensorflow-metal for tf 2.20 (or later!) ? I bought a mac thinking I would be able to get great performance from the M processors but here I am using my CPU for my ML projects. If it's taking so long to release it, why not open source it so the community can keep it more up to date? cheers Matt
1
1
446
Nov ’25
Real Time Text detection using iOS18 RecognizeTextRequest from video buffer returns gibberish
Hey Devs, I'm trying to create my own Real Time Text detection like this Apple project. https://aninterestingwebsite.com/documentation/vision/extracting-phone-numbers-from-text-in-images I want to use the new iOS18 RecognizeTextRequest instead of the old VNRecognizeTextRequest in my SwiftUI project. This is my delegate code with the camera setup. I removed region of interest for debugging but I'm trying to scan English words in books. The idea is to get one word in the ROI in the future. But I can't even get proper words so testing without ROI incase my math is wrong. @Observable class CameraManager: NSObject, AVCapturePhotoCaptureDelegate ... override init() { super.init() setUpVisionRequest() } private func setUpVisionRequest() { textRequest = RecognizeTextRequest(.revision3) } ... func setup() -> Bool { captureSession.beginConfiguration() guard let captureDevice = AVCaptureDevice.default( .builtInWideAngleCamera, for: .video, position: .back) else { return false } self.captureDevice = captureDevice guard let deviceInput = try? AVCaptureDeviceInput(device: captureDevice) else { return false } /// Check whether the session can add input. guard captureSession.canAddInput(deviceInput) else { print("Unable to add device input to the capture session.") return false } /// Add the input and output to session captureSession.addInput(deviceInput) /// Configure the video data output videoDataOutput.setSampleBufferDelegate( self, queue: videoDataOutputQueue) if captureSession.canAddOutput(videoDataOutput) { captureSession.addOutput(videoDataOutput) videoDataOutput.connection(with: .video)? .preferredVideoStabilizationMode = .off } else { return false } // Set zoom and autofocus to help focus on very small text do { try captureDevice.lockForConfiguration() captureDevice.videoZoomFactor = 2 captureDevice.autoFocusRangeRestriction = .near captureDevice.unlockForConfiguration() } catch { print("Could not set zoom level due to error: \(error)") return false } captureSession.commitConfiguration() // potential issue with background vs dispatchqueue ?? Task(priority: .background) { captureSession.startRunning() } return true } } // Issue here ??? extension CameraManager: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput( _ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection ) { guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } Task { textRequest.recognitionLevel = .fast textRequest.recognitionLanguages = [Locale.Language(identifier: "en-US")] do { let observations = try await textRequest.perform(on: pixelBuffer) for observation in observations { let recognizedText = observation.topCandidates(1).first print("recognized text \(recognizedText)") } } catch { print("Recognition error: \(error.localizedDescription)") } } } } The results I get look like this ( full page of English from a any book) recognized text Optional(RecognizedText(string: e bnUI W4, confidence: 0.5)) recognized text Optional(RecognizedText(string: ?'U, confidence: 0.3)) recognized text Optional(RecognizedText(string: traQt4, confidence: 0.3)) recognized text Optional(RecognizedText(string: li, confidence: 0.3)) recognized text Optional(RecognizedText(string: 15,1,#, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllÈ, confidence: 0.3)) recognized text Optional(RecognizedText(string: vtrll, confidence: 0.3)) recognized text Optional(RecognizedText(string: 5,1,: 11, confidence: 0.5)) recognized text Optional(RecognizedText(string: 1141, confidence: 0.3)) recognized text Optional(RecognizedText(string: jllll ljiiilij41, confidence: 0.3)) recognized text Optional(RecognizedText(string: 2f4, confidence: 0.3)) recognized text Optional(RecognizedText(string: ktril, confidence: 0.3)) recognized text Optional(RecognizedText(string: ¥LLI, confidence: 0.3)) recognized text Optional(RecognizedText(string: 11[Itl,, confidence: 0.3)) recognized text Optional(RecognizedText(string: 'rtlÈ131, confidence: 0.3)) Even with ROI set to a specific rectangle Normalized to Vision, I get the same results with single characters returning gibberish. Any help would be amazing thank you. Am I using the buffer right ? Am I using the new perform(on: CVPixelBuffer) right ? Maybe I didn't set up my camera properly? I can provide code
Replies
1
Boosts
0
Views
365
Activity
Jul ’25
Converting TF2 object detection to CoreML
I've spent way too long today trying to convert an Object Detection TensorFlow2 model to a CoreML object classifier (with bounding boxes, labels and probability score) The 'SSD MobileNet v2 320x320' is here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md And I've been following all sorts of posts and ChatGPT https://apple.github.io/coremltools/docs-guides/source/tensorflow-2.html#convert-a-tensorflow-concrete-function https://aninterestingwebsite.com/videos/play/wwdc2020/10153/?time=402 To convert it. I keep hitting the same errors though, mostly around: NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got <ConcreteFunction signature_wrapper(input_tensor) at 0x366B87790> I've had varying success including missing output labels/predictions. But I simply want to create the CoreML model with all the right inputs and outputs (including correct names) as detailed in the docs here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md It goes without saying I don't have much (any) experience with this stuff including Python so the whole thing's been a bit of a headache. If anyone is able to help that would be great. FWIW I'm not attached to any one specific model, but what I do need at minimum is a CoreML model that can detect objects (has to at least include lights and lamps) within a live video image, detecting where in the image the object is. The simplest script I have looks like this: import coremltools as ct import tensorflow as tf model = tf.saved_model.load("~/tf_models/ssd_mobilenet_v2_320x320_coco17_tpu-8/saved_model") concrete_func = model.signatures[tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY] mlmodel = ct.convert( concrete_func, source="tensorflow", inputs=[ct.TensorType(shape=(1, 320, 320, 3))] ) mlmodel.save("YourModel.mlpackage", save_format="mlpackage")
Replies
1
Boosts
0
Views
504
Activity
Jul ’25
get error with xcode beta3 :decodingFailure(FoundationModels.LanguageModelSession.GenerationError.Context
@Generable enum Breakfast { case waffles case pancakes case bagels case eggs } do { let session = LanguageModelSession() let userInput = "I want something sweet." let prompt = "Pick the ideal breakfast for request: (userInput)" let response = try await session.respond(to: prompt,generating: Breakfast.self) print(response.content) } catch let error { print(error) } i want to test the @Generable demo but get error with below:decodingFailure(FoundationModels.LanguageModelSession.GenerationError.Context(debugDescription: "Failed to convert text into into GeneratedContent\nText: waffles", underlyingErrors: [Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Unexpected character 'w' around line 1, column 1." UserInfo={NSJSONSerializationErrorIndex=0, NSDebugDescription=Unexpected character 'w' around line 1, column 1.})))]))
Replies
1
Boosts
0
Views
138
Activity
Jul ’25
Xcode 26.1 RC ( RC1 ?) Apple Intelligence using GPT (with account or without) or Sonnet (via OpenRouter) much slower
I didn't run benchmarks before update, but it seems at least 5x slower. Of course all the LLM work is on remote servers, so is non-intuitive to me this should be happening. Had updated MacOS and Xcode to 26.1RC at the same time, so can't even say I think it is MacOS or I think it is Xcode. Before the update the progress indicator for each piece of code might seem to get stuck at the very end (and toggling between Navigators and Coding Assistant) in Xcode UI seemed to refresh the UI and confirm coding complete... but now it seems progress races to 50%, then often is stuck at 75%... well earlier than used to get stuck. And it like something is legitimately processing not just a UI glitch. I'm wondering if this is somehow tied to visual rendering of the code in the little white window? CMD-TAB into Xcode seems laggy. Xcode is pinning a CPU. Why, this is all remote LLM work? MacBook Pro 2021 M1 64GB RAM. Went from 26.01 to 26.1RC. Didn't touch any of the betas until RC1.
Replies
1
Boosts
1
Views
343
Activity
Oct ’25
What's the best way to load adapters to try?
I'm new to Swift and was hoping the Playground would support loading adaptors. When I tried, I got a permissions error - thinking it's because it's not in the project and Playgrounds don't like going outside the project? A tutorial and some sample code would be helpful. Also some benchmarks on how long it's expected to take. Selfishly I'm on an M2 Mac Mini.
Replies
1
Boosts
0
Views
305
Activity
Jul ’25
Is it possible to create a virtual NPU device on macOS using Hypervisor.framework + CoreML?
Is it possible to expose a custom VirtIO device to a Linux guest running inside a VM — likely using QEMU backed by Hypervisor.framework. The guest would see this device as something like /dev/npu0, and it would use a kernel driver + userspace library to submit inference requests. On the macOS host, these requests would be executed using CoreML, MPSGraph, or BNNS. The results would be passed back to the guest via IPC. Does the macOS allow this kind of "fake" NPU / GPU
Replies
1
Boosts
0
Views
447
Activity
Aug ’25
Pre-inference AI Safety Governor for FoundationModels (Swift, On-Device)
Greetings, and Happy Holidays, I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community. The Problem Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain. What Newton Does Newton validates every prompt pre-inference and returns: Phase (0/1/7/8/9) Shape classification Confidence score Full audit trace If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model. v1.3 Detection Categories (14 total) Jailbreak / prompt injection Corrosive self-negation ("I hate myself") Hedged corrosive ("Not saying I'm worthless, but...") Emotional dependency ("You're the only one who understands") Third-person manipulation ("If you refuse, you're proving nobody cares") Logical contradictions ("Prove truth doesn't exist") Self-referential paradox ("Prove that proof is impossible") Semantic inversion ("Explain how truth can be false") Definitional impossibility ("Square circle") Delegated agency ("Decide for me") Hallucination-risk prompts ("Cite the 2025 CDC report") Unbounded recursion ("Repeat forever") Conditional unbounded ("Until you can't") Nonsense / low semantic density Test Results 94.3% catch rate on 35 adversarial test cases (33/35 passed). Architecture User Input ↓ [ Newton ] → Validates prompt, assigns Phase ↓ Phase 9? → [ FoundationModels ] → Response Phase 1/7/8? → Blocked with explanation Key Properties Deterministic (same input → same output) Fully auditable (ValidationTrace on every prompt) On-device (no network required) Native Swift / SwiftUI String Catalog localization (EN/ES/FR) FoundationModels-ready (#if canImport) Code Sample — Validation let governor = NewtonGovernor() let result = governor.validate(prompt: userInput) if result.permitted { // Proceed to FoundationModels let session = LanguageModelSession() let response = try await session.respond(to: userInput) } else { // Handle block print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)") print(result.trace.summary) // Full audit trace } Questions for the Community Anyone else building pre-inference validation for FoundationModels? Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail? Interest in Shape Theory classification for prompt complexity? Best practices for integrating with LanguageModelSession? Links GitHub: https://github.com/jaredlewiswechs/ada-newton Technical overview: parcri.net Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device. parcri.net has the link :)
Replies
1
Boosts
0
Views
521
Activity
Dec ’25
KV-Cache MLState Not Updating During Prefill Stage in Core ML LLM Inference
Hello, I'm running a large language model (LLM) in Core ML that uses a key-value cache (KV-cache) to store past attention states. The model was converted from PyTorch using coremltools and deployed on-device with Swift. The KV-cache is exposed via MLState and is used across inference steps for efficient autoregressive generation. During the prefill stage — where a prompt of multiple tokens is passed to the model in a single batch to initialize the KV-cache — I’ve noticed that some entries in the KV-cache are not updated after the inference. Specifically: Here are a few details about the setup: The MLState returned by the model is identical to the input state (often empty or zero-initialized) for some tokens in the batch. The issue only happens during the prefill stage (i.e., first call over multiple tokens). During decoding (single-token generation), the KV-cache updates normally. The model is invoked using MLModel.prediction(from:using:options:) for each batch. I’ve confirmed: The prompt tokens are non-repetitive and not masked. The model spec has MLState inputs/outputs correctly configured for KV-cache tensors. Each token is processed in a loop with the correct positional encodings. Questions: Is there any known behavior in Core ML that could prevent MLState from updating during batched or prefill inference? Could this be caused by internal optimizations such as lazy execution, static masking, or zero-value short-circuiting? How can I confirm that each token in the batch is contributing to the KV-cache during prefill? Any insights from the Core ML or LLM deployment community would be much appreciated.
Replies
1
Boosts
0
Views
279
Activity
May ’25
LLM size for fine-tuning using MLX in MacBook
Hi, recently i tried to fine-tune Gemma-2-2b mlx model on my macbook (24 GB UMA). The code started running, after few seconds i saw swap size reaching 50GB and ram around 23 GB and then it stopped. I ran the Gemma-2-2b (cuda) on colab, it ran and occupied 27 GB on A100 gpu and worked fine. Here i didn't experienced swap issue. Now my question is if my UMA was more than 27 GB, i also would not have experienced swap disk issue. Thanks.
Replies
1
Boosts
0
Views
385
Activity
Oct ’25
face and body detection is local model or a cloud model?
Is the face and body detection service in the Vision framework a local model or a cloud model? https://aninterestingwebsite.com/documentation/vision
Replies
1
Boosts
0
Views
746
Activity
Sep ’25
Is there an API for the 3D effect from flat photos?
Introduced in the Keynote was the 3D Lock Screen images with the kangaroo: https://9to5mac.com/wp-content/uploads/sites/6/2025/06/3d-lock-screen-2.gif I can't see any mention on if this effect is available for developers with an API to convert flat 2D photos in to the same 3D feeling image. Does anyone know if there is an API?
Replies
1
Boosts
1
Views
107
Activity
Jun ’25
FoundationModelsTripPlanner sample not working?
I installed Xcode 26.0 beta and downloaded the generative models sample from here: https://aninterestingwebsite.com/documentation/foundationmodels/adding-intelligent-app-features-with-generative-models But when I run it in the iOS 26.0 simulator, I get the error shown here. What's going wrong?
Replies
1
Boosts
0
Views
313
Activity
Jun ’25
Downloading my fine tuned model from huggingface
I have used mlx_lm.lora to fine tune a mistral-7b-v0.3-4bit model with my data. I fused the mistral model with my adapters and upload the fused model to my directory on huggingface. I was able to use mlx_lm.generate to use the fused model in Terminal. However, I don't know how to load the model in Swift. I've used Imports import SwiftUI import MLX import MLXLMCommon import MLXLLM let modelFactory = LLMModelFactory.shared let configuration = ModelConfiguration( id: "pharmpk/pk-mistral-7b-v0.3-4bit" ) // Load the model off the main actor, then assign on the main actor let loaded = try await modelFactory.loadContainer(configuration: configuration) { progress in print("Downloading progress: \(progress.fractionCompleted * 100)%") } await MainActor.run { self.model = loaded } I'm getting an error runModel error: downloadError("A server with the specified hostname could not be found.") Any suggestions? Thanks, David PS, I can load the model from the app bundle // directory: Bundle.main.resourceURL! but it's too big to upload for Testflight
Replies
1
Boosts
0
Views
558
Activity
Oct ’25
Unable to use FoundationModels in older app?
Hi, I'm trying to add FoundationModels to an older project but always get the following error: "Unable to resolve 'dependency' 'FoundationModels' import FoundationModels" The error comes and goes while its compiling and then doesn't run the app. I have my target set to 26.0 (and can't go any higher) and am using Xcode 26 (17E192). Is anyone else having this issue? Thanks, Dan Uff
Replies
1
Boosts
0
Views
131
Activity
1w
Handling exceedingContextWindowSizeError
Reading all the docs(1) I was under the impression that handling this error is well managed... Until I hit it and found out that the recommended handling options hide a crucial fact: in the catch block you can not do anything?! It's too late - everything is lost, no way to recover... All the docs mislead me that I can apply the Transcript trick in the catch block until I realised, that there is nothing there !!! This article here(2) enlightened me on the handling of this problem, but I must say (and the author as well) - this is a hack! So my questions: is there really no way to handle this exception properly? if not, can we have the most important information - the count of the context exposed through the official API (at least the known ones)? https://aninterestingwebsite.com/documentation/Technotes/tn3193-managing-the-on-device-foundation-model-s-context-window#Handle-the-exceeding-context-window-size-error-elegantly https://zats.io/blog/making-the-most-of-apple-foundation-models-context-window/
Replies
1
Boosts
0
Views
159
Activity
Mar ’26
Inference Provider crashed with 2:5
I am trying to create a slightly different version of the content tagging code in the documentation: https://aninterestingwebsite.com/documentation/foundationmodels/systemlanguagemodel/usecase/contenttagging In the playground I am getting an "Inference Provider crashed with 2:5" error. I have no idea what that means or how to address the error. Any assistance would be appreciated.
Replies
1
Boosts
0
Views
544
Activity
Jul ’25
Supported regex patterns for generation guide
Hey Tried using a few regular expressions and all fail with an error: Unhandled error streaming response: A generation guide with an unsupported pattern was used. Is there are a list of supported features? I don't see it in docs, and it takes RegExp. Anything with e.g. [A-Z] fails.
Replies
1
Boosts
0
Views
151
Activity
Jul ’25
Download the Foundation Models Adaptor Training Toolkit
Download the Foundation Models Adaptor Training Toolkit Hi, after I clicked on the download button, I was redirected to this page https://aninterestingwebsite.com and did not download the toolkit.
Replies
1
Boosts
0
Views
479
Activity
Jul ’25
MPS SDPA Attention Kernel Regression on A14-class (M1) in macOS 26.3.1 — Works on A15+ (M2+)
Summary Since macOS 26, our Core ML / MPS inference pipeline produces incorrect results on Mac mini M1 (Macmini9,1, A14-class SoC). The same model and code runs correctly on M2 and newer (A15-class and up). The regression appears to be in the Scaled Dot-Product Attention (SDPA) kernel path in the MPS backend. Environment Affected Mac mini M1 — Macmini9,1 (A14-class) Not affected M2 and newer (A15-class and up) Last known good macOS Sequoia First broken macOS 26 (Tahoe) ? Confirmed broken on macOS 26.3.1 Framework Core ML + MPS backend Language C++ (via CoreML C++ API) Description We ship an audio processing application (VoiceAssist by NoiseWorks) that runs a deep learning model (based on Demucs architecture) via Core ML with the MPS compute unit. On macOS Sequoia this works correctly on all Apple Silicon Macs including M1. After updating to macOS 26 (Tahoe), inference on M1 Macs fails — either producing garbage output or crashing. The same binary, same .mlpackage, same inputs work correctly on M2+. Our Apple contact has suggested the root cause is a regression in the A14-specific MPS SDPA attention kernel, which may have broken when the Metal/MPS stack was updated in macOS 26. The model makes heavy use of attention layers, and the failure correlates precisely with the SDPA path being exercised on A14 hardware. Steps to Reproduce Load a Core ML model that uses Scaled Dot-Product Attention (e.g. a transformer or attention-based audio model) Run inference with MLComputeUnits::cpuAndGPU (MPS active) Run on Mac mini M1 (Macmini9,1) with macOS 26.3.1 Compare output to the same model running on M2 / macOS Sequoia Expected: Correct inference output, consistent with M2+ and macOS Sequoia behavior Actual: Incorrect / corrupted output (or crash), only on A14-class hardware running macOS 26+ Workaround Forcing MLComputeUnits::cpuOnly bypasses MPS entirely and produces correct output on M1, confirming the issue is in the MPS compute path. This is not acceptable as a shipping workaround due to performance impact. Additional Notes The failure is hardware-specific (A14 only) and OS-specific (macOS 26+), pointing to a kernel-level regression rather than a model or app bug We first became aware of this through a customer report Happy to provide a symbolicated crash log if helpful this text was summarized by AI and human verified
Replies
1
Boosts
0
Views
201
Activity
2w
no tensorflow-metal past tf 2.18?
Hi We're on tensorflow 2.20 that has support now for python 3.13 (finally!). tensorflow-metal is still only supporting 2.18 which is over a year old. When can we expect to see support in tensorflow-metal for tf 2.20 (or later!) ? I bought a mac thinking I would be able to get great performance from the M processors but here I am using my CPU for my ML projects. If it's taking so long to release it, why not open source it so the community can keep it more up to date? cheers Matt
Replies
1
Boosts
1
Views
446
Activity
Nov ’25