Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

VNDetectFaceRectanglesRequest does not use the Neural Engine?
I'm on Tahoe 26.1 / M3 MacBook Air. I'm using VNDetectFaceRectanglesRequest as properly as possible, as in the minimal command-line program attached below. For some reason, I always get "MLE5Engine is disabled through the configuration" printed. I couldn't find any notes in the developer docs saying that VNDetectFaceRectanglesRequest cannot use the Apple Neural Engine. I'm assuming there is something wrong with my code, but I wasn't able to find any remarks in the documentation about where it might be, and I wasn't able to find the above error message online either. I would appreciate your help a lot and thank you in advance. The code below accesses video from AVCaptureDevice.DeviceType.builtInWideAngleCamera. Currently it directly chooses the 0th format, which has the largest resolution (Full HD on my M3 MBA) and 4:2:0 video-range ("v") chroma-subsampled encoding ("420v"). After accessing video, it performs a VNDetectFaceRectanglesRequest. It prints "VNDetectFaceRectanglesRequest completion Handler called" many times, then prints the error message above, then continues printing "VNDetectFaceRectanglesRequest completion Handler called" until the user quits it. To run it in Xcode: File > New Project > macOS Command Line Tool, paste the code below, then click the root file > Targets > Signing & Capabilities > Hardened Runtime > Resource Access > Camera. A possible explanation could be that either Apple's internal Core ML code for this function works on GPU/CPU only, or it doesn't accept 420v as supplied by the MacBook Air camera.

import AVKit
import Vision

var videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput()
var detectionRequests: [VNDetectFaceRectanglesRequest]?
var videoDataOutputQueue: DispatchQueue = DispatchQueue(label: "queue")

class XYZ: /*NSViewController or NSObject*/ NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    func viewDidLoad() {
        //super.viewDidLoad()
        let session = AVCaptureSession()
        let inputDevice = try! self.configureFrontCamera(for: session)
        self.configureVideoDataOutput(for: inputDevice.device, resolution: inputDevice.resolution, captureSession: session)
        self.prepareVisionRequest()
        session.startRunning()
    }

    fileprivate func highestResolution420Format(for device: AVCaptureDevice) -> (format: AVCaptureDevice.Format, resolution: CGSize)? {
        let deviceFormat = device.formats[0]
        print(deviceFormat)
        let dims = CMVideoFormatDescriptionGetDimensions(deviceFormat.formatDescription)
        let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height))
        return (deviceFormat, resolution)
    }

    fileprivate func configureFrontCamera(for captureSession: AVCaptureSession) throws -> (device: AVCaptureDevice, resolution: CGSize) {
        let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [AVCaptureDevice.DeviceType.builtInWideAngleCamera], mediaType: .video, position: AVCaptureDevice.Position.unspecified)
        let device = deviceDiscoverySession.devices.first!
        let deviceInput = try! AVCaptureDeviceInput(device: device)
        captureSession.addInput(deviceInput)
        let highestResolution = self.highestResolution420Format(for: device)!
        try! device.lockForConfiguration()
        device.activeFormat = highestResolution.format
        device.unlockForConfiguration()
        return (device, highestResolution.resolution)
    }

    fileprivate func configureVideoDataOutput(for inputDevice: AVCaptureDevice, resolution: CGSize, captureSession: AVCaptureSession) {
        videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
        captureSession.addOutput(videoDataOutput)
    }

    fileprivate func prepareVisionRequest() {
        let faceDetectionRequest: VNDetectFaceRectanglesRequest = VNDetectFaceRectanglesRequest(completionHandler: { (request, error) in
            print("VNDetectFaceRectanglesRequest completion Handler called")
        })
        // Start with detection
        detectionRequests = [faceDetectionRequest]
    }

    // MARK: AVCaptureVideoDataOutputSampleBufferDelegate
    // Handle delegate method callback on receiving a sample buffer.
    public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        var requestHandlerOptions: [VNImageOption: AnyObject] = [:]
        let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil)
        if cameraIntrinsicData != nil {
            requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData
        }
        let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
        // No tracking object detected, so perform initial detection
        let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: CGImagePropertyOrientation.up, options: requestHandlerOptions)
        try! imageRequestHandler.perform(detectionRequests!)
    }
}

let X = XYZ()
X.viewDidLoad()
sleep(9999999)
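One detail worth checking in the snippet above: highestResolution420Format(for:) returns device.formats[0] without verifying either the resolution or the pixel format. Below is a minimal sketch of an explicit search, meant to be pasted into the same program; it only uses AVFoundation/CoreMedia calls and assumes '420v' (kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange) is the subtype you want. It doesn't explain the MLE5Engine message, but it rules out an unintended capture format being selected.

// Hypothetical replacement for highestResolution420Format(for:): picks the largest '420v' format explicitly.
func bestVideoRange420Format(for device: AVCaptureDevice) -> (format: AVCaptureDevice.Format, resolution: CGSize)? {
    var best: (format: AVCaptureDevice.Format, resolution: CGSize)?
    for format in device.formats {
        // Keep only bi-planar video-range 4:2:0 ('420v') formats.
        guard CMFormatDescriptionGetMediaSubType(format.formatDescription) == kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange else { continue }
        let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
        let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height))
        // Remember the candidate with the largest pixel count.
        if let current = best, current.resolution.width * current.resolution.height >= resolution.width * resolution.height {
            continue
        }
        best = (format, resolution)
    }
    return best
}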
0
0
484
Nov ’25
Visual Intelligence API SemanticContentDescriptor labels are empty
I'm trying to use Apple's new Visual Intelligence API for recommending content through screenshot image search. The problem I encountered is that the SemanticContentDescriptor labels are either completely empty or super misleading, making it impossible to query for similar content on my app. Even the closest matching example was inaccurate, returning a single label ["cardigan"] for a Supreme T-Shirt. I see other apps using this API like Etsy for example, and I'm wondering if they're using the input pixel buffer to query for similar content rather than using the labels? If anyone has a similar experience or something that wasn't called out in the documentation please lmk! Thanks.
1
0
446
Oct ’25
AppShortcuts.xcstrings does not translate each invocation phrase option separately, just the first
Due to our min iOS version, this is my first time using .xcstrings instead of .strings for AppShortcuts. When using the migrate .strings to .xcstrings Xcode context menu option, an .xcstrings catalog is produced that, as expected, has each invocation phrase as a separate string key. However, after compilation, the catalog changes to group all invocation phrases under the first phrase listed for each intent (see attached screenshot). It is possible to hover in blank space on the right and add more translations, but there is no 1:1 key matching requirement to the phrases on the left nor a requirement that there are the same number of keys in one language vs. another. (The lines just happen to align due to my window size.) What does that mean, practically? Do all sub-phrases in each language in AppShortcuts.xcstrings get processed during compilation, even if there isn't an equivalent phrase key declared in the AppShortcut (e.g., the ja translation has more phrases than the English)? (That makes some logical sense, as these phrases need not be 1:1 across languages.) In the AppShortcut declaration, if I delete all but the top invocation phrase, does nothing change with Siri? Is there something I'm doing incorrectly?

struct WatchShortcuts: AppShortcutsProvider {
    static var appShortcuts: [AppShortcut] {
        AppShortcut(
            intent: QuickAddWaterIntent(),
            phrases: [
                "\(.applicationName) log water",
                "\(.applicationName) log my water",
                "Log water in \(.applicationName)",
                "Log my water in \(.applicationName)",
                "Log a bottle of water in \(.applicationName)",
            ],
            shortTitle: "Log Water",
            systemImageName: "drop.fill"
        )
    }
}
0
0
325
Aug ’25
Provide actionable feedback for the Foundation Models framework and the on-device LLM
We are really excited to have introduced the Foundation Models framework in WWDC25. When using the framework, you might have feedback about how it can better fit your use cases. Starting in macOS/iOS 26 Beta 4, the best way to provide feedback is to use #Playground in Xcode. To do so: In Xcode, create a playground using #Playground. For more information, see Running code snippets using the playground macro. Reproduce the issue by setting up a session and generating a response with your prompt. In the canvas on the right, click the thumbs-up icon to the right of the response. Follow the instructions on the pop-up window and submit your feedback by clicking Share with Apple. Another way to provide your feedback is to file a feedback report with relevant details. Specific to the Foundation Models framework, it’s super important to add the following information in your report: Language model feedback. This feedback contains the session transcript, including the instructions, the prompts, the responses, etc. Without that, we can’t reason about the model’s behavior, and hence can hardly take any action. Use logFeedbackAttachment(sentiment:issues:desiredOutput:) to retrieve the feedback data of your current model session, as shown in the usage example, write the data into a file, and then attach the file to your feedback report. If you believe what you’d report is related to the system configuration, please capture a sysdiagnose and attach it to your feedback report as well. The framework is still new. Your actionable feedback helps us evolve the framework quickly, and we appreciate that. Thanks, The Foundation Models framework team
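As a sketch of the second path (attaching the feedback data to a report), something like the following can write the attachment to disk. The .negative sentiment case and the Data return type here are assumptions based on the usage example linked from the documentation, so double-check the exact signature in your SDK.

import Foundation
import FoundationModels

// `session` is the LanguageModelSession whose transcript you want to report.
// Sentiment case and return type are assumed from the usage example; adjust as needed.
func writeFeedbackAttachment(for session: LanguageModelSession, to url: URL) throws {
    let data = session.logFeedbackAttachment(
        sentiment: .negative,   // the sentiment that matches your report
        issues: [],             // optionally list specific issues observed
        desiredOutput: nil      // optionally supply the output you expected instead
    )
    try data.write(to: url)     // attach this file to the feedback report
}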
0
0
879
Aug ’25
Updated DetectHandPoseRequest revision from WWDC25 doesn't exist
I watched this year's WWDC25 session "Read Documents using the Vision framework". At the end of the video there is a mention of a new DetectHandPoseRequest model for hand pose detection in the Vision API. I looked at Apple's documentation and I don't see a new revision. Moreover, there is probably a typo in the video, because there is only DetectHumanHandPoseRequest (Swift-based) and VNDetectHumanHandPoseRequest (Objective-C-based) (notice the lack of the "Human" prefix in the WWDC video). The first one only has a revision added in iOS 18+: https://aninterestingwebsite.com/documentation/vision/detecthumanhandposerequest/revision-swift.enum/revision1 The second one only has a revision added in iOS 14+: https://aninterestingwebsite.com/documentation/vision/vndetecthumanhandposerequestrevision1 I don't see any new revision targeting iOS 26+.
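For completeness, a minimal sketch of running the Swift-based request that is documented today. This assumes the iOS 18+ Vision Swift API (DetectHumanHandPoseRequest with perform(on:)) and does not reference any iOS 26 revision, since none appears in the documentation.

import Vision

// Assumes the iOS 18+ Swift Vision API; `cgImage` is any CGImage you already have.
func detectHandPoses(in cgImage: CGImage) async throws -> [HumanHandPoseObservation] {
    var request = DetectHumanHandPoseRequest()
    request.maximumHandCount = 2              // look for up to two hands
    return try await request.perform(on: cgImage)
}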
0
0
165
Oct ’25
get error with xcode beta3 :decodingFailure(FoundationModels.LanguageModelSession.GenerationError.Context
@Generable enum Breakfast {
    case waffles
    case pancakes
    case bagels
    case eggs
}

do {
    let session = LanguageModelSession()
    let userInput = "I want something sweet."
    let prompt = "Pick the ideal breakfast for request: \(userInput)"
    let response = try await session.respond(to: prompt, generating: Breakfast.self)
    print(response.content)
} catch let error {
    print(error)
}

I want to test the @Generable demo but get the error below: decodingFailure(FoundationModels.LanguageModelSession.GenerationError.Context(debugDescription: "Failed to convert text into into GeneratedContent\nText: waffles", underlyingErrors: [Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Unexpected character 'w' around line 1, column 1." UserInfo={NSJSONSerializationErrorIndex=0, NSDebugDescription=Unexpected character 'w' around line 1, column 1.})))]))
1
0
138
Jul ’25
Is it possible to pass the streaming output of Foundation Models down a function chain
I am writing a custom package wrapping Foundation Models which provides a chain-of-thought with intermittent self-evaluation, among other things. At first I was designing this package with the command line in mind, but after seeing how well it augments the models and makes them more intelligent I wanted to try and build a SwiftUI wrapper around the package. When I started I was using synchronous generation rather than streaming, but to give the best user experience (as I've seen in the WWDC sessions) it is necessary to provide constant feedback to the user that something is happening. I have created a super simplified example of my setup so it's easier to understand. First, there is the Reasoning conversation item, which can be converted to an XML representation which is then fed back into the model (I've found XML works best for structured input):

public typealias ConversationContext = XMLDocument

extension ConversationContext {
    public func toPlainText() -> String {
        return xmlString(options: [.nodePrettyPrint])
    }
}

/// Represents a reasoning item in a conversation, which includes a title and reasoning content.
/// Reasoning items are used to provide detailed explanations or justifications for certain decisions or responses within a conversation.
@Generable(description: "A reasoning item in a conversation, containing content and a title.")
struct ConversationReasoningItem: ConversationItem {
    @Guide(description: "The content of the reasoning item, which is your thinking process or explanation")
    public var reasoningContent: String
    @Guide(description: "A short summary of the reasoning content, digestible in an interface.")
    public var title: String
    @Guide(description: "Indicates whether reasoning is complete")
    public var done: Bool
}

extension ConversationReasoningItem: ConversationContextProvider {
    public func toContext() -> ConversationContext {
        // <ReasoningItem title="${title}">
        //   ${reasoningContent}
        // </ReasoningItem>
        let root = XMLElement(name: "ReasoningItem")
        root.addAttribute(XMLNode.attribute(withName: "title", stringValue: title) as! XMLNode)
        root.stringValue = reasoningContent
        return ConversationContext(rootElement: root)
    }
}

Then there is the generator, which creates a reasoning item from a user query and previously generated items:

struct ReasoningItemGenerator {
    var instructions: String {
        """
        <omitted for brevity>
        """
    }

    func generate(from input: (String, [ConversationReasoningItem])) async throws -> sending LanguageModelSession.ResponseStream<ConversationReasoningItem> {
        let session = LanguageModelSession(instructions: instructions)
        // build the context for the reasoning item out of the user's query and the previous reasoning items
        let userQuery = "User's query: \(input.0)"
        let reasoningItemsText = input.1.map { $0.toContext().toPlainText() }.joined(separator: "\n")
        let context = userQuery + "\n" + reasoningItemsText
        let reasoningItemResponse = try await session.streamResponse(
            to: context,
            generating: ConversationReasoningItem.self)
        return reasoningItemResponse
    }
}

I'm not sure if returning LanguageModelSession.ResponseStream<ConversationReasoningItem> is the right move; I am just trying to imitate what session.streamResponse returns. Then there is the orchestrator, which I can't figure out. It receives the streamed ConversationReasoningItems from the generator and is responsible for streaming those to SwiftUI later, and also for evaluating each reasoning item after it is complete to see if it needs to be regenerated (to keep the model on-track).
I want the users of the orchestrator to receive partially generated reasoning items as they are being generated by the generator. Later, when they finish, if the evaluation passes, the item is kept, but if it fails, the reasoning item should be removed from the stream before a new one is generated. So in-flight reasoning items should be outputted aggressively. I really am having trouble figuring this out, so if someone with more knowledge about asynchronous stuff in Swift, or, even better, someone who has worked on the Foundation Models framework could point me in the right direction, that would be awesome!
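Not an authoritative answer, but one generic pattern for the orchestrator described above is to re-yield the upstream sequence into your own AsyncThrowingStream and publish explicit events, so partial items go out immediately and a failed item can be retracted afterwards. This sketch only assumes that the stream returned by the generator conforms to AsyncSequence; the OrchestratorEvent type is invented for illustration.

// Hypothetical event type for illustration: either a partial/complete item, or a retraction.
enum OrchestratorEvent<Item: Sendable>: Sendable {
    case partial(Item)      // an in-flight, partially generated item
    case finished(Item)     // an item that passed evaluation
    case retracted          // the last in-flight item failed evaluation and should be removed
}

// Relays any upstream AsyncSequence into a stream of events the UI can consume.
// `evaluate` stands in for the self-evaluation step described above.
func relay<Upstream: AsyncSequence & Sendable>(
    _ upstream: Upstream,
    evaluate: @escaping @Sendable (Upstream.Element) async -> Bool
) -> AsyncThrowingStream<OrchestratorEvent<Upstream.Element>, Error> where Upstream.Element: Sendable {
    AsyncThrowingStream { continuation in
        let task = Task {
            do {
                var latest: Upstream.Element?
                for try await partial in upstream {
                    latest = partial
                    continuation.yield(.partial(partial))      // forward in-flight items aggressively
                }
                if let final = latest {
                    if await evaluate(final) {
                        continuation.yield(.finished(final))   // evaluation passed: keep the item
                    } else {
                        continuation.yield(.retracted)         // evaluation failed: tell the UI to drop it
                    }
                }
                continuation.finish()
            } catch {
                continuation.finish(throwing: error)
            }
        }
        continuation.onTermination = { _ in task.cancel() }
    }
}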
0
0
288
Jul ’25
Apple on-device AI models
Hello, I am studying macOS 26 Apple Intelligence features. I have created a basic Swift program with Xcode. This program sends prompts to FoundationModels.LanguageModelSession. It works fine, but this model is not trained for programming or code completion. Xcode has an AI code completion feature, called the "Predictive code completion model". So there are multiple on-device models on macOS 26? Are there others? Is there a way for me to send prompts to this predictive code completion model from my program? Thanks
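For context, the plain FoundationModels path the post describes looks roughly like the sketch below; nothing here implies the predictive code completion model is exposed as an API, and the prompt is just an illustration.

import FoundationModels

// Minimal sketch of prompting the on-device system language model, as described above.
func ask(_ question: String) async throws -> String {
    let session = LanguageModelSession()
    let response = try await session.respond(to: question)
    return response.content
}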
1
0
324
Oct ’25
InferenceError with Apple Foundation Model – Context Length Exceeded on macOS 26.0 Beta
Hello Team, I'm currently working on a proof of concept using Apple's Foundation Model for a RAG-based chat system on my MacBook Pro with the M1 Max chip. Environment details: macOS: 26.0 Beta Xcode: 26.0 beta 2 (17A5241o) Target platform: iPad (as the iPhone simulator does not support Foundation models) While testing, even with very small input prompts to the LLM, I intermittently encounter the following error: InferenceError::inference-Failed::Failed to run inference: Context length of 4096 was exceeded during singleExtend. Has anyone else experienced this issue? Are there known limitations or workarounds for context length handling in this setup? Any insights would be appreciated. Thank you!
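In case it helps others hitting the same limit, here is a hedged sketch of catching the context-window error and retrying with a fresh session. The exceededContextWindowSize case name is an assumption based on the documented GenerationError cases, and condensing the old transcript (typical for RAG) is left out.

import FoundationModels

// Assumes GenerationError.exceededContextWindowSize exists; verify against your SDK version.
func respondWithRecovery(to prompt: String, in session: LanguageModelSession) async throws -> (content: String, session: LanguageModelSession) {
    do {
        return (try await session.respond(to: prompt).content, session)
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // The transcript no longer fits in the context window: start a fresh session,
        // optionally seeding it with a condensed summary of the old transcript.
        let fresh = LanguageModelSession()
        return (try await fresh.respond(to: prompt).content, fresh)
    }
}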
3
0
294
Jul ’25
After loading my custom model - unsupportedTokenizer error
In Oct '25, using mlx_lm.lora I created an adapter and a fused model uploaded to Hugging Face. I was able to incorporate this model into my SwiftUI app using the MLX package (MLX libraries 2.25.8). My base LLM was mlx-community/Mistral-7B-Instruct-v0.3-4bit. Looking at LLMModelFactory.swift in the current version 2.29.1, the only changes are the addition of a few models. The earlier model was called pharmpk/pk-mistral-7b-v0.3-4bit. The new model is called pharmpk/pk-mistral-2026-03-29. The base model (mlx-community/Mistral-7B-Instruct-v0.3-4bit) must still be available. Could the 'unsupportedTokenizer' error be related to changes in the MLX package? I noticed mention of splitting the package into two parts but don't see anything on GitHub. Feeling rather lost. Does anyone have any thoughts and/or suggestions? Thanks, David
1
0
176
1w
Request: Official One-Click Local LLM Deployment for 2019 Mac Pro (7,1) Dual W6900X
I am a professional user of the 2019 Mac Pro (7,1) with dual AMD Radeon Pro W6900X MPX modules (32GB VRAM each). This hardware is designed for high-performance compute, but it is currently crippled for modern local LLM/AI workloads under Linux due to Apple's EFI/PCIe routing restrictions. Core Issue: rocminfo reports "No HIP GPUs available" when attempting to use ROCm/amdgpu on Linux Apple's custom EFI firmware blocks full initialization of professional GPU compute assets The dual W6900X GPUs have 64GB combined VRAM and high-bandwidth Infinity Fabric Link, but cannot be fully utilized for local AI inference/training My Specific Request: Apple should provide an official, one-click deployable application that enables full utilization of dual W6900X GPUs for local large language model (LLM) inference and training under Linux. This application must: Fully initialize both W6900X GPUs via HIP/ROCm, establishing valid compute contexts Bypass artificial EFI/PCIe routing restrictions that block access to professional GPU resources Provide a stable, user-friendly one-click deployment experience (similar to NVIDIA's AI Enterprise or AMD's ROCm Hub) Why This Matters: The 2019 Mac Pro is Apple's flagship professional workstation, marketed for compute-intensive workloads. Its high-cost W6900X GPUs should not be locked down for modern AI/LLM use cases. An official one-click deployment solution would demonstrate Apple's commitment to professional AI and unlock significant value for professional users. I look forward to Apple's response and a clear roadmap for enabling this critical capability. #MacPro #Linux #ROCm #LocalLLM #W6900X #CoreML
0
0
92
1w
Plenty of LanguageModelSession.GenerationError.refusal errors after 26.4 update
Hello! After the 26.4 update I get a huge number of LanguageModelSession.GenerationError.refusal errors when using guided generation Generables, for inexplicable reasons. Such errors also occur if I want to cast a response to a Boolean by using 'generating: Bool.self'. The explanation generated on the grounds of the error always looks like this: Response(userPrompt: "", duration: 0.230917542, promptTokenCount: Optional(66), responseTokenCount: Optional(11), feedbackAttachment: nil, content: "I apologize, but I cannot fulfill this request.", rawContent: "I apologize, but I cannot fulfill this request.", transcriptEntries: ArraySlice([])) All the prompts and Generables I use are definitely not profane. Before 26.4, such errors never occurred on the same prompts and Generables. The 26.4 update rendered those features unusable for me. Is this a known bug, or what am I doing wrong?
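Not a fix, but to isolate the behavior it can help to catch the refusal case explicitly and log the prompt for a feedback report. A minimal sketch that only assumes the GenerationError.refusal case named above:

import FoundationModels

// Surfaces refusals separately from other generation errors; returns nil when the model refuses.
func generateFlag(for prompt: String) async throws -> Bool? {
    let session = LanguageModelSession()
    do {
        return try await session.respond(to: prompt, generating: Bool.self).content
    } catch LanguageModelSession.GenerationError.refusal {
        // Log enough context to attach to a feedback report, then treat the result as "no answer".
        print("Guided generation refused for prompt: \(prompt)")
        return nil
    }
}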
3
0
463
1w
Powermetrics GPU power vs system DC power discrepancy on M4 Max
While analyzing system power on an M4 Max under GPU-heavy compute workloads, I noticed that the GPU power reported by powermetrics does not come anywhere close to the total system DC power reported by the SMC counter PDTR (as used by utilities like mactop). For example, in a heavy GPU workload, powermetrics would report a 65 W idle-to-load delta on the GPU, but at the same time system DC power would rise by 179 W, leaving 114 W, or nearly 2/3 of total system DC power on a Mac Studio M4 Max, unexplained. From measurements, the difference appears to correlate with the amount of on-chip data movement (for example, varying bytes-per-FLOP in the workload changes the observed gap). Using SMC and IOReport, I was able to reverse engineer an energy model for the GPU that explains almost all of the energy flow with less than 2% error on the workload I studied. The result is a simple two-term energy roofline model: P_GPU (the GPU_combined term in the plot) ≈ a * bytes + b * FLOPs, with ~5 pJ/byte for SRAM movement and ~2.7 pJ/FLOP for compute. Has anyone observed similar behavior, or is there guidance on how GPU power reported by IOReport/powermetrics should be interpreted relative to total system power? In particular, I’m interested in whether certain classes of GPU activity may not be attributed to the GPU component in IOReport. Full details with the methodology and results are available here: https://youtu.be/HKxIGgyeISM
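To make the two-term model concrete, here is a small numeric sketch using the coefficients quoted above; the byte and FLOP rates are made-up values chosen only to illustrate the arithmetic, not measurements.

// Two-term energy roofline from the post: P_GPU ≈ a * bytes/s + b * FLOP/s.
let picojoulesPerByte = 5.0        // ~5 pJ per byte of on-chip SRAM movement
let picojoulesPerFLOP = 2.7        // ~2.7 pJ per FLOP of compute

// Hypothetical sustained rates for a GPU-heavy workload (illustrative only).
let sramBytesPerSecond = 20e12     // 20 TB/s of aggregate on-chip data movement
let flopsPerSecond = 25e12         // 25 TFLOP/s

// 1 W = 1e12 pJ/s, so divide the pJ/s totals by 1e12 to get watts.
let movementWatts = picojoulesPerByte * sramBytesPerSecond / 1e12   // 100 W
let computeWatts  = picojoulesPerFLOP * flopsPerSecond / 1e12       // 67.5 W
print("Estimated GPU power: \(movementWatts + computeWatts) W")     // 167.5 W

On these made-up rates, most of the estimated power comes from data movement rather than FLOPs, which is consistent with the gap described above.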
0
0
76
2w
Can MPSGraphExecutable automatically leverage Apple Neural Engine (ANE) for inference?
Hi, I'm currently using Metal Performance Shaders Graph (MPSGraphExecutable) to run neural network inference operations as part of a Metal rendering pipeline. I also tried to profile the usage of the Neural Engine when running inference using MPSGraphExecutable, but the trace shows no sign of Neural Engine usage. However, when I used the Core ML model inspection tool in Xcode and ran a performance report, it was able to use the ANE. Does MPSGraphExecutable automatically utilize the Apple Neural Engine (ANE) when running inference operations, or does it only execute on the GPU? My model (Core ML Package) was converted from a PyTorch model using coremltools with the ML program type and supports iOS 17.0+. Any insights or documentation references would be greatly appreciated!
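For comparison with the Xcode performance report mentioned above, here is a minimal sketch of loading the converted model through the Core ML API with all compute units allowed, which is the path where ANE dispatch is handled for you. It does not answer whether MPSGraphExecutable itself can target the ANE.

import CoreML

// Loads a compiled Core ML model with CPU, GPU and Neural Engine all permitted.
func loadModelAllowingANE(at compiledModelURL: URL) throws -> MLModel {
    let configuration = MLModelConfiguration()
    configuration.computeUnits = .all   // let Core ML schedule eligible layers on the ANE
    return try MLModel(contentsOf: compiledModelURL, configuration: configuration)
}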
0
0
491
Nov ’25
Asking about computers: model always refers to apple.com?
Here's the result: Very weird.
5
0
189
Jul ’25
Error Domain=NSOSStatusErrorDomain Code=-1 "kCFStreamErrorHTTPParseFailure / kCFSocketError / kCFStreamErrorDomainCustom / kCSIdentityUnknownAuthorityErr / qErr / telGenericError / dsNoExtsMacsBug / kMovieLoadStateError / cdevGenErr: Could not parse
I can't run Create ML for training. I upgraded to macOS 26.3 beta, and I have tried both older and newer versions.
0
0
243
Mar ’26
Unable to use ChatGPT in Xcode
When I use ChatGPT in Xcode, the following error is displayed: It was working fine before, but suddenly it became like this, without changing any configuration. Why?
2
0
378
Jul ’25
face and body detection is local model or a cloud model?
Is the face and body detection service in the Vision framework a local model or a cloud model? https://aninterestingwebsite.com/documentation/vision
1
0
746
Sep ’25
Accessibility & Inclusion
When the system language and the Siri language are not the same, Apple Intelligence may not be usable. For example, if the system is in English and Siri is in Chinese, Apple Intelligence may not work. May I ask whether there are other reasons why it still cannot be used in the app even after enabling Apple Intelligence?
0
0
499
Dec ’25
Gemini 2.5 Flash with JSON
I am using gemini2.5-flash with SwiftUI. How can I receive a response in JSON?
0
0
210
Jul ’25