Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

All subtopics
Posts under Media Technologies topic

Post

Replies

Boosts

Views

Activity

Indicate Packet Loss With AVAudioConverter for OPUS Decoding
I'm using an AVAudioConverter object to decode an OPUS stream for VoIP. The decoding itself works well, however, whenever the stream stalls (no more audio packet is available to decode because of network instability) this can be heard in crackling / abrupt stop in decoded audio. OPUS can mitigate this by indicating packet loss by passing a null pointer in the C-library to int opus_decode_float (OpusDecoder * st, const unsigned char * data, opus_int32 len, float * pcm, int frame_size, int decode_fec), see https://opus-codec.org/docs/opus_api-1.2/group__opus__decoder.html#ga9c554b8c0214e24733a299fe53bb3bd2. However, with AVAudioConverter using Swift I'm constructing an AVAudioCompressedBuffer like so:         let compressedBuffer = AVAudioCompressedBuffer(             format: VoiceEncoder.Constants.networkFormat,             packetCapacity: 1,             maximumPacketSize: data.count         )         compressedBuffer.byteLength = UInt32(data.count)         compressedBuffer.packetCount = 1   compressedBuffer.packetDescriptions! .pointee.mDataByteSize = UInt32(data.count)         data.copyBytes(             to: compressedBuffer.data .assumingMemoryBound(to: UInt8.self),             count: data.count         ) where data: Data contains the raw OPUS frame to be decoded. How can I specify data loss in this context and cause the AVAudioConverter to output PCM data whenever no more input data is available? More context: I'm specifying the audio format like this:         static let frameSize: UInt32 = 960         static let sampleRate: Float64 = 48000.0         static var networkFormatStreamDescription = AudioStreamBasicDescription(             mSampleRate: sampleRate,             mFormatID: kAudioFormatOpus,             mFormatFlags: 0,             mBytesPerPacket: 0,             mFramesPerPacket: frameSize,             mBytesPerFrame: 0,             mChannelsPerFrame: 1,             mBitsPerChannel: 0,             mReserved: 0         )         static let networkFormat = AVAudioFormat( streamDescription: &networkFormatStreamDescription )! I've tried 1) setting byteLength and packetCount to zero and 2) returning nil but setting .haveData in the AVAudioConverterInputBlock I'm using with no success.
1
1
928
May ’25
AVAudioEngine fails to start during FaceTime call (error 2003329396)
Is it possible to perform speech-to-text using AVAudioEngine to capture microphone input while being on a FaceTime call at the same time? I tried implementing this, but whenever I attempt to start the  AVAudioEngine  while a FaceTime call is active, I get the following error: “The operation couldn’t be completed. (OSStatus error 2003329396)” I assume this might be due to microphone resource restrictions during FaceTime, but I’d like to confirm whether this limitation is at the system level or if there’s any possible workaround or entitlement that allows concurrent microphone access. Has anyone encountered this issue or found a solution?
2
1
755
2w
Inquiries about AVPlayer's ABR switching logic and control APIs for HLS/LL-HLS
Hello, I am developing a custom player SDK using AVPlayer to support HLS and LL-HLS live streaming. I have some questions about the internal logic of AVPlayer regarding ABR, as this information is not explicitly covered in the documentation. ABR Switching Logic: Does AVPlayer trigger bitrate switching primarily based on stall occurrences (buffer starvation)? I am curious if the switching logic is reactive to stalls or if it proactively switches to prevent them based on throughput estimation. Developer Controls for ABR: To influence or control the ABR selection, are preferredPeakBitRate and preferredForwardBufferDuration the only properties available to developers? Are there any other recommended APIs to assist with ABR decisions? Thank you for your help.
2
0
515
Jan ’26
iOS Speech Error on Mobile Simulator (Error fetching voices)
I'm writing a simple app for iOS and I'd like to be able to do some text to speech in it. I have a basic audio manager class with a "speak" function: import Foundation import AVFoundation class AudioManager { static let shared = AudioManager() var audioPlayer: AVAudioPlayer? var isPlaying: Bool { return audioPlayer?.isPlaying ?? false } var playbackPosition: TimeInterval = 0 func playSound(named name: String) { guard let url = Bundle.main.url(forResource: name, withExtension: "mp3") else { print("Sound file not found") return } do { if audioPlayer == nil || !isPlaying { audioPlayer = try AVAudioPlayer(contentsOf: url) audioPlayer?.currentTime = playbackPosition audioPlayer?.prepareToPlay() audioPlayer?.play() } else { print("Sound is already playing") } } catch { print("Error playing sound: \(error.localizedDescription)") } } func stopSound() { if let player = audioPlayer { playbackPosition = player.currentTime player.stop() } } func speak(text: String) { let synthesizer = AVSpeechSynthesizer() let utterance = AVSpeechUtterance(string: text) utterance.voice = AVSpeechSynthesisVoice(language: "en-GB") synthesizer.speak(utterance) } } And my app shows text in a ScrollView: ScrollView { Text(self.description) .padding() .foregroundColor(.black) .font(.headline) .background(Color.gray.opacity(0)) }.onAppear { AudioManager.shared.speak(text: self.description) } However, the text doesn't get read out (in the simulator). I see some output in the console: Error fetching voices: Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "Invalid container metadata for _UnkeyedDecodingContainer, found keyedGraphEncodingNodeID", underlyingError: nil)). Using fallback voices. I'm probably doing something wrong here, but not sure what.
5
1
698
Dec ’25
AVAudioFile.read extremely slow after seeking in FLAC and MP3 files
I'm developing an audio player app that uses AVAudio​File to read PCM data from various formats. I'm experiencing severe performance issues when seeking in FLAC, while other compressed formats (M4A/AAC) work correctly. I don't intend to use them in my app, but I also tested mp3 files just by curiosity and they also have this issue. Environment: macOS 26 (Tahoe) Xcode 26.3 Apple Silicon (M1) The issue: After setting AVAudio​File​.frame​Position to a position mid-file, the subsequent call to AVAudio​File​.read(into​:frame​Count:) blocks for an unreasonable amount of time for FLAC and MP3 files. The delay scales linearly with the seek target, seeking near the beginning is fast, seeking toward the end is proportionally slower, which suggests the decoder is decoding linearly from the beginning of the file rather than using any seek index. (My app deals with “images” of Audio CDs ripped as a single long audio file.) The issue is particularly severe when reading files from an SMB network share (server on Ethernet, client on Wi-Fi with the access point ~2 meters away in line of sight). Quick Benchmark results: I tested with the same 75-minute audio content (16-bit/44.1 kHz stereo, 200,502,708 frames) encoded in five formats, seeking to the midpoint. Over SMB (Local Network, Server on Ethernet, Client on WiFi): Format | Seek + Read Time ----------|------------------ WAV | 0.007 s AIFF | 0.009 s Apple | 0.015 s Lossless | MP3 | 9.2 s FLAC | 30.2 s Locally (MacBook Air M1 SSD) : Format | Seek + Read Time ----------|------------------ WAV | 0.0005 s AIFF | 0.0004 s Apple | 0.0011 s Lossless | MP3 | 0.1958 s FLAC | 0.7528 s WAV, AIFF, and M4A all seek virtually instantly (< 15 ms). MP3 and FLAC exhibit linear-time behavior, with FLAC being the worst affected. Note that M4A (AAC) is also a compressed format that requires decoding after seeking, yet it completes in 15 ms. This rules out any inherent limitation of compressed formats, the MP4 container's packet index (stts/stco) is clearly being used for fast random access. Both MP3 (Xing/LAME TOC) and FLAC (SEEKTABLE metadata block) have their own seek mechanisms that should provide similar performance. Minimal CLI tool to reproduce: import Foundation guard CommandLine.arguments.count > 1 else { print("Usage: FLACSpeed <audio-file-path>") exit(1) } let path = CommandLine.arguments[1] let fileURL = URL(fileURLWithPath: path) do { let file = try AVAudioFile(forReading: fileURL) let format = file.processingFormat let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: 8192)! let totalFrames = file.length let seekTarget = totalFrames / 2 print("File: \(fileURL.lastPathComponent)") print("Format: \(format)") print("Total frames: \(totalFrames)") print("Seeking to frame: \(seekTarget)") file.framePosition = seekTarget let start = CFAbsoluteTimeGetCurrent() try file.read(into: buffer, frameCount: 8192) let elapsed = CFAbsoluteTimeGetCurrent() - start print("Read after seek took \(elapsed) seconds") } catch { print("Error: \(error.localizedDescription)") exit(1) } Expected behavior: AVAudio​File​.read(into​:frame​Count:) after setting frame​Position should use the available seek mechanisms in FLAC and MP3 files for fast random access, as it already does for M4A (AAC). Even accounting for the fact that seek tables provide approximate (not sample-precise) positioning, the "jump to nearest index point + decode forward" approach should complete in milliseconds, not seconds. Workaround: For FLAC, I've worked around this by using libFLAC directly, which provides instant seeking via FLAC__stream​_decoder​_seek​_absolute(). libFLAC Performance: For comparison, libFLAC's FLAC__stream​_decoder​_seek​_absolute() performs the same seek + read on the same FLAC file in around 0.015, using the FLAC seek table to jump to the nearest preceding seek point, then decoding forward a small number of frames to the exact target sample.
1
1
188
1d
AVAudioRecorder loses audio recorded before interruption
Hi everyone, I'm running into an issue with AVAudioRecorder when handling interruptions such as phone calls or alarms. Problem: When the app is recording audio and an interruption occurs: I handle the interruption with audioRecorder?.pause() inside AVAudioSession.interruptionNotification (on .began). On .ended, I check for .shouldResume and call audioRecorder?.record() again. The recorder resumes successfully, but only the audio recorded after the interruption is saved. The audio recorded before the interruption is lost, even though I'm using the same file URL and not recreating the recorder. Repro: Start a recording with AVAudioRecorder Simulate a system interruption (e.g., incoming call) Resume recording after the interruption Stop and inspect the output audio file Expected: Full audio (before and after interruption) should be saved. Actual: Only the audio after interruption is saved; the earlier part is missing Notes: According to the documentation, calling .record() after .pause() should resume recording into the same file. I confirmed that the file URL does not change, and I do not recreate the recorder instance. No error is thrown by the system during this process. This behavior happens consistently when the app is interrupted and resumed. Question: Is this a known issue? Is there a recommended workaround for preserving the full recording when interruptions happen? Thanks in advance!
1
1
393
Dec ’25
Orientation does not work on iPhone 17 and above.
I'm receiving output from avcapturesession and capturing an image using Vision, but the image is output in landscape orientation instead of portrait. Even when I set the orientation to up in ciimage, cgimage, and uiimage, the image is still output in landscape orientation. On iPhones 16 and below, the image is output in portrait orientation. But on iPhones 17 and above, the image is output in landscape orientation. Please help.
1
1
433
Dec ’25
Dockkit custom tracking does not work on iOS18.3
Hi all, I'm using Apple Sample Code below to create application using dockkit. "Controlling a DockKit accessory using your camera app" https://aninterestingwebsite.com/documentation/dockkit/controlling-a-dockkit-accessory-using-your-camera-app?changes=_8 I used vision hand recognition and put the observation data to dockAccessory.track, but Belkin or Insta360 devices never move on iPhone 16 Pro Max with iOS 18.3. If I use other functions like face search (system tracking) in the app, those work ok. I used Belkin and Insta360 Flow 2 Pro to reproduce the problem. My friend is also saying that the custom tracking feature was working fine on the OS 18 beta, but on recent iOS 18.3 that feature does not work. If I can get the iOS 18.0 beta then we can test that feature. But I cannot revert my iOS from 18.3 to the iOS 18.0 Beta. Regards, TO
1
1
348
Dec ’25
Offline Fairplay Error -42650
We have implemented offline Fairplay playback and it works fine.But at times when trying to playback the offline downloaded content, we get the following error"An unknown error occured (-42650)"Tried looking up the error in the documentation but couldnt find anything relevant.What could possibly be creating this error?
6
2
3.2k
4w
Swift Array Out of Bounds Crash in VTFrameProcessor when using VTLowLatencyFrameInterpolationParameters
Hi everyone, Our team is encountering a reproducible crash when using VTLowLatencyFrameInterpolation on iOS 26.3 while processing a live LL-HLS input stream. 🤖 Environment Device: iPhone 16 OS: iOS 26.3 Xcode: Xcode 26.3 Framework: VideoToolbox 💥 Crash Details The application crashes with the following fatal error: Fatal error: Swift/ContiguousArrayBuffer.swift:184: Array index out of range The stack trace highlights the following: VTLowLatencyFrameInterpolationImplementation processWithParameters:frameOutputHandler: Called from VTFrameProcessor.process(parameters:) Here is the simplified implementation block where the crash occurs. (Note: PrismSampleBuffer and PrismLLFIError are our internal custom wrapper types). // Create `VTFrameProcessorFrame` for the source (previous) frame. let sourcePTS = sourceSampleBuffer.presentationTimeStamp var sourceFrame: VTFrameProcessorFrame? if let pixelBuffer = sourceSampleBuffer.imageBuffer { sourceFrame = VTFrameProcessorFrame(buffer: pixelBuffer, presentationTimeStamp: sourcePTS) } // Validate the source VTFrameProcessorFrame. guard let sourceFrame else { throw PrismLLFIError.missingImageBuffer } // Create `VTFrameProcessorFrame` for the next frame. let nextPTS = nextSampleBuffer.presentationTimeStamp var nextFrame: VTFrameProcessorFrame? if let pixelBuffer = nextSampleBuffer.imageBuffer { nextFrame = VTFrameProcessorFrame(buffer: pixelBuffer, presentationTimeStamp: nextPTS) } // Validate the next VTFrameProcessorFrame. guard let nextFrame else { throw PrismLLFIError.missingImageBuffer } // Calculate interpolation intervals and allocate destination frame buffers. let intervals = interpolationIntervals() let destinationFrames = try framesBetween(firstPTS: sourcePTS, lastPTS: nextPTS, interpolationIntervals: intervals) let interpolationPhase: [Float] = intervals.map { Float($0) } // Create VTLowLatencyFrameInterpolationParameters. // This sets up the configuration required for temporal frame interpolation between the previous and current source frames. guard let parameters = VTLowLatencyFrameInterpolationParameters( sourceFrame: nextFrame, previousFrame: sourceFrame, interpolationPhase: interpolationPhase, destinationFrames: destinationFrames ) else { throw PrismLLFIError.failedToCreateParameters } try await send(sourceSampleBuffer) // Process the frames. // Using progressive callback here to get the next processed frame as soon as it's ready, // preventing the system from waiting for the entire batch to finish. for try await readOnlyFrame in self.frameProcessor.process(parameters: parameters) { // Create an interpolated sample buffer based on the output frame. let newSampleBuffer: PrismSampleBuffer = try readOnlyFrame.frame.withUnsafeBuffer { pixelBuffer in try PrismLowLatencyFrameInterpolation.createSampleBuffer(from: pixelBuffer, readOnlyFrame.timeStamp) } // Pass the newly generated frame to the output stream. try await send(newSampleBuffer) } 🙋 Questions Are there any known limitations or bugs regarding VTLowLatencyFrameInterpolation when handling live 60fps streams? Are there any undocumented constraints we should be aware of regarding source/previous frame timing, pixel buffer attributes, or how destinationFrames and interpolationPhase arrays must be allocated? Is a "warm-up" sequence recommended after startSession() before making the first process(parameters:) call?
1
0
472
3w
MusicKit - ApplicationMusicPlayer fails to play certain Songs
[Note: this issue was happening on a main testing device, and after testing the same code on other devices, this issue is only happening on 1 out of 4 devices] We are successfully getting a MusicCatalogResourceResponse for every song ID where we make the MusicCatalogResourceRequest. We are able to display the song title, artist name, and album artwork for each Song in the response. However - when we go to play the song, there are some songs that play, and several songs that do not play. For the songs that don't play, the console shows “Failed to prepareToPlay error=<MPMusicPlayerControllerErrorDomain.6 "Failed to prepare to play" {}>” let musicPlayer = ApplicationMusicPlayer.shared func playSong(_ song: Song) { musicPlayer.queue = [song] Task { try await musicPlayer.prepareToPlay() try await musicPlayer.play() } Is there anything else we can investigate about what may be causing specific song IDs not to play on this specific device? Even if we remove line 6 musicPlayer.prepareToPlay() we still see the same console error when running playSong with the Songs that don't work. It is always the same song IDs that we can play and always the same song IDs that we cannot get to play, even trying them across different projects with different bundle identifiers. We can tap to play a song that works, and it starts playing immediately. Then tap a song that doesn't work, and nothing happens. Then back to a song that works. It's consistent which songs succeed and fail on this device. Perhaps there is an issue specific to this very iPad when it comes to certain specific songs, but we'd like to be confident that an app relying on MusicKit will be able to play songs that have been successfully loaded with a MusicCatalogResourceResponse. Thanks for any help or suggestions about what we may be able to investigate further on the device or what we should consider when launching an app that expects anyone with Apple Music to be able to listen to any of the songs loaded by the app. Specific iPad details: iPad Pro (12.9-inch) (6th generation) running iPadOS 26.4 Beta Two of the song IDs that won't play on this iPad (even though we can access and display their album artwork and all other information): 943204000 and 1441164805
2
1
216
Mar ’26
Apple Music playlist create/delete works but DELETE returns 401 — and MusicKit write APIs are macOS‑unavailable. How to build a playlist editor on macOS?
I’m trying to build a playlist editor on macOS. I can create playlists via the Apple Music HTTP API, but DELETE always returns 401 even immediately after creation with the same tokens. Minimal repro: #!/usr/bin/env bash set -euo pipefail BASE_URL="https://api.music.apple.com/v1" PLAYLIST_NAME="${PLAYLIST_NAME:-blah}" : "${APPLE_MUSIC_DEV_TOKEN:?}" : "${APPLE_MUSIC_USER_TOKEN:?}" create_body="$(mktemp)" delete_body="$(mktemp)" trap 'rm -f "$create_body" "$delete_body"' EXIT curl -sS --compressed -o "$create_body" -w "Create status: %{http_code}\n" \ -X POST "${BASE_URL}/me/library/playlists" \ -H "Authorization: Bearer ${APPLE_MUSIC_DEV_TOKEN}" \ -H "Music-User-Token: ${APPLE_MUSIC_USER_TOKEN}" \ -H "Content-Type: application/json" \ -d "{\"attributes\":{\"name\":\"${PLAYLIST_NAME}\"}}" playlist_id="$(python3 - "$create_body" <<'PY' import json, sys with open(sys.argv[1], "r", encoding="utf-8") as f: data = json.load(f) print(data["data"][0]["id"]) PY )" curl -sS --compressed -o "$delete_body" -w "Delete status: %{http_code}\n" \ -X DELETE "${BASE_URL}/me/library/playlists/${playlist_id}" \ -H "Authorization: Bearer ${APPLE_MUSIC_DEV_TOKEN}" \ -H "Music-User-Token: ${APPLE_MUSIC_USER_TOKEN}" \ -H "Content-Type: application/json" I capture the response bodies like this: cat "$create_body" cat "$delete_body" Result: Create: 201 Delete: 401 I also checked the latest macOS SDK’s MusicKit interfaces and MusicLibrary.createPlaylist/edit/add(to:) are marked @available(macOS, unavailable), so I can’t create/ delete via MusicKit on macOS either. Question: How can I implement a playlist editor on macOS (create/delete/modify) if: MusicKit write APIs are unavailable on macOS, and The HTTP API can create but DELETE returns 401? Any guidance or official workaround would be hugely appreciated.
0
1
331
Jan ’26
Changing Frame Rate of External Display on iPad
Hello, As far as I know and in all of my testing there is no way for a user or a developer to change the frame rate of the video output on iPadOS. If you connect an iPad via a USB Hub or a USB to HDMI Adaptor and then connect it to an external monitor it will output at 59.94fps. I have a video app where a user monitors live video at 25fps and 30fps, they often output to an external display and there are times when the external display will stutter due to the mismatch in frame rate, ie. using 25fps and outputting at 59.94fps. I thought it was impossible to change the video output frame rate, then in V3.1 of the Blackmagic Camera App I saw an interesting change in their release notes: ‘Support for HDMI Monitoring at Sensor Rate and Resolution’ This means there is some way to modify it, not sure if this is done via a Private API that Apple has allowed Blackmagic to use. If so, how can we access this or is there a way to enable this that is undocumented? Thanks!
6
0
1k
Jan ’26
Start and stop recording Voice Memos with Siri
using iOS 26.2; Airpods 4 Long press stem to launch Siri Speak "Record Voice Memo" -> Recording starts Recording in progress... Long press stem to launch Siri -> Nothing happens. To stop recording need use phone. is this intended behaviour? i would like to be able to stop recording with Siri I am able to launch Siri from phone while recording, but point is to keep phone in pocket and start/stop recordings only via Airpods.
1
0
189
Dec ’25
AVCaptureMetadataOutput .face detection not working on iOS 26 Beta with high sessionPreset
In iOS 26 (Developer Beta), the AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when metadataOutput.metadataObjectTypes = [.face] is set. On earlier iOS versions the issue does not occur. Interestingly, face detection works if I set the sessionPreset to .medium, but not with .high — except on the iPhone 16 Pro Max, where it works regardless.
2
1
370
Sep ’25
Background uploading extension's identifier not official yet
Recently Apple gave us the possibility to upload asset resources in the background. We implemented our background upload extension but when our CI tried to upload the app on TestFlight we got an error that the extension point identifier - in our case com.apple.photos.backgound-upload - is not an official one. Any idea when it will become official and we will be able to release a working background uploading?
1
0
213
Nov ’25
DockKit gimbal reported yaw drifts by upwards of 45 degrees after running for a while
This is an issue with the Insta360 Flow Pro 2. My iOS app uses DockKit to control the gimbal; in particular, my app disables tracking and sends angular velocity commands to control the gimbal's orientation. I only try to modify the yaw (rotation around the vertical axis); never the pitch or yaw. Note that I don't send the gimbal to a particular orientation directly; I modify the velocity. Everything works great for a long period of time: typically for a continuous run of 4-6 hours; in the most recent case, I managed about 36 hours of continous operation before the following problem occurred. I came back to check on the system, and because no visual activity had occurred in the camera's field of view for a while, the phone had commanded the gimbal to rotate back to a yaw angle of 0 degrees. So the phone in the gimbal should have been looking straight ahead (i.e. the 0 degree yaw position), but it was definitely looking off at an angle. I've seen this twice now. The first time, when it should have been looking straight ahead, it was in fact looking 60 degrees off center. This time (caught on video, see below), it was off by 22 degrees from center. Here's the weird part: the gimbal reports this way off center positioning as zero degrees (well close enough to zero, like 0.2 or something that's fine). But, mechanically, the gimbal still knows where zero degrees is: if we double click on the trigger of the Flow Pro 2, which is supposed to reset the gimbal to 0 degrees yaw and pitch, the gimbal responds correctly and reorients to a 0 degree position. However, the yaw values it reports are not zero, but as shown in my video, 22 degrees off axis or so. Power cycling the gimbal and restarting immediately fixes the problem. Also, I switched from my app to the Insta360 app, which caused the phone to flip from landscape to portrait, then when I returned to my app and switched back to landscape, the gimbal now started reporting correct yaw angles. Is there a possibility this is a bug in the DockKit framework? Has anyone seen this? I have a case open with Insta360, but although it's clearly a software issue, it's not clear if it's in Insta360's code or the DockKit layer. Any ideas for how I can get out of this mode? My concern is that the phone is in a tripod about 10' off the floor, and not very accessible. Also, if all goes well, we may have about 50 of these systems running, and having to fix them one by one after a few hours is not good. For a demonstration of this bug, see the following video: https://octoparry.com/offset.MOV Any help greatly appreciated.
4
0
565
Jun ’25
[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
So experimenting with the new SpeechTranscriber, if I do: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange] ) only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses. I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while. The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.
6
0
763
Nov ’25
Indicate Packet Loss With AVAudioConverter for OPUS Decoding
I'm using an AVAudioConverter object to decode an OPUS stream for VoIP. The decoding itself works well, however, whenever the stream stalls (no more audio packet is available to decode because of network instability) this can be heard in crackling / abrupt stop in decoded audio. OPUS can mitigate this by indicating packet loss by passing a null pointer in the C-library to int opus_decode_float (OpusDecoder * st, const unsigned char * data, opus_int32 len, float * pcm, int frame_size, int decode_fec), see https://opus-codec.org/docs/opus_api-1.2/group__opus__decoder.html#ga9c554b8c0214e24733a299fe53bb3bd2. However, with AVAudioConverter using Swift I'm constructing an AVAudioCompressedBuffer like so:         let compressedBuffer = AVAudioCompressedBuffer(             format: VoiceEncoder.Constants.networkFormat,             packetCapacity: 1,             maximumPacketSize: data.count         )         compressedBuffer.byteLength = UInt32(data.count)         compressedBuffer.packetCount = 1   compressedBuffer.packetDescriptions! .pointee.mDataByteSize = UInt32(data.count)         data.copyBytes(             to: compressedBuffer.data .assumingMemoryBound(to: UInt8.self),             count: data.count         ) where data: Data contains the raw OPUS frame to be decoded. How can I specify data loss in this context and cause the AVAudioConverter to output PCM data whenever no more input data is available? More context: I'm specifying the audio format like this:         static let frameSize: UInt32 = 960         static let sampleRate: Float64 = 48000.0         static var networkFormatStreamDescription = AudioStreamBasicDescription(             mSampleRate: sampleRate,             mFormatID: kAudioFormatOpus,             mFormatFlags: 0,             mBytesPerPacket: 0,             mFramesPerPacket: frameSize,             mBytesPerFrame: 0,             mChannelsPerFrame: 1,             mBitsPerChannel: 0,             mReserved: 0         )         static let networkFormat = AVAudioFormat( streamDescription: &networkFormatStreamDescription )! I've tried 1) setting byteLength and packetCount to zero and 2) returning nil but setting .haveData in the AVAudioConverterInputBlock I'm using with no success.
Replies
1
Boosts
1
Views
928
Activity
May ’25
AVAudioEngine fails to start during FaceTime call (error 2003329396)
Is it possible to perform speech-to-text using AVAudioEngine to capture microphone input while being on a FaceTime call at the same time? I tried implementing this, but whenever I attempt to start the  AVAudioEngine  while a FaceTime call is active, I get the following error: “The operation couldn’t be completed. (OSStatus error 2003329396)” I assume this might be due to microphone resource restrictions during FaceTime, but I’d like to confirm whether this limitation is at the system level or if there’s any possible workaround or entitlement that allows concurrent microphone access. Has anyone encountered this issue or found a solution?
Replies
2
Boosts
1
Views
755
Activity
2w
Ability to programatically download Premium and Enhanced voices
Please consider adding the ability to programatically download Premium and Enhanced voices. At the moment it is extremely inconvenient for our users, as they have to navigate to settings themselves to download voices. Our app relies heavily on SpeechSynthesis integration, and it would greatly benefit from this feature. FB16307193
Replies
0
Boosts
1
Views
86
Activity
Jun ’25
Inquiries about AVPlayer's ABR switching logic and control APIs for HLS/LL-HLS
Hello, I am developing a custom player SDK using AVPlayer to support HLS and LL-HLS live streaming. I have some questions about the internal logic of AVPlayer regarding ABR, as this information is not explicitly covered in the documentation. ABR Switching Logic: Does AVPlayer trigger bitrate switching primarily based on stall occurrences (buffer starvation)? I am curious if the switching logic is reactive to stalls or if it proactively switches to prevent them based on throughput estimation. Developer Controls for ABR: To influence or control the ABR selection, are preferredPeakBitRate and preferredForwardBufferDuration the only properties available to developers? Are there any other recommended APIs to assist with ABR decisions? Thank you for your help.
Replies
2
Boosts
0
Views
515
Activity
Jan ’26
iOS Speech Error on Mobile Simulator (Error fetching voices)
I'm writing a simple app for iOS and I'd like to be able to do some text to speech in it. I have a basic audio manager class with a "speak" function: import Foundation import AVFoundation class AudioManager { static let shared = AudioManager() var audioPlayer: AVAudioPlayer? var isPlaying: Bool { return audioPlayer?.isPlaying ?? false } var playbackPosition: TimeInterval = 0 func playSound(named name: String) { guard let url = Bundle.main.url(forResource: name, withExtension: "mp3") else { print("Sound file not found") return } do { if audioPlayer == nil || !isPlaying { audioPlayer = try AVAudioPlayer(contentsOf: url) audioPlayer?.currentTime = playbackPosition audioPlayer?.prepareToPlay() audioPlayer?.play() } else { print("Sound is already playing") } } catch { print("Error playing sound: \(error.localizedDescription)") } } func stopSound() { if let player = audioPlayer { playbackPosition = player.currentTime player.stop() } } func speak(text: String) { let synthesizer = AVSpeechSynthesizer() let utterance = AVSpeechUtterance(string: text) utterance.voice = AVSpeechSynthesisVoice(language: "en-GB") synthesizer.speak(utterance) } } And my app shows text in a ScrollView: ScrollView { Text(self.description) .padding() .foregroundColor(.black) .font(.headline) .background(Color.gray.opacity(0)) }.onAppear { AudioManager.shared.speak(text: self.description) } However, the text doesn't get read out (in the simulator). I see some output in the console: Error fetching voices: Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "Invalid container metadata for _UnkeyedDecodingContainer, found keyedGraphEncodingNodeID", underlyingError: nil)). Using fallback voices. I'm probably doing something wrong here, but not sure what.
Replies
5
Boosts
1
Views
698
Activity
Dec ’25
AVAudioFile.read extremely slow after seeking in FLAC and MP3 files
I'm developing an audio player app that uses AVAudio​File to read PCM data from various formats. I'm experiencing severe performance issues when seeking in FLAC, while other compressed formats (M4A/AAC) work correctly. I don't intend to use them in my app, but I also tested mp3 files just by curiosity and they also have this issue. Environment: macOS 26 (Tahoe) Xcode 26.3 Apple Silicon (M1) The issue: After setting AVAudio​File​.frame​Position to a position mid-file, the subsequent call to AVAudio​File​.read(into​:frame​Count:) blocks for an unreasonable amount of time for FLAC and MP3 files. The delay scales linearly with the seek target, seeking near the beginning is fast, seeking toward the end is proportionally slower, which suggests the decoder is decoding linearly from the beginning of the file rather than using any seek index. (My app deals with “images” of Audio CDs ripped as a single long audio file.) The issue is particularly severe when reading files from an SMB network share (server on Ethernet, client on Wi-Fi with the access point ~2 meters away in line of sight). Quick Benchmark results: I tested with the same 75-minute audio content (16-bit/44.1 kHz stereo, 200,502,708 frames) encoded in five formats, seeking to the midpoint. Over SMB (Local Network, Server on Ethernet, Client on WiFi): Format | Seek + Read Time ----------|------------------ WAV | 0.007 s AIFF | 0.009 s Apple | 0.015 s Lossless | MP3 | 9.2 s FLAC | 30.2 s Locally (MacBook Air M1 SSD) : Format | Seek + Read Time ----------|------------------ WAV | 0.0005 s AIFF | 0.0004 s Apple | 0.0011 s Lossless | MP3 | 0.1958 s FLAC | 0.7528 s WAV, AIFF, and M4A all seek virtually instantly (< 15 ms). MP3 and FLAC exhibit linear-time behavior, with FLAC being the worst affected. Note that M4A (AAC) is also a compressed format that requires decoding after seeking, yet it completes in 15 ms. This rules out any inherent limitation of compressed formats, the MP4 container's packet index (stts/stco) is clearly being used for fast random access. Both MP3 (Xing/LAME TOC) and FLAC (SEEKTABLE metadata block) have their own seek mechanisms that should provide similar performance. Minimal CLI tool to reproduce: import Foundation guard CommandLine.arguments.count > 1 else { print("Usage: FLACSpeed <audio-file-path>") exit(1) } let path = CommandLine.arguments[1] let fileURL = URL(fileURLWithPath: path) do { let file = try AVAudioFile(forReading: fileURL) let format = file.processingFormat let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: 8192)! let totalFrames = file.length let seekTarget = totalFrames / 2 print("File: \(fileURL.lastPathComponent)") print("Format: \(format)") print("Total frames: \(totalFrames)") print("Seeking to frame: \(seekTarget)") file.framePosition = seekTarget let start = CFAbsoluteTimeGetCurrent() try file.read(into: buffer, frameCount: 8192) let elapsed = CFAbsoluteTimeGetCurrent() - start print("Read after seek took \(elapsed) seconds") } catch { print("Error: \(error.localizedDescription)") exit(1) } Expected behavior: AVAudio​File​.read(into​:frame​Count:) after setting frame​Position should use the available seek mechanisms in FLAC and MP3 files for fast random access, as it already does for M4A (AAC). Even accounting for the fact that seek tables provide approximate (not sample-precise) positioning, the "jump to nearest index point + decode forward" approach should complete in milliseconds, not seconds. Workaround: For FLAC, I've worked around this by using libFLAC directly, which provides instant seeking via FLAC__stream​_decoder​_seek​_absolute(). libFLAC Performance: For comparison, libFLAC's FLAC__stream​_decoder​_seek​_absolute() performs the same seek + read on the same FLAC file in around 0.015, using the FLAC seek table to jump to the nearest preceding seek point, then decoding forward a small number of frames to the exact target sample.
Replies
1
Boosts
1
Views
188
Activity
1d
AVAudioRecorder loses audio recorded before interruption
Hi everyone, I'm running into an issue with AVAudioRecorder when handling interruptions such as phone calls or alarms. Problem: When the app is recording audio and an interruption occurs: I handle the interruption with audioRecorder?.pause() inside AVAudioSession.interruptionNotification (on .began). On .ended, I check for .shouldResume and call audioRecorder?.record() again. The recorder resumes successfully, but only the audio recorded after the interruption is saved. The audio recorded before the interruption is lost, even though I'm using the same file URL and not recreating the recorder. Repro: Start a recording with AVAudioRecorder Simulate a system interruption (e.g., incoming call) Resume recording after the interruption Stop and inspect the output audio file Expected: Full audio (before and after interruption) should be saved. Actual: Only the audio after interruption is saved; the earlier part is missing Notes: According to the documentation, calling .record() after .pause() should resume recording into the same file. I confirmed that the file URL does not change, and I do not recreate the recorder instance. No error is thrown by the system during this process. This behavior happens consistently when the app is interrupted and resumed. Question: Is this a known issue? Is there a recommended workaround for preserving the full recording when interruptions happen? Thanks in advance!
Replies
1
Boosts
1
Views
393
Activity
Dec ’25
Orientation does not work on iPhone 17 and above.
I'm receiving output from avcapturesession and capturing an image using Vision, but the image is output in landscape orientation instead of portrait. Even when I set the orientation to up in ciimage, cgimage, and uiimage, the image is still output in landscape orientation. On iPhones 16 and below, the image is output in portrait orientation. But on iPhones 17 and above, the image is output in landscape orientation. Please help.
Replies
1
Boosts
1
Views
433
Activity
Dec ’25
Dockkit custom tracking does not work on iOS18.3
Hi all, I'm using Apple Sample Code below to create application using dockkit. "Controlling a DockKit accessory using your camera app" https://aninterestingwebsite.com/documentation/dockkit/controlling-a-dockkit-accessory-using-your-camera-app?changes=_8 I used vision hand recognition and put the observation data to dockAccessory.track, but Belkin or Insta360 devices never move on iPhone 16 Pro Max with iOS 18.3. If I use other functions like face search (system tracking) in the app, those work ok. I used Belkin and Insta360 Flow 2 Pro to reproduce the problem. My friend is also saying that the custom tracking feature was working fine on the OS 18 beta, but on recent iOS 18.3 that feature does not work. If I can get the iOS 18.0 beta then we can test that feature. But I cannot revert my iOS from 18.3 to the iOS 18.0 Beta. Regards, TO
Replies
1
Boosts
1
Views
348
Activity
Dec ’25
Offline Fairplay Error -42650
We have implemented offline Fairplay playback and it works fine.But at times when trying to playback the offline downloaded content, we get the following error"An unknown error occured (-42650)"Tried looking up the error in the documentation but couldnt find anything relevant.What could possibly be creating this error?
Replies
6
Boosts
2
Views
3.2k
Activity
4w
Swift Array Out of Bounds Crash in VTFrameProcessor when using VTLowLatencyFrameInterpolationParameters
Hi everyone, Our team is encountering a reproducible crash when using VTLowLatencyFrameInterpolation on iOS 26.3 while processing a live LL-HLS input stream. 🤖 Environment Device: iPhone 16 OS: iOS 26.3 Xcode: Xcode 26.3 Framework: VideoToolbox 💥 Crash Details The application crashes with the following fatal error: Fatal error: Swift/ContiguousArrayBuffer.swift:184: Array index out of range The stack trace highlights the following: VTLowLatencyFrameInterpolationImplementation processWithParameters:frameOutputHandler: Called from VTFrameProcessor.process(parameters:) Here is the simplified implementation block where the crash occurs. (Note: PrismSampleBuffer and PrismLLFIError are our internal custom wrapper types). // Create `VTFrameProcessorFrame` for the source (previous) frame. let sourcePTS = sourceSampleBuffer.presentationTimeStamp var sourceFrame: VTFrameProcessorFrame? if let pixelBuffer = sourceSampleBuffer.imageBuffer { sourceFrame = VTFrameProcessorFrame(buffer: pixelBuffer, presentationTimeStamp: sourcePTS) } // Validate the source VTFrameProcessorFrame. guard let sourceFrame else { throw PrismLLFIError.missingImageBuffer } // Create `VTFrameProcessorFrame` for the next frame. let nextPTS = nextSampleBuffer.presentationTimeStamp var nextFrame: VTFrameProcessorFrame? if let pixelBuffer = nextSampleBuffer.imageBuffer { nextFrame = VTFrameProcessorFrame(buffer: pixelBuffer, presentationTimeStamp: nextPTS) } // Validate the next VTFrameProcessorFrame. guard let nextFrame else { throw PrismLLFIError.missingImageBuffer } // Calculate interpolation intervals and allocate destination frame buffers. let intervals = interpolationIntervals() let destinationFrames = try framesBetween(firstPTS: sourcePTS, lastPTS: nextPTS, interpolationIntervals: intervals) let interpolationPhase: [Float] = intervals.map { Float($0) } // Create VTLowLatencyFrameInterpolationParameters. // This sets up the configuration required for temporal frame interpolation between the previous and current source frames. guard let parameters = VTLowLatencyFrameInterpolationParameters( sourceFrame: nextFrame, previousFrame: sourceFrame, interpolationPhase: interpolationPhase, destinationFrames: destinationFrames ) else { throw PrismLLFIError.failedToCreateParameters } try await send(sourceSampleBuffer) // Process the frames. // Using progressive callback here to get the next processed frame as soon as it's ready, // preventing the system from waiting for the entire batch to finish. for try await readOnlyFrame in self.frameProcessor.process(parameters: parameters) { // Create an interpolated sample buffer based on the output frame. let newSampleBuffer: PrismSampleBuffer = try readOnlyFrame.frame.withUnsafeBuffer { pixelBuffer in try PrismLowLatencyFrameInterpolation.createSampleBuffer(from: pixelBuffer, readOnlyFrame.timeStamp) } // Pass the newly generated frame to the output stream. try await send(newSampleBuffer) } 🙋 Questions Are there any known limitations or bugs regarding VTLowLatencyFrameInterpolation when handling live 60fps streams? Are there any undocumented constraints we should be aware of regarding source/previous frame timing, pixel buffer attributes, or how destinationFrames and interpolationPhase arrays must be allocated? Is a "warm-up" sequence recommended after startSession() before making the first process(parameters:) call?
Replies
1
Boosts
0
Views
472
Activity
3w
MusicKit - ApplicationMusicPlayer fails to play certain Songs
[Note: this issue was happening on a main testing device, and after testing the same code on other devices, this issue is only happening on 1 out of 4 devices] We are successfully getting a MusicCatalogResourceResponse for every song ID where we make the MusicCatalogResourceRequest. We are able to display the song title, artist name, and album artwork for each Song in the response. However - when we go to play the song, there are some songs that play, and several songs that do not play. For the songs that don't play, the console shows “Failed to prepareToPlay error=<MPMusicPlayerControllerErrorDomain.6 "Failed to prepare to play" {}>” let musicPlayer = ApplicationMusicPlayer.shared func playSong(_ song: Song) { musicPlayer.queue = [song] Task { try await musicPlayer.prepareToPlay() try await musicPlayer.play() } Is there anything else we can investigate about what may be causing specific song IDs not to play on this specific device? Even if we remove line 6 musicPlayer.prepareToPlay() we still see the same console error when running playSong with the Songs that don't work. It is always the same song IDs that we can play and always the same song IDs that we cannot get to play, even trying them across different projects with different bundle identifiers. We can tap to play a song that works, and it starts playing immediately. Then tap a song that doesn't work, and nothing happens. Then back to a song that works. It's consistent which songs succeed and fail on this device. Perhaps there is an issue specific to this very iPad when it comes to certain specific songs, but we'd like to be confident that an app relying on MusicKit will be able to play songs that have been successfully loaded with a MusicCatalogResourceResponse. Thanks for any help or suggestions about what we may be able to investigate further on the device or what we should consider when launching an app that expects anyone with Apple Music to be able to listen to any of the songs loaded by the app. Specific iPad details: iPad Pro (12.9-inch) (6th generation) running iPadOS 26.4 Beta Two of the song IDs that won't play on this iPad (even though we can access and display their album artwork and all other information): 943204000 and 1441164805
Replies
2
Boosts
1
Views
216
Activity
Mar ’26
Apple Music playlist create/delete works but DELETE returns 401 — and MusicKit write APIs are macOS‑unavailable. How to build a playlist editor on macOS?
I’m trying to build a playlist editor on macOS. I can create playlists via the Apple Music HTTP API, but DELETE always returns 401 even immediately after creation with the same tokens. Minimal repro: #!/usr/bin/env bash set -euo pipefail BASE_URL="https://api.music.apple.com/v1" PLAYLIST_NAME="${PLAYLIST_NAME:-blah}" : "${APPLE_MUSIC_DEV_TOKEN:?}" : "${APPLE_MUSIC_USER_TOKEN:?}" create_body="$(mktemp)" delete_body="$(mktemp)" trap 'rm -f "$create_body" "$delete_body"' EXIT curl -sS --compressed -o "$create_body" -w "Create status: %{http_code}\n" \ -X POST "${BASE_URL}/me/library/playlists" \ -H "Authorization: Bearer ${APPLE_MUSIC_DEV_TOKEN}" \ -H "Music-User-Token: ${APPLE_MUSIC_USER_TOKEN}" \ -H "Content-Type: application/json" \ -d "{\"attributes\":{\"name\":\"${PLAYLIST_NAME}\"}}" playlist_id="$(python3 - "$create_body" <<'PY' import json, sys with open(sys.argv[1], "r", encoding="utf-8") as f: data = json.load(f) print(data["data"][0]["id"]) PY )" curl -sS --compressed -o "$delete_body" -w "Delete status: %{http_code}\n" \ -X DELETE "${BASE_URL}/me/library/playlists/${playlist_id}" \ -H "Authorization: Bearer ${APPLE_MUSIC_DEV_TOKEN}" \ -H "Music-User-Token: ${APPLE_MUSIC_USER_TOKEN}" \ -H "Content-Type: application/json" I capture the response bodies like this: cat "$create_body" cat "$delete_body" Result: Create: 201 Delete: 401 I also checked the latest macOS SDK’s MusicKit interfaces and MusicLibrary.createPlaylist/edit/add(to:) are marked @available(macOS, unavailable), so I can’t create/ delete via MusicKit on macOS either. Question: How can I implement a playlist editor on macOS (create/delete/modify) if: MusicKit write APIs are unavailable on macOS, and The HTTP API can create but DELETE returns 401? Any guidance or official workaround would be hugely appreciated.
Replies
0
Boosts
1
Views
331
Activity
Jan ’26
Changing Frame Rate of External Display on iPad
Hello, As far as I know and in all of my testing there is no way for a user or a developer to change the frame rate of the video output on iPadOS. If you connect an iPad via a USB Hub or a USB to HDMI Adaptor and then connect it to an external monitor it will output at 59.94fps. I have a video app where a user monitors live video at 25fps and 30fps, they often output to an external display and there are times when the external display will stutter due to the mismatch in frame rate, ie. using 25fps and outputting at 59.94fps. I thought it was impossible to change the video output frame rate, then in V3.1 of the Blackmagic Camera App I saw an interesting change in their release notes: ‘Support for HDMI Monitoring at Sensor Rate and Resolution’ This means there is some way to modify it, not sure if this is done via a Private API that Apple has allowed Blackmagic to use. If so, how can we access this or is there a way to enable this that is undocumented? Thanks!
Replies
6
Boosts
0
Views
1k
Activity
Jan ’26
Start and stop recording Voice Memos with Siri
using iOS 26.2; Airpods 4 Long press stem to launch Siri Speak "Record Voice Memo" -> Recording starts Recording in progress... Long press stem to launch Siri -> Nothing happens. To stop recording need use phone. is this intended behaviour? i would like to be able to stop recording with Siri I am able to launch Siri from phone while recording, but point is to keep phone in pocket and start/stop recordings only via Airpods.
Replies
1
Boosts
0
Views
189
Activity
Dec ’25
AVCaptureMetadataOutput .face detection not working on iOS 26 Beta with high sessionPreset
In iOS 26 (Developer Beta), the AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when metadataOutput.metadataObjectTypes = [.face] is set. On earlier iOS versions the issue does not occur. Interestingly, face detection works if I set the sessionPreset to .medium, but not with .high — except on the iPhone 16 Pro Max, where it works regardless.
Replies
2
Boosts
1
Views
370
Activity
Sep ’25
How to disable and enable build-in camera
Hi all, In MacOS, how can I disable or enable build-in camera by program or script?
Replies
1
Boosts
0
Views
165
Activity
Apr ’25
Background uploading extension's identifier not official yet
Recently Apple gave us the possibility to upload asset resources in the background. We implemented our background upload extension but when our CI tried to upload the app on TestFlight we got an error that the extension point identifier - in our case com.apple.photos.backgound-upload - is not an official one. Any idea when it will become official and we will be able to release a working background uploading?
Replies
1
Boosts
0
Views
213
Activity
Nov ’25
DockKit gimbal reported yaw drifts by upwards of 45 degrees after running for a while
This is an issue with the Insta360 Flow Pro 2. My iOS app uses DockKit to control the gimbal; in particular, my app disables tracking and sends angular velocity commands to control the gimbal's orientation. I only try to modify the yaw (rotation around the vertical axis); never the pitch or yaw. Note that I don't send the gimbal to a particular orientation directly; I modify the velocity. Everything works great for a long period of time: typically for a continuous run of 4-6 hours; in the most recent case, I managed about 36 hours of continous operation before the following problem occurred. I came back to check on the system, and because no visual activity had occurred in the camera's field of view for a while, the phone had commanded the gimbal to rotate back to a yaw angle of 0 degrees. So the phone in the gimbal should have been looking straight ahead (i.e. the 0 degree yaw position), but it was definitely looking off at an angle. I've seen this twice now. The first time, when it should have been looking straight ahead, it was in fact looking 60 degrees off center. This time (caught on video, see below), it was off by 22 degrees from center. Here's the weird part: the gimbal reports this way off center positioning as zero degrees (well close enough to zero, like 0.2 or something that's fine). But, mechanically, the gimbal still knows where zero degrees is: if we double click on the trigger of the Flow Pro 2, which is supposed to reset the gimbal to 0 degrees yaw and pitch, the gimbal responds correctly and reorients to a 0 degree position. However, the yaw values it reports are not zero, but as shown in my video, 22 degrees off axis or so. Power cycling the gimbal and restarting immediately fixes the problem. Also, I switched from my app to the Insta360 app, which caused the phone to flip from landscape to portrait, then when I returned to my app and switched back to landscape, the gimbal now started reporting correct yaw angles. Is there a possibility this is a bug in the DockKit framework? Has anyone seen this? I have a case open with Insta360, but although it's clearly a software issue, it's not clear if it's in Insta360's code or the DockKit layer. Any ideas for how I can get out of this mode? My concern is that the phone is in a tripod about 10' off the floor, and not very accessible. Also, if all goes well, we may have about 50 of these systems running, and having to fix them one by one after a few hours is not good. For a demonstration of this bug, see the following video: https://octoparry.com/offset.MOV Any help greatly appreciated.
Replies
4
Boosts
0
Views
565
Activity
Jun ’25
[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
So experimenting with the new SpeechTranscriber, if I do: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange] ) only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses. I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while. The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.
Replies
6
Boosts
0
Views
763
Activity
Nov ’25