AAC encoding using AudioConverter and writing to AVAssetWriter
I'm struggling to encode audio buffers received from AVCaptureSession using AudioConverter and then append them to an AVAssetWriter.
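I'm not getting any errors (including OSStatus responses), and the generated CMSampleBuffers seem to contain valid data, but the resulting file simply has no playable audio. When writing together with video, video frames stop being appended a few frames in (appendSampleBuffer() returns false, but AVAssetWriter.error is not set), probably because the asset writer is waiting for the audio to catch up. I suspect this is related to the way I'm setting up priming for AAC.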
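The app uses RxSwift, but I've removed the RxSwift parts so that it's easier to follow for a wider audience.
Please check out the comments in the code below for more... comments.
Given a settings struct: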
import Foundation
import AVFoundation
import CleanroomLogger
public struct AVSettings {
    let orientation: AVCaptureVideoOrientation = .Portrait
    let sessionPreset = AVCaptureSessionPreset1280x720
    let videoBitrate: Int = 2_000_000
    let videoExpectedFrameRate: Int = 30
    let videoMaxKeyFrameInterval: Int = 60
    let audioBitrate: Int = 32 * 1024
    /// Settings that are `0` mean variable rate.
    /// The `mSampleRate` and `mChannelsPerFrame` are overwritten at run-time
    /// to values based on the input stream.
    let audioOutputABSD = AudioStreamBasicDescription(
        mSampleRate: AVAudioSession.sharedInstance().sampleRate,
        mFormatID: kAudioFormatMPEG4AAC,
        mFormatFlags: UInt32(MPEG4ObjectID.AAC_Main.rawValue),
        mBytesPerPacket: 0,
        mFramesPerPacket: 1024,
        mBytesPerFrame: 0,
        mChannelsPerFrame: 1,
        mBitsPerChannel: 0,
        mReserved: 0)
    let audioEncoderClassDescriptions = [
        AudioClassDescription(
            mType: kAudioEncoderComponentType,
            mSubType: kAudioFormatMPEG4AAC,
            mManufacturer: kAppleSoftwareAudioCodecManufacturer) ]
}
Some helper functions:
public func getVideoDimensions(fromSettings settings: AVSettings) -> (Int, Int) {
    switch (settings.sessionPreset, settings.orientation) {
    case (AVCaptureSessionPreset1920x1080, .Portrait): return (1080, 1920)
    case (AVCaptureSessionPreset1280x720, .Portrait): return (720, 1280)
    default: fatalError("Unsupported session preset and orientation")
    }
}
public func createAudioFormatDescription(fromSettings settings: AVSettings) -> CMAudioFormatDescription {
    var result = noErr
    var absd = settings.audioOutputABSD
    var description: CMAudioFormatDescription?
    withUnsafePointer(&absd) { absdPtr in
        result = CMAudioFormatDescriptionCreate(nil,
                                                absdPtr,
                                                0, nil,
                                                0, nil,
                                                nil,
                                                &description)
    }
    if result != noErr {
        Log.error?.message("Could not create audio format description")
    }
    return description!
}
public func createVideoFormatDescription(fromSettings settings: AVSettings) -> CMVideoFormatDescription {
    var result = noErr
    var description: CMVideoFormatDescription?
    let (width, height) = getVideoDimensions(fromSettings: settings)
    result = CMVideoFormatDescriptionCreate(nil,
                                            kCMVideoCodecType_H264,
                                            Int32(width),
                                            Int32(height),
                                            [:],
                                            &description)
    if result != noErr {
        Log.error?.message("Could not create video format description")
    }
    return description!
}
This is how the asset writer is initialized:
guard let audioDevice = defaultAudioDevice() else
{ throw RecordError.MissingDeviceFeature("Microphone") }
guard let videoDevice = defaultVideoDevice(.Back) else
{ throw RecordError.MissingDeviceFeature("Camera") }
let videoInput = try AVCaptureDeviceInput(device: videoDevice)
let audioInput = try AVCaptureDeviceInput(device: audioDevice)
let videoFormatHint = createVideoFormatDescription(fromSettings: settings)
let audioFormatHint = createAudioFormatDescription(fromSettings: settings)
let writerVideoInput = AVAssetWriterInput(mediaType: AVMediaTypeVideo,
outputSettings: nil,
sourceFormatHint: videoFormatHint)
let writerAudioInput = AVAssetWriterInput(mediaType: AVMediaTypeAudio,
outputSettings: nil,
sourceFormatHint: audioFormatHint)
writerVideoInput.expectsMediaDataInRealTime = true
writerAudioInput.expectsMediaDataInRealTime = true
let url = NSURL(fileURLWithPath: NSTemporaryDirectory(), isDirectory: true)
.URLByAppendingPathComponent(NSProcessInfo.processInfo().globallyUniqueString)
.URLByAppendingPathExtension("mp4")
let assetWriter = try AVAssetWriter(URL: url, fileType: AVFileTypeMPEG4)
if !assetWriter.canAddInput(writerVideoInput) {
throw RecordError.Unknown("Could not add video input") }
if !assetWriter.canAddInput(writerAudioInput) {
throw RecordError.Unknown("Could not add audio input") }
assetWriter.addInput(writerVideoInput)
assetWriter.addInput(writerAudioInput)
This is how the audio samples are encoded; the problem area is most likely around here. I've rewritten it so that it doesn't use any Rx-isms.
var outputABSD = settings.audioOutputABSD
var outputFormatDescription: CMAudioFormatDescription! = nil
CMAudioFormatDescriptionCreate(nil, &outputABSD, 0, nil, 0, nil, nil, &outputFormatDescription)
var converter: AudioConverter?
// Indicates whether priming information has been attached to the first buffer
var primed = false
func encodeAudioBuffer(settings: AVSettings, buffer: CMSampleBuffer) throws -> CMSampleBuffer? {
    // Create the audio converter if it's not available
    if converter == nil {
        var classDescriptions = settings.audioEncoderClassDescriptions
        var inputABSD = CMAudioFormatDescriptionGetStreamBasicDescription(CMSampleBufferGetFormatDescription(buffer)!).memory
        var outputABSD = settings.audioOutputABSD
        outputABSD.mSampleRate = inputABSD.mSampleRate
        outputABSD.mChannelsPerFrame = inputABSD.mChannelsPerFrame
        var converter: AudioConverterRef = nil
        var result = noErr
        result = withUnsafePointer(&outputABSD) { outputABSDPtr in
            return withUnsafePointer(&inputABSD) { inputABSDPtr in
                return AudioConverterNewSpecific(inputABSDPtr,
                                                 outputABSDPtr,
                                                 UInt32(classDescriptions.count),
                                                 &classDescriptions,
                                                 &converter)
            }
        }
        if result != noErr { throw RecordError.Unknown }
        // At this point I made an attempt to retrieve priming info from
        // the audio converter assuming that it will give me back default values
        // I can use, but ended up with `nil`
        var primeInfo: AudioConverterPrimeInfo? = nil
        var primeInfoSize = UInt32(sizeof(AudioConverterPrimeInfo))
        // The following returns a `noErr` but `primeInfo` is still `nil`
        AudioConverterGetProperty(converter,
                                  kAudioConverterPrimeInfo,
                                  &primeInfoSize,
                                  &primeInfo)
        // I've also tried to set `kAudioConverterPrimeInfo` so that it knows
        // the leading frames that are being primed, but the set didn't seem to work
        // (`noErr` but getting the property afterwards still returned `nil`)
    }
    let converter = converter!
    // Need to give a big enough output buffer.
    // The assumption is that it will always be <= to the input size
    let numSamples = CMSampleBufferGetNumSamples(buffer)
    // This becomes 1024 * 2 = 2048
    let outputBufferSize = numSamples * Int(inputABSD.mBytesPerPacket)
    let outputBufferPtr = UnsafeMutablePointer<Void>.alloc(outputBufferSize)
    defer {
        outputBufferPtr.destroy()
        outputBufferPtr.dealloc(1)
    }
    var result = noErr
    var outputPacketCount = UInt32(1)
    var outputData = AudioBufferList(
        mNumberBuffers: 1,
        mBuffers: AudioBuffer(
            mNumberChannels: outputABSD.mChannelsPerFrame,
            mDataByteSize: UInt32(outputBufferSize),
            mData: outputBufferPtr))
    // See below for `EncodeAudioUserData`
    var userData = EncodeAudioUserData(inputSampleBuffer: buffer,
                                       inputBytesPerPacket: inputABSD.mBytesPerPacket)
    withUnsafeMutablePointer(&userData) { userDataPtr in
        // See below for `fetchAudioProc`
        result = AudioConverterFillComplexBuffer(
            converter,
            fetchAudioProc,
            userDataPtr,
            &outputPacketCount,
            &outputData,
            nil)
    }
    if result != noErr {
        Log.error?.message("Error while trying to encode audio buffer, code: \(result)")
        return nil
    }
    // See below for `CMSampleBufferCreateCopy`
    guard let newBuffer = CMSampleBufferCreateCopy(buffer,
                                                   fromAudioBufferList: &outputData,
                                                   newFromatDescription: outputFormatDescription) else {
        Log.error?.message("Could not create sample buffer from audio buffer list")
        return nil
    }
    if !primed {
        primed = true
        // Simply picked 2112 samples based on convention, is there a better way to determine this?
        let samplesToPrime: Int64 = 2112
        let samplesPerSecond = Int32(settings.audioOutputABSD.mSampleRate)
        let primingDuration = CMTimeMake(samplesToPrime, samplesPerSecond)
        // Without setting the attachment the asset writer will complain about the
        // first buffer missing the `TrimDurationAtStart` attachment, is there a way
        // to infer the value from the given `AudioBufferList`?
        CMSetAttachment(newBuffer,
                        kCMSampleBufferAttachmentKey_TrimDurationAtStart,
                        CMTimeCopyAsDictionary(primingDuration, nil),
                        kCMAttachmentMode_ShouldNotPropagate)
    }
    return newBuffer
}
Here's the procedure that fetches samples for the audio converter, and the data structure that gets passed to it:
private class EncodeAudioUserData {
    var inputSampleBuffer: CMSampleBuffer?
    var inputBytesPerPacket: UInt32
    init(inputSampleBuffer: CMSampleBuffer,
         inputBytesPerPacket: UInt32) {
        self.inputSampleBuffer = inputSampleBuffer
        self.inputBytesPerPacket = inputBytesPerPacket
    }
}
private let fetchAudioProc: AudioConverterComplexInputDataProc = {
    (inAudioConverter,
     ioDataPacketCount,
     ioData,
     outDataPacketDescriptionPtrPtr,
     inUserData) in
    var result = noErr
    if ioDataPacketCount.memory == 0 { return noErr }
    let userData = UnsafeMutablePointer<EncodeAudioUserData>(inUserData).memory
    // If it's already been processed
    guard let buffer = userData.inputSampleBuffer else {
        ioDataPacketCount.memory = 0
        return -1
    }
    var inputBlockBuffer: CMBlockBuffer?
    var inputBufferList = AudioBufferList()
    result = CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
        buffer,
        nil,
        &inputBufferList,
        sizeof(AudioBufferList),
        nil,
        nil,
        0,
        &inputBlockBuffer)
    if result != noErr {
        Log.error?.message("Error while trying to retrieve buffer list, code: \(result)")
        ioDataPacketCount.memory = 0
        return result
    }
    let packetsCount = inputBufferList.mBuffers.mDataByteSize / userData.inputBytesPerPacket
    ioDataPacketCount.memory = packetsCount
    ioData.memory.mBuffers.mNumberChannels = inputBufferList.mBuffers.mNumberChannels
    ioData.memory.mBuffers.mDataByteSize = inputBufferList.mBuffers.mDataByteSize
    ioData.memory.mBuffers.mData = inputBufferList.mBuffers.mData
    if outDataPacketDescriptionPtrPtr != nil {
        outDataPacketDescriptionPtrPtr.memory = nil
    }
    return noErr
}
This is how I convert the AudioBufferList to a CMSampleBuffer:
public func CMSampleBufferCreateCopy(
    buffer: CMSampleBuffer,
    inout fromAudioBufferList bufferList: AudioBufferList,
    newFromatDescription formatDescription: CMFormatDescription? = nil)
    -> CMSampleBuffer? {
    var result = noErr
    var sizeArray: [Int] = [Int(bufferList.mBuffers.mDataByteSize)]
    // Copy timing info from the previous buffer
    var timingInfo = CMSampleTimingInfo()
    result = CMSampleBufferGetSampleTimingInfo(buffer, 0, &timingInfo)
    if result != noErr { return nil }
    var newBuffer: CMSampleBuffer?
    result = CMSampleBufferCreateReady(
        kCFAllocatorDefault,
        nil,
        formatDescription ?? CMSampleBufferGetFormatDescription(buffer),
        Int(bufferList.mNumberBuffers),
        1, &timingInfo,
        1, &sizeArray,
        &newBuffer)
    if result != noErr { return nil }
    guard let b = newBuffer else { return nil }
    CMSampleBufferSetDataBufferFromAudioBufferList(b, nil, nil, 0, &bufferList)
    return newBuffer
}
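Is there anything that I'm obviously doing wrong? Is there a proper way to construct a CMSampleBuffer from an AudioBufferList? How do you transfer priming information from the converter to the CMSampleBuffers that you create?
For my use case I need to do the encoding manually, because the buffers will be manipulated further down the pipeline (although I've disabled all transformations after the encode to make sure that it works).
Any help would be much appreciated. Sorry that there's so much code to digest, but I wanted to provide as much context as possible.
Thanks in advance :)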
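Some references I've used:
AudioConverter
Apple sample code
Turns out there were a variety of things that I was doing wrong. Instead of posting a garble of code, I'm going to try to organize this into bite-sized pieces of things that I discovered.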
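Samples vs. packets vs. frames
This had been a huge source of confusion for me:
Each CMSampleBuffer can contain one or more samples (discovered via CMSampleBufferGetNumSamples).
Each sample in a CMSampleBuffer represents a single audio packet.
Therefore, CMSampleBufferGetNumSamples(sample) returns the number of packets contained in the given buffer.
The number of frames in each packet is governed by the mFramesPerPacket property of the buffer's AudioStreamBasicDescription.
For linear PCM buffers, the total size of each sample buffer is frames * bytes per frame. For compressed buffers (like AAC), there is no relationship between the total size and the frame count.
AudioConverterComplexInputDataProc
This callback is used to retrieve more linear PCM audio data for encoding. You must supply at least the number of packets specified by ioNumberDataPackets. Since I've been using the converter for real-time, push-style encoding, I needed to ensure that each data push contains the minimum number of packets. Something like this (pseudo-code):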
let minimumPackets = outputFramesPerPacket / inputFramesPerPacket
var buffers: [CMSampleBuffer] = []
while getTotalSize(buffers) < minimumPackets {
buffers = buffers + [getNextBuffer()]
}
AudioConverterFillComplexBuffer(...)
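The accumulation step in the pseudo-code can be sketched in plain Swift, without Core Media, so the bookkeeping is easy to test on its own. `PendingBuffer` and `PacketAccumulator` are hypothetical names, not part of any Apple API, and the threshold of 1024 assumes PCM input with one frame per packet feeding an AAC encoder that uses 1024 frames per packet.

```swift
/// Hypothetical stand-in for a captured PCM buffer; only the packet count matters here.
struct PendingBuffer {
    let packetCount: Int
}

/// Collects incoming buffers until at least `minimumPackets` PCM packets are queued.
struct PacketAccumulator {
    let minimumPackets: Int
    private(set) var pending: [PendingBuffer] = []

    init(minimumPackets: Int) {
        self.minimumPackets = minimumPackets
    }

    /// Total number of packets currently waiting to be encoded.
    var queuedPackets: Int {
        return pending.reduce(0) { $0 + $1.packetCount }
    }

    /// Queues `buffer`. Returns the batch to hand to the converter once enough
    /// packets are available, or nil if the converter should not be called yet.
    mutating func push(_ buffer: PendingBuffer) -> [PendingBuffer]? {
        pending.append(buffer)
        guard queuedPackets >= minimumPackets else { return nil }
        let batch = pending
        pending.removeAll()
        return batch
    }
}
```

This sketch flushes everything that is queued, so a batch can hold more packets than the converter asked for; trimming the batch to an exact packet count is a separate step.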
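Slicing CMSampleBuffers
You can actually slice a CMSampleBuffer if it contains multiple samples. The tool for this is CMSampleBufferCopySampleBufferForRange. This is nice because you can provide the AudioConverterComplexInputDataProc with the exact number of packets that it asks for, which makes handling timing information for the resulting encoded buffer much easier: if you give the converter 1500 frames of data when it expects 1024, the resulting sample buffer will have a duration of 1024/sampleRate instead of 1500/sampleRate.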
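The range arithmetic behind slicing can be sketched without Core Media. `sliceRanges` and `duration` are hypothetical helpers; each returned range would correspond to the location and length of the sample range passed to CMSampleBufferCopySampleBufferForRange, assuming PCM input where one sample is one frame.

```swift
/// Splits `totalSamples` queued PCM samples into ranges of exactly `chunkSize` samples.
/// Returns the full ranges plus the number of leftover samples to keep for the next call.
func sliceRanges(totalSamples: Int, chunkSize: Int) -> (ranges: [Range<Int>], leftover: Int) {
    precondition(chunkSize > 0, "chunkSize must be positive")
    var ranges: [Range<Int>] = []
    var start = 0
    // Emit only complete chunks; a partial chunk stays queued.
    while start + chunkSize <= totalSamples {
        ranges.append(start..<(start + chunkSize))
        start += chunkSize
    }
    return (ranges, totalSamples - start)
}

/// Duration in seconds of a buffer holding `frames` frames at `sampleRate`.
func duration(frames: Int, sampleRate: Double) -> Double {
    return Double(frames) / sampleRate
}
```

With 1500 queued frames and a chunk size of 1024, this yields one range covering frames 0 to 1023 and 476 leftover frames, so the encoded buffer's duration of 1024/sampleRate matches the data that was actually consumed.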
Priming and trim duration
When doing AAC encoding, you must set the trim duration like so:
CMSetAttachment(buffer,
kCMSampleBufferAttachmentKey_TrimDurationAtStart,
CMTimeCopyAsDictionary(primingDuration, kCFAllocatorDefault),
kCMAttachmentMode_ShouldNotPropagate)
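One thing I was doing wrong was adding the trim duration at encode time. This should be handled by your writer so that it can guarantee the information gets added to your leading audio frames.
Also, the value of kCMSampleBufferAttachmentKey_TrimDurationAtStart should never be greater than the duration of the sample buffer. An example of priming:
Priming frames: 2112
Sample rate: 44100
Priming duration: 2112 / 44100 = ~0.0479s
First buffer, frames: 1024, priming duration: 1024 / 44100
Second buffer, frames: 1024, priming duration: 1088 / 44100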
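How priming frames spread over consecutive buffers can be sketched in plain Swift. `trimFrames` is a hypothetical helper, and it assumes the cap on the trim duration is enforced on every buffer, so any priming frames left over after a buffer has been fully trimmed carry over to the next one. Under that assumption, 2112 priming frames over 1024-frame buffers come out as 1024, 1024 and 64.

```swift
/// Splits `primingFrames` across consecutive encoded buffers so that no buffer
/// is asked to trim more frames than it contains.
func trimFrames(primingFrames: Int, bufferFrameCounts: [Int]) -> [Int] {
    var remaining = primingFrames
    return bufferFrameCounts.map { frames in
        // Never trim more than the buffer holds; carry the rest forward.
        let trim = min(remaining, frames)
        remaining -= trim
        return trim
    }
}
```

Each non-zero entry, divided by the sample rate, would be the trim duration attached to the corresponding buffer.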
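Creating the new CMSampleBuffer
AudioConverterFillComplexBuffer has an optional outputPacketDescriptionsPtr. You should use it. It will point to a new array of packet descriptions containing sample size information. You need this sample size information to construct the new compressed sample buffer: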
let bufferList: AudioBufferList
let packetDescriptions: [AudioStreamPacketDescription]
var newBuffer: CMSampleBuffer?
CMAudioSampleBufferCreateWithPacketDescriptions(
kCFAllocatorDefault, // allocator
nil, // dataBuffer
false, // dataReady
nil, // makeDataReadyCallback
nil, // makeDataReadyRefCon
formatDescription, // formatDescription
Int(bufferList.mNumberBuffers), // numSamples
CMSampleBufferGetPresentationTimeStamp(buffer), // sbufPTS (first PTS)
&packetDescriptions, // packetDescriptions
&newBuffer)