AAC (Advanced Audio Coding) first appeared in 1997 as an audio coding technology based on MPEG-2. It was developed jointly by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony, and others, with the goal of replacing the MP3 format. After the MPEG-4 standard appeared in 2000, AAC incorporated its features and gained SBR and PS technology; to distinguish it from the traditional MPEG-2 AAC, this variant is also called MPEG-4 AAC.
iOS supports AAC encoding, primarily through the AudioConverter API in AudioToolbox. The reason for building an AAC encoder here is an HLS feature: the TS files that HLS requires must carry H.264-encoded video and AAC-encoded audio. H.264 can be encoded in hardware or software, as covered earlier; AAC can likewise be encoded in hardware or software, and iOS supports both.
First, create a Converter (that is, an AAC encoder) using the following API:
extern OSStatus
AudioConverterNew(const AudioStreamBasicDescription *inSourceFormat,
                  const AudioStreamBasicDescription *inDestinationFormat,
                  AudioConverterRef *outAudioConverter) __OSX_AVAILABLE_STARTING(__MAC_10_1, __IPHONE_2_0);
The input parameters are the source and destination data formats.
In the AAC encoding scenario, the source format is the captured PCM data and the destination format is AAC.
AudioStreamBasicDescription inAudioStreamBasicDescription;
//    FillOutASBDForLPCM()
inAudioStreamBasicDescription.mFormatID = kAudioFormatLinearPCM;
inAudioStreamBasicDescription.mSampleRate = 44100;
inAudioStreamBasicDescription.mBitsPerChannel = 16;  // 16-bit samples
inAudioStreamBasicDescription.mFramesPerPacket = 1;  // uncompressed audio: one frame per packet
inAudioStreamBasicDescription.mBytesPerFrame = 2;    // mono, 16 bits per sample
inAudioStreamBasicDescription.mBytesPerPacket = inAudioStreamBasicDescription.mBytesPerFrame * inAudioStreamBasicDescription.mFramesPerPacket;
inAudioStreamBasicDescription.mChannelsPerFrame = 1; // mono
inAudioStreamBasicDescription.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsNonInterleaved;
inAudioStreamBasicDescription.mReserved = 0;
AudioStreamBasicDescription outAudioStreamBasicDescription = {0}; // always zero-initialize a new ASBD
outAudioStreamBasicDescription.mChannelsPerFrame = 1;
outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;
UInt32 size = sizeof(outAudioStreamBasicDescription);
// let Core Audio fill in the remaining AAC fields
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &outAudioStreamBasicDescription);
OSStatus status = AudioConverterNew(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, &_audioConverter);
if (status != 0) { NSLog(@"setup converter failed: %d", (int)status); }
This creates the AAC encoder. By default, Apple creates a hardware encoder; if the hardware encoder is unavailable, a software encoder is created instead.
In my testing, the hardware AAC encoder has high latency: it buffers roughly 2 seconds of data before it starts encoding. The software encoder's latency is normal; it starts encoding as soon as it has been fed 1024 samples.
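For context, the 1024-sample figure is the AAC frame size: one AAC packet always consumes 1024 PCM samples. The arithmetic behind the buffering numbers can be sketched in plain C (helper names here are hypothetical, not part of any API):

```c
#include <stddef.h>

// One AAC packet always consumes 1024 PCM samples.
enum { kSamplesPerAACPacket = 1024 };

// Bytes of PCM the encoder needs before it can emit one AAC packet.
static size_t aac_frame_bytes(size_t bytesPerSample, size_t channels) {
    return kSamplesPerAACPacket * bytesPerSample * channels;
}

// Duration of one AAC frame in milliseconds at the given sample rate.
static double aac_frame_ms(double sampleRate) {
    return 1000.0 * kSamplesPerAACPacket / sampleRate;
}
```

For the 16-bit mono, 44.1 kHz format used in this article, that is 2048 bytes and roughly 23 ms of audio per AAC packet.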
So how do you request the software encoder explicitly at creation time? Use the following API:
- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer
{
    static AudioClassDescription desc;
    UInt32 encoderSpecifier = type;
    OSStatus st;
    UInt32 size;
    st = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier),
                                    &encoderSpecifier,
                                    &size);
    if (st) {
        NSLog(@"error getting audio format property info: %d", (int)(st));
        return nil;
    }
    unsigned int count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    st = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                sizeof(encoderSpecifier),
                                &encoderSpecifier,
                                &size,
                                descriptions);
    if (st) {
        NSLog(@"error getting audio format property: %d", (int)(st));
        return nil;
    }
    for (unsigned int i = 0; i < count; i++) {
        if ((type == descriptions[i].mSubType) &&
            (manufacturer == descriptions[i].mManufacturer)) {
            memcpy(&desc, &(descriptions[i]), sizeof(desc));
            return &desc;
        }
    }
    return nil;
}
AudioClassDescription *desc = [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC
                                                    fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription,
                                            1, desc, &_audioConverter);
For encoding to work correctly, the encode bitrate must be set; otherwise encoding returns error code 560226676 ('!dat').
UInt32 ulBitRate = 64000;
UInt32 ulSize = sizeof(ulBitRate);
status = AudioConverterSetProperty(_audioConverter, kAudioConverterEncodeBitRate, ulSize, &ulBitRate);
Note that AAC does not accept arbitrary bitrates. For example, if the PCM sample rate is 44100 Hz, the bitrate can be set to 64000 bps; at 16 kHz, it can be set to 32000 bps.
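A minimal sketch of that pairing in plain C (the helper name and the cutoff are illustrative assumptions drawn from the two examples above, not a Core Audio rule):

```c
// Map a PCM sample rate to an AAC bitrate known to be accepted,
// following the two working combinations mentioned in the text.
// The 44.1 kHz cutoff is an assumption, not a Core Audio rule.
static unsigned int aacBitRateForSampleRate(unsigned int sampleRate) {
    return (sampleRate >= 44100) ? 64000 : 32000;
}
```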
After creating the Converter and setting the bitrate, you can query the maximum encoded output size, which is needed later:
UInt32 value = 0;
size = sizeof(value);
AudioConverterGetProperty(_audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &value);
The value retrieved is the maximum packet size the encoder can output.
Then call AudioConverterFillComplexBuffer to encode:
AudioBufferList outAudioBufferList = {0};
outAudioBufferList.mNumberBuffers = 1;
outAudioBufferList.mBuffers[0].mNumberChannels = 1;
outAudioBufferList.mBuffers[0].mDataByteSize = value; // the maximum output packet size queried above
outAudioBufferList.mBuffers[0].mData = malloc(value);
UInt32 ioOutputDataPacketSize = 1;
status = AudioConverterFillComplexBuffer(_audioConverter, inInputDataProc, (__bridge void *)(self), &ioOutputDataPacketSize, &outAudioBufferList, NULL);
In this call, inInputDataProc is a callback that feeds PCM data to the Converter. Setting ioOutputDataPacketSize to 1 means the call returns as soon as one packet of encoded data has been produced, and outAudioBufferList receives the encoded output.
The callback is implemented as follows:
static OSStatus inInputDataProc(AudioConverterRef inAudioConverter, UInt32 *ioNumberDataPackets, AudioBufferList *ioData, AudioStreamPacketDescription **outDataPacketDescription, void *inUserData)
{
    AACEncoder *encoder = (__bridge AACEncoder *)(inUserData);
    UInt32 requestedPackets = *ioNumberDataPackets;
    uint8_t *buffer;
    uint32_t bufferLength = requestedPackets * 2; // 16-bit mono PCM: 2 bytes per sample
    uint32_t bufferRead;
    bufferRead = [encoder.pcmPool readBuffer:&buffer withLength:bufferLength];
    if (bufferRead == 0) {
        // no PCM available yet; stop this conversion pass
        *ioNumberDataPackets = 0;
        return -1;
    }
    ioData->mBuffers[0].mData = buffer;
    ioData->mBuffers[0].mDataByteSize = bufferRead;
    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0].mNumberChannels = 1;
    *ioNumberDataPackets = bufferRead >> 1; // convert bytes back to a sample (packet) count
    return noErr;
}
pcmPool is a ring buffer used to hold PCM data.
Because each capture callback does not necessarily deliver 1024 samples, the incoming data is buffered, and the encoder is invoked only once 1024 samples are available.
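That buffering step can be sketched as a plain C ring buffer (a hypothetical stand-in for pcmPool, assuming 16-bit mono samples; the real pool implementation is not shown in this article):

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_CAPACITY 8192  // in samples; must exceed one AAC frame (1024)

typedef struct {
    int16_t data[POOL_CAPACITY];
    size_t head;   // next write position
    size_t tail;   // next read position
    size_t count;  // samples currently buffered
} PcmPool;

// Append captured samples; returns how many were actually stored.
static size_t pool_write(PcmPool *p, const int16_t *samples, size_t n) {
    size_t written = 0;
    while (written < n && p->count < POOL_CAPACITY) {
        p->data[p->head] = samples[written++];
        p->head = (p->head + 1) % POOL_CAPACITY;
        p->count++;
    }
    return written;
}

// Pop exactly 1024 samples (one AAC frame) if available; returns 1 on success.
static int pool_read_frame(PcmPool *p, int16_t out[1024]) {
    if (p->count < 1024) return 0; // not enough data yet; keep buffering
    for (size_t i = 0; i < 1024; i++) {
        out[i] = p->data[p->tail];
        p->tail = (p->tail + 1) % POOL_CAPACITY;
        p->count--;
    }
    return 1;
}
```

Capture callbacks would call pool_write with whatever they received, and the encoder is driven only when pool_read_frame succeeds.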
Also, for TS files, each AAC frame needs an ADTS header prepended. The ADTS header is 7 bytes of data; it carries the AAC encoding parameters, which lets a decoder decode the stream.
The ADTS header is computed as follows:
- (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength {
    int adtsLength = 7;
    char *packet = (char *)malloc(sizeof(char) * adtsLength);
    int profile = 2;  // AAC LC (39 = AAC ELD)
    int freqIdx = 8;  // sampling frequency index: 8 = 16 kHz (use 4 for 44.1 kHz)
    int chanCfg = 1;  // MPEG-4 channel configuration: 1 = mono, front-center
    NSUInteger fullLength = adtsLength + packetLength; // header + AAC payload
    // fill in the ADTS header
    packet[0] = (char)0xFF; // syncword high bits: 11111111
    packet[1] = (char)0xF9; // syncword low bits, MPEG-2 ID, layer 00, no CRC
    packet[2] = (char)(((profile-1)<<6) + (freqIdx<<2) + (chanCfg>>2));
    packet[3] = (char)(((chanCfg&3)<<6) + (fullLength>>11));
    packet[4] = (char)((fullLength&0x7FF) >> 3);
    packet[5] = (char)(((fullLength&7)<<5) + 0x1F);
    packet[6] = (char)0xFC;
    NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
    return data;
}
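As a sanity check on the bit packing above, the same 7-byte header can be built and then parsed back in plain C (helper names are hypothetical; freqIdx 4 corresponds to 44.1 kHz in the standard ADTS sampling frequency table):

```c
#include <stddef.h>
#include <stdint.h>

// Pack a 7-byte ADTS header with the same bit layout as adtsDataForPacketLength.
static void adts_pack(uint8_t h[7], int profile, int freqIdx, int chanCfg, size_t packetLength) {
    size_t fullLength = 7 + packetLength; // header + AAC payload
    h[0] = 0xFF;                          // syncword high bits
    h[1] = 0xF9;                          // syncword low bits, MPEG-2 ID, no CRC
    h[2] = (uint8_t)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
    h[3] = (uint8_t)(((chanCfg & 3) << 6) + (fullLength >> 11));
    h[4] = (uint8_t)((fullLength & 0x7FF) >> 3);
    h[5] = (uint8_t)(((fullLength & 7) << 5) + 0x1F);
    h[6] = 0xFC;
}

// Recover the fields a decoder would read back out of the header.
static void adts_parse(const uint8_t h[7], int *profile, int *freqIdx,
                       int *chanCfg, size_t *frameLength) {
    *profile = ((h[2] >> 6) & 0x3) + 1;
    *freqIdx = (h[2] >> 2) & 0xF;
    *chanCfg = ((h[2] & 0x1) << 2) | ((h[3] >> 6) & 0x3);
    *frameLength = ((size_t)(h[3] & 0x3) << 11) | ((size_t)h[4] << 3) | ((h[5] >> 5) & 0x7);
}
```

Round-tripping a header through these two functions confirms that profile, frequency index, channel configuration, and the 13-bit frame length all land in the right bit positions.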