These are my notes on the live-stream publishing (push) technology I have been studying recently.
The main iOS publishing flow is as follows:
Capture produces a CMSampleBufferRef for every frame; the next step is to encode that CMSampleBufferRef. Encoding can be done in software or in hardware. Software encoding uses FFmpeg and runs on the CPU, so it is less efficient than hardware encoding; hardware encoding runs on dedicated hardware, and Apple exposes it through AudioToolbox and VideoToolbox. Only hardware encoding is covered below.
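Capture itself is beyond the scope of this note, but for context, here is a minimal sketch of how each captured frame could be handed to the encoders described below, via an AVCaptureVideoDataOutput sample-buffer delegate (the videoEncoder property is an assumption for illustration):

// Sketch only: assumes an AVCaptureSession already configured with an
// AVCaptureVideoDataOutput whose delegate is this object (requires AVFoundation)
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    // Millisecond wall-clock timestamp, passed through to keep publish order
    uint64_t now = (uint64_t)([[NSDate date] timeIntervalSince1970] * 1000);
    // self.videoEncoder is a hypothetical VideoHWEncoder property; an
    // AVCaptureAudioDataOutput would feed the audio encoder the same way
    [self.videoEncoder encode:sampleBuffer timeStamp:now];
}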
Video encoding
Simply put, VTCompressionSessionEncodeFrame does the encoding: it takes raw frame data in and produces encoded data. Before calling it, though, you need to initialize the relevant objects, prepare the data to encode, and set the encoding parameters. Specifically, first initialize a VTCompressionSessionRef, which is an opaque pointer type. The initialization call is:
VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH2641, (__bridge void*)self, &compressSession);
Here didCompressH2641 is the encoding callback: it is invoked each time a frame finishes encoding.
VTCompressionSessionRef compressSession is the object being initialized. Next, set the relevant properties on it, such as the input frame rate, bitrate, GOP size, and codec (H.264), so that VideoToolbox knows how to encode. Finally call
VTCompressionSessionPrepareToEncodeFrames(compressSession); to mark the end of configuration. Here is an example:
/// Called when this Objective-C object is initialized; the object can be treated as an encoder.
- (void)setupCompressionSession {
    // A serial queue guarantees frames are encoded in order (the original used a
    // global concurrent queue, which does not give that guarantee)
    aQueue = dispatch_queue_create("video.encode.queue", DISPATCH_QUEUE_SERIAL);
    // 1. Frame counter
    _frameID = 0;
    // 2. Video dimensions
    int width = _configuration.width, height = _configuration.height;
    // 3. Create the compression session that encodes the frames
    //    kCMVideoCodecType_H264: encode as H.264
    //    didCompressH2641: called back each time an encode finishes; the encoded
    //    data can be written to a file there
    VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH2641, (__bridge void *)self, &compressSession);
    // 4. Real-time encoding output (live streaming must be real-time, otherwise latency builds up)
    VTSessionSetProperty(compressSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
    // 5. Expected frame rate (frames per second; too low and the picture stutters)
    int fps = _configuration.fps;
    CFNumberRef fpsRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fps);
    VTSessionSetProperty(compressSession, kVTCompressionPropertyKey_ExpectedFrameRate, fpsRef);
    // 6. Bitrate (higher bitrate gives a clearer picture; too low causes blocking
    //    artifacts. A high bitrate preserves the source better but costs bandwidth)
    int bitRate = _configuration.bitRate;
    CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
    VTSessionSetProperty(compressSession, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
    // Hard cap: at most bitRate * 1.5 bits, i.e. (bitRate * 1.5 / 8) bytes, per 1 second
    NSArray *limit = @[@(bitRate * 1.5 / 8), @(1)];
    VTSessionSetProperty(compressSession, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFArrayRef)limit);
    // 7. Keyframe (GOP) interval
    int frameInterval = _configuration.keyframeInterval;
    CFNumberRef frameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &frameInterval);
    VTSessionSetProperty(compressSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, frameIntervalRef);
    // 8. Configuration done; get ready to encode
    VTCompressionSessionPrepareToEncodeFrames(compressSession);
}
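Other VTSessionSetProperty keys can be applied at the same point. For example, pinning the H.264 profile/level (optional; not part of the original setup):

// Optional extra: constrain the encoder to the H.264 Main profile, auto level
VTSessionSetProperty(compressSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Main_AutoLevel);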
Once setup is done, call the encode function VTCompressionSessionEncodeFrame. First convert the raw CMSampleBufferRef to a CVImageBufferRef, which wraps the pixel data of each frame.
Also set a timestamp, mainly to keep publishing and playback in the correct order.
The code:
// Pass in the current time with every frame
- (void)encode:(CMSampleBufferRef)sampleBuffer timeStamp:(uint64_t)timestamp
{
    // Run on the serial queue to guarantee encode order
    dispatch_sync(aQueue, ^{
        _frameID++;
        // Get the CVImageBuffer holding the frame's pixel data
        CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
        // Presentation timestamp: the frame index on a 1000 timescale
        CMTime presentationTimeStamp = CMTimeMake(_frameID, 1000);
        //CMTime duration = CMTimeMake(1, DURATION);
        VTEncodeInfoFlags flags;
        NSDictionary *properties = nil;
        // Force a keyframe once per GOP
        if (_frameID % (int32_t)_configuration.keyframeInterval == 0) {
            properties = @{(__bridge NSString *)kVTEncodeFrameOptionKey_ForceKeyFrame: @YES};
        }
        NSNumber *timeNumber = @(timestamp);
        // Pass the frame to the encoder; timeNumber is retained here and
        // released with __bridge_transfer in the callback
        OSStatus statusCode = VTCompressionSessionEncodeFrame(compressSession,
                                                              imageBuffer,
                                                              presentationTimeStamp,
                                                              kCMTimeInvalid,
                                                              (__bridge CFDictionaryRef)properties,
                                                              (__bridge_retained void *)timeNumber,
                                                              &flags);
        // Check for error
        if (statusCode != noErr) {
            NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);
            return;
        }
        NSLog(@"H264: VTCompressionSessionEncodeFrame Success");
    });
}
When a frame finishes encoding, the callback registered earlier (didCompressH2641) fires. Note that Apple's encoder is not strictly one frame in, one frame out: for encoding efficiency, and because I, P, and B frames reference each other, it may accumulate a number of frames before encoding them.
Once you have encoded data, first check whether the frame is a keyframe: a keyframe must be preceded by the SPS & PPS data, which are extracted from the encoded output itself. To write the stream to a file, each NALU also needs a start-code header in front of it.
For example, writing the SPS and PPS:
- (void)gotSpsPps:(NSData *)sps pps:(NSData *)pps
{
    // 1. Build the NALU start-code header
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1;
    NSData *byteHeader = [NSData dataWithBytes:bytes length:length];
    // 2. Write NALU header + NALU payload to the file
    //    (self.fileHandle is the file handle)
    [self.fileHandle writeData:byteHeader];
    [self.fileHandle writeData:sps];
    [self.fileHandle writeData:byteHeader];
    [self.fileHandle writeData:pps];
}
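self.fileHandle above is assumed to be an NSFileHandle already open for writing; one way to create it (the file path here is purely illustrative):

// Sketch: create/overwrite a file and open a handle for writing (hypothetical path)
NSString *path = [NSTemporaryDirectory() stringByAppendingPathComponent:@"capture.h264"];
[[NSFileManager defaultManager] removeItemAtPath:path error:nil];
[[NSFileManager defaultManager] createFileAtPath:path contents:nil attributes:nil];
self.fileHandle = [NSFileHandle fileHandleForWritingAtPath:path];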
Writing the I/P/B frames:
- (void)gotEncodedData:(NSData *)data isKeyFrame:(BOOL)isKeyFrame
{
    if (self.fileHandle != nil)
    {
        const char bytes[] = "\x00\x00\x00\x01";
        size_t length = (sizeof bytes) - 1; // string literals have an implicit trailing '\0'
        NSData *byteHeader = [NSData dataWithBytes:bytes length:length];
        [self.fileHandle writeData:byteHeader];
        [self.fileHandle writeData:data];
    }
}
If you are only publishing the stream, the two file-writing steps above can be skipped. Here is the encoding-completion callback:
// Encoding-completion callback; sampleBuffer holds the encoded data
void didCompressH2641(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    // 1. Bail out on error
    if (status != noErr) {
        return;
    }
    // 2. Recover the encoder object passed in at session creation
    VideoHWEncoder *encoder = (__bridge VideoHWEncoder *)outputCallbackRefCon;
    // Take ownership of the NSNumber retained in encode:timeStamp:
    uint64_t timeStamp = [((__bridge_transfer NSNumber *)sourceFrameRefCon) longLongValue];
    // 3. A sample is a keyframe when the kCMSampleAttachmentKey_NotSync key is absent
    bool isKeyframe = !CFDictionaryContainsKey((CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), kCMSampleAttachmentKey_NotSync);
    // Extract the SPS & PPS from the first keyframe
    if (isKeyframe && !encoder->sps)
    {
        // The encoded stream's parameters live in the format description
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        // SPS (parameter set index 0)
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        // PPS (parameter set index 1)
        size_t pparameterSetSize, pparameterSetCount;
        const uint8_t *pparameterSet;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
        // Wrap SPS/PPS in NSData so they are easy to write out
        NSData *sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
        NSData *pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
        encoder->sps = sps;
        encoder->pps = pps;
        // Write them to the file
        [encoder gotSpsPps:sps pps:pps];
    }
    // Get the encoded data block
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4; // the first four bytes of each NALU are not a 0001 start code but a big-endian length field
        // Loop because the buffer may contain several NAL units
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            uint32_t NALUnitLength = 0;
            // Read the NAL unit length
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            // Convert from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            NSData *data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder gotEncodedData:data isKeyFrame:isKeyframe];
            // Stash the data in an LFVideoFrame for the publishing stage
            LFVideoFrame *frame = [LFVideoFrame new];
            // Keyframe flag
            frame.isKeyFrame = isKeyframe;
            // PPS data
            frame.pps = encoder->pps;
            // SPS data
            frame.sps = encoder->sps;
            // The I/P/B NAL payload
            frame.data = data;
            // Current timestamp
            frame.timestamp = timeStamp;
            // Hand the frame to the delegate (usually a view controller), which then publishes it
            if (encoder.delegate) {
                [encoder.delegate encodedVideo:encoder videoFrame:frame];
            }
            // Move to the next NAL unit in the block buffer
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
    }
}
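One thing the code above omits: when streaming stops, the session should be flushed and released. A minimal teardown sketch (not in the original code; the method name is an assumption):

- (void)teardownCompressionSession {
    if (compressSession) {
        // Flush frames the encoder is still holding, then invalidate and release
        VTCompressionSessionCompleteFrames(compressSession, kCMTimeInvalid);
        VTCompressionSessionInvalidate(compressSession);
        CFRelease(compressSession);
        compressSession = NULL;
    }
}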
That wraps up video encoding; next is audio encoding.
Audio encoding
The video-encoding code above is wrapped in a VideoHWEncoder class; the audio-encoding code likewise goes into its own class, named AudioHWEncoder here. The flow is much like the video path: first set the encoding parameters and initialize the relevant objects, then call the encode function and handle the encoded data in the callback.
The parameters to set are the channel count, the sample rate, the output format (AAC), and the bits per sample:
// Configure the encoder parameters
- (void)setupEncoderFromSampleBuffer:(CMSampleBufferRef)sampleBuffer
{
    NSLog(@"Configuring encoder parameters...");
    // Read the source (input) audio format
    AudioStreamBasicDescription inAudioStreamBasicDescription = *CMAudioFormatDescriptionGetStreamBasicDescription((CMAudioFormatDescriptionRef)CMSampleBufferGetFormatDescription(sampleBuffer));
    AudioStreamBasicDescription outAudioStreamBasicDescription = {0};
    // The fields below describe the output, i.e. encoded, format
    // Sample rate
    outAudioStreamBasicDescription.mSampleRate = inAudioStreamBasicDescription.mSampleRate;
    sampleRate = (NSInteger)inAudioStreamBasicDescription.mSampleRate;
    channelsCount = (NSInteger)inAudioStreamBasicDescription.mChannelsPerFrame;
    // Format: kAudioFormatMPEG4AAC = 'aac '
    outAudioStreamBasicDescription.mFormatID = kAudioFormatMPEG4AAC;
    // Format flags: the AAC-LC (low-complexity) profile
    outAudioStreamBasicDescription.mFormatFlags = kMPEG4Object_AAC_LC;
    // Bytes per packet; 0 = variable
    outAudioStreamBasicDescription.mBytesPerPacket = 0;
    // Frames per packet; AAC uses 1024
    outAudioStreamBasicDescription.mFramesPerPacket = 1024;
    // Bytes per frame; 0 = variable
    outAudioStreamBasicDescription.mBytesPerFrame = 0;
    // 1 = mono, 2 = stereo
    outAudioStreamBasicDescription.mChannelsPerFrame = 1;
    // Bits per sample per channel; 0 for compressed formats
    outAudioStreamBasicDescription.mBitsPerChannel = 0;
    // Reserved (struct padding)
    outAudioStreamBasicDescription.mReserved = 0;
    // Pick an encoder
    //AudioClassDescription *description = [self getAudioClassDescriptionWithType:kAudioFormatMPEG4AAC fromManufacturer:kAppleSoftwareAudioCodecManufacturer];
    // Create the encoder
    /*
     inAudioStreamBasicDescription: the source audio format
     outAudioStreamBasicDescription: the target audio format
     third parameter: the number of encoder descriptions passed in
     description: the encoder description; pass none to let the system pick a default
     */
    //OSStatus status = AudioConverterNewSpecific(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, 1, description, &_audioConverter);
    OSStatus status = AudioConverterNew(&inAudioStreamBasicDescription, &outAudioStreamBasicDescription, &_audioConverter);
    if (status != noErr) {
        NSLog(@"Failed to create the encoder");
        return;
    }
    // Size the output buffer to the converter's maximum output packet size
    UInt32 value = 0;
    UInt32 size = sizeof(value);
    AudioConverterGetProperty(_audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &value);
    _aacBufferSize = value;
    _aacBuffer = malloc(value);
}
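The commented-out AudioConverterNewSpecific path needs an AudioClassDescription that picks a specific codec (for example, Apple's software AAC encoder). Here is a sketch of the getAudioClassDescriptionWithType:fromManufacturer: helper it references, based on the standard AudioFormat query pattern; treat it as an illustration:

- (AudioClassDescription *)getAudioClassDescriptionWithType:(UInt32)type
                                           fromManufacturer:(UInt32)manufacturer {
    static AudioClassDescription desc;
    UInt32 encoderSpecifier = type;
    UInt32 size;
    // Ask how many encoders can produce this format
    OSStatus status = AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders,
                                                 sizeof(encoderSpecifier), &encoderSpecifier, &size);
    if (status != noErr) { return nil; }
    unsigned int count = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[count];
    // Fetch the encoder descriptions and pick the one from the requested manufacturer
    status = AudioFormatGetProperty(kAudioFormatProperty_Encoders,
                                    sizeof(encoderSpecifier), &encoderSpecifier, &size, descriptions);
    if (status != noErr) { return nil; }
    for (unsigned int i = 0; i < count; i++) {
        if (type == descriptions[i].mSubType && manufacturer == descriptions[i].mManufacturer) {
            memcpy(&desc, &descriptions[i], sizeof(desc));
            return &desc;
        }
    }
    return nil;
}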
After setup, encoding can begin:
// Encode one buffer of PCM data
- (void)encode:(CMSampleBufferRef)sampleBuffer timeStamp:(uint64_t)timeStamp {
    CFRetain(sampleBuffer);
    dispatch_sync(_encoderQueue, ^{
        if (!self.audioConverter) {
            // Configure the encoder parameters on first use
            [self setupEncoderFromSampleBuffer:sampleBuffer];
        }
        // Get the CMBlockBufferRef holding the PCM data
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        CFRetain(blockBuffer);
        // Fill _pcmBufferSize and _pcmBuffer
        OSStatus status = CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &self->_pcmBufferSize, &self->_pcmBuffer);
        if (status != kCMBlockBufferNoErr) {
            NSLog(@"Failed to get the PCM buffer");
            // Balance the retains above before bailing out
            CFRelease(blockBuffer);
            CFRelease(sampleBuffer);
            return;
        }
        // Clear the output buffer
        memset(self->_aacBuffer, 0, self->_aacBufferSize);
        // Set up the output buffer list
        AudioBufferList outAudioBufferList = {0};
        // Number of buffers
        outAudioBufferList.mNumberBuffers = 1;
        // Number of channels
        outAudioBufferList.mBuffers[0].mNumberChannels = 1;
        // Buffer size
        outAudioBufferList.mBuffers[0].mDataByteSize = (int)self->_aacBufferSize;
        // Buffer contents
        outAudioBufferList.mBuffers[0].mData = self->_aacBuffer;
        // Encode
        AudioStreamPacketDescription *outPD = NULL;
        UInt32 inPutSize = 1;
        /*
         inInputDataProc: our own callback that supplies the PCM input
         self: user data handed to the callback
         inPutSize: in/out, the number of output packets requested and produced
         outAudioBufferList: receives the encoded output
         outPD: receives the output packet descriptions (not needed here)
         */
        status = AudioConverterFillComplexBuffer(self->_audioConverter,
                                                 inInputDataProc,
                                                 (__bridge void *)self,
                                                 &inPutSize,
                                                 &outAudioBufferList,
                                                 outPD);
        // Encoding finished
        NSData *data = nil;
        if (status == noErr) {
            // Grab the raw AAC data from the output buffer
            NSData *rawAAC = [NSData dataWithBytes:outAudioBufferList.mBuffers[0].mData length:outAudioBufferList.mBuffers[0].mDataByteSize];
            // Prepend an ADTS header. If you only publish the stream, skip this:
            // the data pushed to the server must not carry ADTS headers
            NSData *adtsHeader = [self adtsDataForPacketLength:rawAAC.length];
            NSMutableData *fullData = [NSMutableData dataWithData:adtsHeader];
            [fullData appendData:rawAAC];
            data = fullData;
            // Assign rawAAC (not fullData) to LFAudioFrame.data for publishing
            LFAudioFrame *frame = [LFAudioFrame new];
            frame.data = rawAAC;
            frame.timestamp = timeStamp;
            // audioInfo is the two-byte AudioSpecificConfig used later when publishing
            char exeData[2];
            NSInteger sampleRateIndex = [self sampleRateIndex:sampleRate];
            exeData[0] = 0x10 | ((sampleRateIndex >> 1) & 0x7);
            exeData[1] = ((sampleRateIndex & 0x1) << 7) | ((channelsCount & 0xF) << 3);
            frame.audioInfo = [NSData dataWithBytes:exeData length:2];
            // Hand the frame back to the delegate (usually the view controller)
            if (self.delegate) {
                [self.delegate encodedAudio:self audioFrame:frame];
            }
            NSLog(@"Output AAC data length: %lu", (unsigned long)rawAAC.length);
        } else {
            NSLog(@"Encoding error");
        }
        // Optional completion callback
        // if (completionBlock) {
        //     dispatch_async(_callBackQueue, ^{
        //         completionBlock(data, nil);
        //     });
        // }
        // Write to file //TODO:
        //[self.audioFileHandle writeData:data];
        CFRelease(sampleBuffer);
        CFRelease(blockBuffer);
    });
}
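AudioConverterFillComplexBuffer pulls its input through the inInputDataProc callback, which the original does not show. A minimal sketch of what it typically looks like, assuming it lives in AudioHWEncoder.m where the _pcmBuffer/_pcmBufferSize ivars are visible:

// Sketch: feed the converter the PCM captured in encode:timeStamp:
static OSStatus inInputDataProc(AudioConverterRef inAudioConverter,
                                UInt32 *ioNumberDataPackets,
                                AudioBufferList *ioData,
                                AudioStreamPacketDescription **outDataPacketDescription,
                                void *inUserData) {
    AudioHWEncoder *encoder = (__bridge AudioHWEncoder *)inUserData;
    if (encoder->_pcmBufferSize == 0) {
        // No PCM left: report zero packets so the converter stops pulling
        *ioNumberDataPackets = 0;
        return -1;
    }
    // Hand over the PCM buffer prepared in encode:timeStamp:
    ioData->mBuffers[0].mData = encoder->_pcmBuffer;
    ioData->mBuffers[0].mDataByteSize = (UInt32)encoder->_pcmBufferSize;
    ioData->mBuffers[0].mNumberChannels = 1;
    // Mark the buffer consumed so the next pull reports "no more data"
    encoder->_pcmBufferSize = 0;
    *ioNumberDataPackets = 1;
    return noErr;
}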
// Map a sample rate (Hz) to its AAC sample-rate index (used in ADTS / AudioSpecificConfig)
- (NSInteger)sampleRateIndex:(NSInteger)frequencyInHz {
    static const NSInteger rates[] = {96000, 88200, 64000, 48000, 44100, 32000,
                                      24000, 22050, 16000, 12000, 11025, 8000, 7350};
    for (NSInteger i = 0; i < 13; i++) {
        if (rates[i] == frequencyInHz) {
            return i;
        }
    }
    return 15; // 15 = escape value for rates not in the table
}
// Build the 7-byte ADTS header that must precede each AAC packet in a file
- (NSData *)adtsDataForPacketLength:(NSUInteger)packetLength {
    int adtsLength = 7;
    char *packet = malloc(sizeof(char) * adtsLength);
    int profile = 2;  // AAC-LC
    int freqIdx = 4;  // 44.1 kHz; hardcoded here, should match sampleRateIndex: for other rates
    int chanCfg = 1;  // mono; should match the stream's channel count
    NSUInteger fullLength = adtsLength + packetLength;
    packet[0] = (char)0xFF; // syncword (high bits)
    packet[1] = (char)0xF9; // syncword (low bits), MPEG-2 ID, no CRC
    packet[2] = (char)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
    packet[3] = (char)(((chanCfg & 3) << 6) + (fullLength >> 11));
    packet[4] = (char)((fullLength & 0x7FF) >> 3);
    packet[5] = (char)(((fullLength & 7) << 5) + 0x1F);
    packet[6] = (char)0xFC;
    NSData *data = [NSData dataWithBytesNoCopy:packet length:adtsLength freeWhenDone:YES];
    return data;
}
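As with the video session, the converter and its buffer should eventually be released. A minimal cleanup sketch (the method name is an assumption; call it from dealloc or a stop method):

- (void)teardownAudioConverter {
    if (_audioConverter) {
        AudioConverterDispose(_audioConverter);
        _audioConverter = NULL;
    }
    if (_aacBuffer) {
        free(_aacBuffer);
        _aacBuffer = NULL;
    }
}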
At this point, audio encoding is done as well.