The project has come to a pause for now, and my own understanding is still only partial, but I hope these notes can help beginners like me.
First, a quick map of the background knowledge:
1日熬、h264
視頻編碼處理的最后一步就是熵編碼融欧,在H.264中采用了兩種不同的熵編碼方法:通用可變長編碼(UVLC)和基于文本的自適應(yīng)二進(jìn)制算術(shù)編碼(CABAC)腻扇。
2糠悼、aac
Advanced Audio Coding藐不。一種專為聲音數(shù)據(jù)設(shè)計(jì)的文件壓縮格式滋恬,與MP3不同聊训,它采用了全新的算法進(jìn)行編碼,更加高效恢氯,具有更高的“性價比”带斑。利用AAC格式鼓寺,可使人感覺聲音質(zhì)量沒有明顯降低
3、pcm
音頻采集的原始數(shù)據(jù)勋磕,硬編碼數(shù)據(jù)
4妈候、yuv
視頻采集的原始數(shù)據(jù),硬編碼數(shù)據(jù)
5朋凉、時間戳
直播音視頻同步的關(guān)鍵參數(shù)
6州丹、rtmp推流
直播的推流手段
I. The first task is capture: grab the raw YUV frames and convert them to H.264, grab the raw PCM samples and convert them to AAC, then send both.
1所计、視頻采集,我們要采集最后要轉(zhuǎn)化為h264編碼的格式团秽,需要用到VideoToolbox.framework及AVFoundation.framework
VideoToolbox.framework 的主要工作是編碼主胧,將yuv數(shù)據(jù)編碼為h264。AVFoundation.framework的任務(wù)是采集yuv原始數(shù)據(jù)习勤。
// Frame-delivery callback. The capture-session initialization is not walked through here (there are many examples online); a minimal setup sketch follows this stub.
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
}
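For reference, here is a minimal sketch of the capture-session setup that feeds the callback above. The _session ivar, the preset, and the queue label are my own assumptions, not the original project's code:

#import <AVFoundation/AVFoundation.h>

// Minimal capture setup (sketch). Assumes self adopts
// AVCaptureVideoDataOutputSampleBufferDelegate and declares an AVCaptureSession *_session ivar.
- (void)setupCaptureSession
{
    _session = [[AVCaptureSession alloc] init];
    _session.sessionPreset = AVCaptureSessionPreset1280x720;

    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
    if (input && [_session canAddInput:input]) [_session addInput:input];

    AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
    // Ask for a pixel format the H.264 encoder accepts directly.
    output.videoSettings = @{ (id)kCVPixelBufferPixelFormatTypeKey :
                              @(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange) };
    [output setSampleBufferDelegate:self
                              queue:dispatch_queue_create("video.capture", DISPATCH_QUEUE_SERIAL)];
    if ([_session canAddOutput:output]) [_session addOutput:output];

    [_session startRunning]; // frames now arrive in captureOutput:didOutputSampleBuffer:fromConnection:
}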
(1) Initialize the VTCompressionSession.
When creating a VTCompressionSession you pass the width, the height, and the codec type kCMVideoCodecType_H264. Properties such as frame rate are then set via VTSessionSetProperty, and you must register a callback function that is invoked each time a frame is successfully encoded. With all of that prepared, VTCompressionSessionCreate creates the session.
// Session initialization
- (void)initEncode:(int)width height:(int)height bite:(int)iBite
{
    dispatch_sync(aQueue, ^{
        // Create the H.264 compression session; didCompressH264 is the callback
        // invoked after each frame is encoded.
        OSStatus status = VTCompressionSessionCreate(NULL, width, height, kCMVideoCodecType_H264, NULL, NULL, NULL, didCompressH264, (__bridge void *)(self), &EncodingSession);
        NSLog(@"H264: VTCompressionSessionCreate %d", (int)status);
        if (status != 0)
        {
            NSLog(@"H264: Unable to create a H264 session");
            error = @"H264: Unable to create a H264 session";
            return;
        }
        // Set the properties; the average bitrate largely determines the visual quality.
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse);
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_MaxKeyFrameInterval, (__bridge CFTypeRef _Nonnull)(@(GOP_SIZE)));
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_Main_AutoLevel);
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_AverageBitRate, (__bridge CFTypeRef _Nonnull)@(iBite));
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_ExpectedFrameRate, (__bridge CFTypeRef _Nonnull)@(FRAME_RATE));
        // Hard cap: at most iBite/8 bytes per 1-second window
        VTSessionSetProperty(EncodingSession, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFTypeRef _Nonnull)@[@(iBite / 8), @(1)]);
        // Tell the encoder to start encoding
        VTCompressionSessionPrepareToEncodeFrames(EncodingSession);
    });
}
(2) Feed the raw camera frames to the VTCompressionSession for hardware encoding.
The camera delivers each frame as an unencoded CMSampleBuffer. CMSampleBufferGetImageBuffer extracts the CVPixelBufferRef from it, and VTCompressionSessionEncodeFrame hardware-encodes that frame; on success, the callback registered at session-creation time is invoked automatically.
dispatch_sync(aQueue, ^{
    frameCount++;
    // Extract the raw pixel buffer from the camera's sample buffer and hand it
    // to the session created by VTCompressionSessionCreate.
    CVImageBufferRef imageBuffer = (CVImageBufferRef)CMSampleBufferGetImageBuffer(sampleBuffer);
    // Presentation timestamp: frame index over a 1000 Hz timescale
    CMTime presentationTimeStamp = CMTimeMake(frameCount, 1000);
    VTEncodeInfoFlags flags;
    // Pass it to the encoder
    OSStatus statusCode = VTCompressionSessionEncodeFrame(EncodingSession,
                                                          imageBuffer,
                                                          presentationTimeStamp,
                                                          kCMTimeInvalid,
                                                          NULL, NULL, &flags);
    // Check for error
    if (statusCode != noErr) {
        NSLog(@"H264: VTCompressionSessionEncodeFrame failed with %d", (int)statusCode);
        error = @"H264: VTCompressionSessionEncodeFrame failed ";
        // End the session
        VTCompressionSessionInvalidate(EncodingSession);
        CFRelease(EncodingSession);
        EncodingSession = NULL;
        return;
    }
});
(3) In the callback, convert the successfully encoded CMSampleBuffer into an H.264 stream and send it over the network.
This is essentially the reverse of the hardware-decoding process.
void didCompressH264(void *outputCallbackRefCon, void *sourceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags,
                     CMSampleBufferRef sampleBuffer)
{
    if (status != 0) return;
    if (!CMSampleBufferDataIsReady(sampleBuffer))
    {
        NSLog(@"didCompressH264 data is not ready ");
        return;
    }
    H264Encoder *encoder = (__bridge H264Encoder *)outputCallbackRefCon;
    // Check if we have got a key frame first
    bool keyframe = !CFDictionaryContainsKey((CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), kCMSampleAttachmentKey_NotSync);
    encoder->countFrame++;
    if (keyframe)
    {
        // The SPS/PPS parameter sets live in the format description (the "avcC" atom)
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        size_t sparameterSetSize, sparameterSetCount;
        const uint8_t *sparameterSet;
        OSStatus statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sparameterSet, &sparameterSetSize, &sparameterSetCount, 0);
        if (statusCode == noErr)
        {
            // Found SPS, now check for PPS
            size_t pparameterSetSize, pparameterSetCount;
            const uint8_t *pparameterSet;
            statusCode = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pparameterSet, &pparameterSetSize, &pparameterSetCount, 0);
            if (statusCode == noErr)
            {
                // Found PPS
                encoder->sps = [NSData dataWithBytes:sparameterSet length:sparameterSetSize];
                encoder->pps = [NSData dataWithBytes:pparameterSet length:pparameterSetSize];
                if (encoder->_delegate)
                {
                    [encoder->_delegate gotSpsPps:encoder->sps pps:encoder->pps];
                }
            }
        }
    }
    CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    OSStatus statusCodeRet = CMBlockBufferGetDataPointer(dataBuffer, 0, &length, &totalLength, &dataPointer);
    if (statusCodeRet == noErr) {
        // Walk the AVCC-formatted buffer: each NAL unit is prefixed with its 4-byte length
        size_t bufferOffset = 0;
        static const int AVCCHeaderLength = 4;
        while (bufferOffset < totalLength - AVCCHeaderLength) {
            // Read the NAL unit length
            uint32_t NALUnitLength = 0;
            memcpy(&NALUnitLength, dataPointer + bufferOffset, AVCCHeaderLength);
            // Convert the length value from big-endian to host byte order
            NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
            NSData *data = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + AVCCHeaderLength) length:NALUnitLength];
            [encoder->_delegate gotEncodedData:data isKeyFrame:keyframe];
            // Move to the next NAL unit in the block buffer
            bufferOffset += AVCCHeaderLength + NALUnitLength;
        }
        // All NAL units of this frame have been delivered; signal end of frame
        [encoder->_delegate oneFrameEncodeEnd:keyframe];
    }
}
Note that the header of the stream consists of the SPS and PPS parameter sets. In this callback we have to detect that header information and separate it from the ordinary frame data, sending the header first and the regular video data after it.
Concretely: extract the SPS and PPS parameter sets and prepend the start code to form NALUs; for the frame payload, convert each 4-byte length prefix into a start code, assemble the NALUs, and send them out (a sketch of the gotEncodedData: delegate that does this appears after the two methods below).
Code that sends the video header:
- (void)gotSpsPps:(NSData *)sps pps:(NSData *)pps
{
    frameCount2 = [_h264Encoder getFreameCound];
    const char bytes[] = "\x00\x00\x00\x01";
    size_t length = (sizeof bytes) - 1; // string literals have an implicit trailing '\0'
    NSData *ByteHeader = [NSData dataWithBytes:bytes length:length];
    mysps = sps;
    mypps = pps;
    // Accumulate start code + SPS + start code + PPS into the current frame buffer
    [mutableData appendData:ByteHeader];
    [mutableData appendData:mysps];
    [mutableData appendData:ByteHeader];
    [mutableData appendData:mypps];
    pos = pos + sps.length + pps.length + ByteHeader.length * 2;
    // Send the SPS NALU on its own
    NSMutableData *mutableDataTem1 = [[NSMutableData alloc] init];
    [mutableDataTem1 appendData:ByteHeader];
    [mutableDataTem1 appendData:mysps];
    long tem1 = sps.length + ByteHeader.length;
    [self sendData:sizeof(Byte) * tem1 data:(char *)[mutableDataTem1 bytes]];
    // Then send the PPS NALU
    NSMutableData *mutableDataTem = [[NSMutableData alloc] init];
    [mutableDataTem appendData:ByteHeader];
    [mutableDataTem appendData:mypps];
    long tem = pps.length + ByteHeader.length;
    [self sendData:sizeof(Byte) * tem data:(char *)[mutableDataTem bytes]];
}
Code that sends the frame body:
- (void)oneFrameEncodeEnd:(BOOL)isKeyFrame
{
    // Package the accumulated NALUs of this frame and queue it for sending
    FrameData *frameData = [[FrameData alloc] init];
    frameData.Iframe = isKeyFrame;
    frameData.frame_len = (int)pos;
    frameData.frame_seq = total_vseq;
    frameData.stream_index = 0;
    frameData.frame_data = (Byte *)malloc(sizeof(Byte) * pos);
    memcpy(frameData.frame_data, [mutableData bytes], pos * sizeof(Byte));
    [_videoArray addObject:frameData];
    total_vseq++;
    // Reset the accumulator for the next frame
    mysps = nil;
    mypps = nil;
    [mutableData setLength:0];
    pos = 0;
}
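Neither method above shows the gotEncodedData:isKeyFrame: delegate that didCompressH264 calls for every NAL unit. Here is a minimal sketch, assuming the same mutableData/pos accumulator used by gotSpsPps: above; it simply prepends the Annex B start code to each NALU:

// Sketch: append one NALU (start code + payload) to the current frame buffer.
// Assumes mutableData and pos are the accumulator shared with gotSpsPps:.
- (void)gotEncodedData:(NSData *)data isKeyFrame:(BOOL)isKeyFrame
{
    const char bytes[] = "\x00\x00\x00\x01";
    NSData *ByteHeader = [NSData dataWithBytes:bytes length:(sizeof bytes) - 1];
    [mutableData appendData:ByteHeader];
    [mutableData appendData:data];
    pos = pos + ByteHeader.length + data.length;
}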
Audio capture and sending
The captured PCM data is AAC-encoded before sending; there is plenty of related code online to learn from (a sketch of what the encoder wrapper might look like follows the capture callback below).
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    if (connection == _audioConnection) {
        char szBuf[4096];
        memset(szBuf, 0, sizeof(szBuf));
        // The input format (sample rate, channels, ...) can be read with
        // CMAudioFormatDescriptionGetStreamBasicDescription if the encoder needs it.
        uint32_t nSize = (uint32_t)CMSampleBufferGetTotalSampleSize(sampleBuffer); // assumed to fit in szBuf
        CMBlockBufferRef databuf = CMSampleBufferGetDataBuffer(sampleBuffer);
        if (CMBlockBufferCopyDataBytes(databuf, 0, nSize, szBuf) == kCMBlockBufferNoErr)
        {
            // Feed the PCM to the AAC encoder in 640-byte chunks
            int32_t nOffSet = 0;
            while (nOffSet < (int32_t)nSize)
            {
                char szOutBuf[4096] = {0};
                int nInSize = (nSize - nOffSet >= 640) ? 640 : (nSize - nOffSet);
                int outsize = [ecdoer AACEncoderEncode:lHand inData:szBuf + nOffSet inSize:nInSize outData:szOutBuf maxOutSize:4096];
                if (outsize > 0)
                {
                    [self sendAacDataLen:outsize data:szOutBuf ptsTime:0];
                }
                nOffSet += 640;
            }
        }
    }
}
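The AACEncoderEncode: wrapper used above is a custom class not shown in the original. As an illustration only, here is a sketch of what such an encode step could look like built on AudioToolbox's AudioConverter; the method name, sample rate, and mono channel layout are my assumptions, and a real wrapper would buffer PCM internally until it has the 1024 samples one AAC packet requires:

#import <AudioToolbox/AudioToolbox.h>

// Input callback: hands the converter the PCM chunk we were given.
static OSStatus pcmInputProc(AudioConverterRef inConverter,
                             UInt32 *ioNumberDataPackets,
                             AudioBufferList *ioData,
                             AudioStreamPacketDescription **outDataPacketDescription,
                             void *inUserData)
{
    AudioBuffer *src = (AudioBuffer *)inUserData;
    if (src->mDataByteSize == 0) { *ioNumberDataPackets = 0; return -1; } // no more input
    ioData->mNumberBuffers = 1;
    ioData->mBuffers[0] = *src;
    *ioNumberDataPackets = src->mDataByteSize / 2; // 16-bit mono: 2 bytes per sample packet
    src->mDataByteSize = 0; // mark consumed
    return noErr;
}

// Sketch of one AAC encode step (hypothetical method, not the project's wrapper).
- (int)encodePCM:(char *)inData inSize:(int)inSize outData:(char *)outData maxOutSize:(int)maxOutSize
{
    static AudioConverterRef converter = NULL;
    if (!converter) {
        AudioStreamBasicDescription inFmt = {0};
        inFmt.mSampleRate = 44100; // assumption
        inFmt.mFormatID = kAudioFormatLinearPCM;
        inFmt.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
        inFmt.mBytesPerPacket = 2;
        inFmt.mFramesPerPacket = 1;
        inFmt.mBytesPerFrame = 2;
        inFmt.mChannelsPerFrame = 1; // assumption: mono
        inFmt.mBitsPerChannel = 16;
        AudioStreamBasicDescription outFmt = {0};
        outFmt.mSampleRate = 44100;
        outFmt.mFormatID = kAudioFormatMPEG4AAC;
        outFmt.mChannelsPerFrame = 1;
        outFmt.mFramesPerPacket = 1024; // one AAC packet = 1024 samples
        if (AudioConverterNew(&inFmt, &outFmt, &converter) != noErr) return 0;
    }
    AudioBuffer src = { .mNumberChannels = 1, .mDataByteSize = (UInt32)inSize, .mData = inData };
    AudioBufferList outList = {0};
    outList.mNumberBuffers = 1;
    outList.mBuffers[0].mNumberChannels = 1;
    outList.mBuffers[0].mDataByteSize = (UInt32)maxOutSize;
    outList.mBuffers[0].mData = outData;
    UInt32 outPackets = 1; // ask for one AAC packet
    OSStatus st = AudioConverterFillComplexBuffer(converter, pcmInputProc, &src, &outPackets, &outList, NULL);
    return (st == noErr && outPackets > 0) ? (int)outList.mBuffers[0].mDataByteSize : 0;
}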
Sending the audio data:
- (void)sendAacDataLen:(int)totalLength data:(char *)dataPointer ptsTime:(int64_t)pts
{
    // Return code: 0 on success, 1 on failure
    int ret = WM_RTMPLIVESDK_InputData(WMRtmpLiveDataType_AAC, (const char *)dataPointer, totalLength, [self getNowTime]);
    if (ret == 1) {
        NSLog(@"~~~~~aac~~~~~sendData failed, ret[%d] totalLength[%d]", ret, totalLength);
    }
}
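The getNowTime helper used above is not shown in the original; a plausible implementation simply returns the current wall-clock time in milliseconds, which is the timestamp both the audio and video paths send with:

#include <sys/time.h>

// Sketch: current time in milliseconds, used as the send timestamp.
- (int64_t)getNowTime
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (int64_t)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}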
For RTMP pushing there is also plenty of code online; you can call rtmplib directly, or wrap it in your own C++ library and call that.
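As an illustration only, a minimal connect-and-publish helper on top of librtmp could look like the following; the function name is hypothetical and the error paths are simplified:

#include <librtmp/rtmp.h>

// Sketch: open an RTMP connection in publish mode.
RTMP *openPublishStream(const char *url)
{
    RTMP *rtmp = RTMP_Alloc();
    RTMP_Init(rtmp);
    if (!RTMP_SetupURL(rtmp, (char *)url)) {
        RTMP_Free(rtmp);
        return NULL;
    }
    RTMP_EnableWrite(rtmp); // must be set before connecting in order to publish
    if (!RTMP_Connect(rtmp, NULL) || !RTMP_ConnectStream(rtmp, 0)) {
        RTMP_Close(rtmp);
        RTMP_Free(rtmp);
        return NULL;
    }
    return rtmp; // FLV-tagged packets can now be sent, e.g. with RTMP_Write()
}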
二淘衙、最后的陳述
That is the overall capture-and-send flow. One more detail: both the video and the audio paths stamp their data with the current time at the moment of sending; for testing you can also send at a fixed 20 ms interval to probe latency. The send logic pushes a batch of audio data and then one video frame, because audio packets are far more numerous and a dropped audio frame is immediately audible as a stutter, whereas a single dropped video frame is almost impossible for the eye to notice (a hypothetical sketch of this interleave follows).
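To make the interleave concrete, one round of sending might look like this; the _audioArray queue and the reuse of the sendData: helper are my assumptions, not the project's actual scheduler:

// Hypothetical send round: drain all queued AAC packets, then send one video frame.
- (void)sendOneRound
{
    while (_audioArray.count > 0) {
        FrameData *aac = _audioArray.firstObject;
        [_audioArray removeObjectAtIndex:0];
        [self sendData:aac.frame_len data:(char *)aac.frame_data];
    }
    if (_videoArray.count > 0) {
        FrameData *frame = _videoArray.firstObject;
        [_videoArray removeObjectAtIndex:0];
        [self sendData:frame.frame_len data:(char *)frame.frame_data];
    }
}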
For more of the underlying theory, I recommend this post:
http://www.reibang.com/p/a6530fa46a88