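Earlier articles covered the AAC audio bitstream and the H.264 video bitstream. The next step is to pack them into a container format.
What FLV is used for
FLV (Flash Video) is a popular streaming media format designed and developed by Adobe. Its small file size and simple container structure make it well suited to use on the internet. Besides video playback, it can also be used for live streaming. Files in this container use the .flv extension.
The FLV container format
Like other container formats, an FLV file consists of one file header plus one file body. The structure is shown in the figure below:
File header
The FLV file header is 9 bytes long in total. Its layout is as follows: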
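typedef unsigned char byte;
typedef unsigned int uint;
typedef struct {
    byte Signature[3]; // "FLV"
    byte Version;      // FLV version
    byte Flags;        // stream flags: audio and/or video present
    uint DataOffset;   // size of this header, stored big-endian
} FLV_HEADER;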
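Signature[3] holds the three signature bytes "F" (0x46), "L" (0x4C) and "V" (0x56). Version comes next and takes 1 byte; it gives the FLV version. Flags is the stream flags byte. Counting from the most significant bit, the first 5 bits are reserved and so is the 7th bit (all 0); the 6th bit indicates whether audio tags are present, and the 8th bit indicates whether video tags are present. The remaining 4 bytes give the size of the whole file header, stored big-endian. The data starts at this offset from the beginning of the file.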
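The field layout described above can be sketched as a small decoder that works on the raw 9 bytes. This is only an illustration: the names FlvHeaderInfo and flv_parse_header are made up here, and the bit masks 0x04 (audio) and 0x01 (video) follow from counting the Flags bits from the most significant bit as described.

```c
#include <stddef.h>
#include <string.h>

/* Decoded view of the 9-byte FLV file header (illustrative names). */
typedef struct {
    unsigned char version;
    int has_audio;             /* 6th bit of Flags, mask 0x04 */
    int has_video;             /* 8th bit of Flags, mask 0x01 */
    unsigned long data_offset; /* header size, big-endian in the file */
} FlvHeaderInfo;

/* Returns 0 on success, -1 if the buffer is too short or the signature is wrong. */
int flv_parse_header(const unsigned char *buf, size_t len, FlvHeaderInfo *out)
{
    if (len < 9 || memcmp(buf, "FLV", 3) != 0)
        return -1;
    out->version = buf[3];
    out->has_audio = (buf[4] & 0x04) != 0;
    out->has_video = (buf[4] & 0x01) != 0;
    /* Assemble the 4 big-endian bytes, most significant first. */
    out->data_offset = ((unsigned long)buf[5] << 24) | ((unsigned long)buf[6] << 16)
                     | ((unsigned long)buf[7] << 8)  |  (unsigned long)buf[8];
    return 0;
}
```

Working on a byte buffer rather than reading straight into a struct avoids depending on struct packing and on the byte order of the machine.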
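Parsing the audio tag
The 6th bit of Flags indicates whether audio tags are present. The structure of an audio tag is as follows:
The first byte of the audio tag data carries the audio parameters; the audio stream itself starts at the second byte. As the figure above shows, the high 4 bits of the first byte give the audio codec: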
0 Linear PCM, platform endian
1 ADPCM
2 MP3
3 Linear PCM, little endian
4 Nellymoser 16-kHz mono
5 Nellymoser 8-kHz mono
6 Nellymoser
7 G.711 A-law logarithmic PCM
8 G.711 mu-law logarithmic PCM
9 reserved
10 AAC
11 Speex
14 MP3 8-kHz
15 Device-specific sound
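Bits 5 and 6 give the audio sample rate:
0 5.5 kHz
1 11 kHz
2 22 kHz
3 44 kHz
This field has no code for 48 kHz, so FLV cannot signal a 48 kHz sample rate here.
Bit 7 gives the audio sample size:
0 8 bits
1 16 bits
Bit 8 gives the channel type:
0 sndMono
1 sndStereo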
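The four fields of that first byte can be pulled apart with shifts and masks. The sketch below is a compact illustration of the bit layout just listed; AudioTagInfo and flv_decode_audio_byte are hypothetical names, not part of any library.

```c
/* Fields packed into the first byte of the audio tag data (illustrative names). */
typedef struct {
    int sound_format; /* high 4 bits: codec, e.g. 2 = MP3, 10 = AAC */
    int sound_rate;   /* next 2 bits: 0 = 5.5 kHz ... 3 = 44 kHz */
    int sound_size;   /* next bit: 0 = 8 bits, 1 = 16 bits */
    int sound_type;   /* last bit: 0 = mono, 1 = stereo */
} AudioTagInfo;

AudioTagInfo flv_decode_audio_byte(unsigned char b)
{
    AudioTagInfo a;
    a.sound_format = (b >> 4) & 0x0F;
    a.sound_rate   = (b >> 2) & 0x03;
    a.sound_size   = (b >> 1) & 0x01;
    a.sound_type   =  b       & 0x01;
    return a;
}
```

For example, the byte 0xAF (binary 1010 1111) decodes to format 10 (AAC), rate 3 (44 kHz), size 1 (16 bits) and type 1 (stereo).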
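Parsing the video tag
The 8th bit of Flags indicates whether video tags are present. The structure of a video tag is as follows:
The first byte of the video tag data carries the video parameters; the video stream data starts at the second byte. The high 4 bits of the first byte give the frame type: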
1 keyframe (for AVC, a seekable frame)
2 inter frame (for AVC, a nonseekable frame)
3 disposable inter frame (H.263 only)
4 generated keyframe (reserved for server use)
5 video info/command frame
The low 4 bits of the first byte give the video codec:
1 JPEG (currently unused)
2 Sorenson H.263
3 Screen video
4 On2 VP6
5 On2 VP6 with alpha channel
6 Screen video version 2
7 AVC
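As with audio, the video parameter byte splits into two 4-bit fields. A minimal sketch follows, with hypothetical names (VideoTagInfo, flv_decode_video_byte, flv_is_avc_keyframe) chosen for illustration:

```c
/* Fields packed into the first byte of the video tag data (illustrative names). */
typedef struct {
    int frame_type; /* high 4 bits: 1 = keyframe, 2 = inter frame, ... */
    int codec_id;   /* low 4 bits: 2 = Sorenson H.263, 7 = AVC, ... */
} VideoTagInfo;

VideoTagInfo flv_decode_video_byte(unsigned char b)
{
    VideoTagInfo v;
    v.frame_type = (b >> 4) & 0x0F;
    v.codec_id   =  b       & 0x0F;
    return v;
}

/* Convenience check: is this byte the start of an AVC keyframe? */
int flv_is_avc_keyframe(unsigned char b)
{
    VideoTagInfo v = flv_decode_video_byte(b);
    return v.frame_type == 1 && v.codec_id == 7;
}
```

An H.264 keyframe therefore starts with 0x17 and an H.264 inter frame with 0x27.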
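File body
The structure of the FLV file body is as follows:
The FLV file body is a sequence of back-pointers and tags. Each back-pointer is a 4-byte Previous Tag Size field: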
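To make the alternation of back-pointers and tags concrete, here is a minimal sketch that walks a file body held in memory and counts the complete tags in it. It assumes a 4-byte back-pointer before the first tag and after every tag, and an 11-byte tag header whose bytes 1 to 3 hold the data size in big-endian order. The function name flv_count_tags is made up for this example.

```c
#include <stddef.h>

/* Counts complete tags in an FLV file body held in memory.
 * body must point at the first back-pointer, i.e. just past the file header. */
int flv_count_tags(const unsigned char *body, size_t len)
{
    size_t pos = 4; /* skip the first back-pointer (it precedes the first tag) */
    int count = 0;

    if (len < 4)
        return 0;
    while (len - pos >= 11) {
        /* Bytes 1..3 of the tag header: size of the tag data, big-endian. */
        size_t data_size = ((size_t)body[pos + 1] << 16)
                         | ((size_t)body[pos + 2] << 8)
                         |  (size_t)body[pos + 3];
        size_t total = 11 + data_size + 4; /* header + data + back-pointer */
        if (len - pos < total)
            break; /* truncated tag: stop here */
        pos += total;
        count++;
    }
    return count;
}
```

The walk never needs the back-pointer values themselves; they only matter when seeking backwards through the file.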
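Tag
A tag likewise consists of a header and data. The tag header holds the following fields:
A tag can be video, audio, or script. Video and audio tags were covered above; the Script Tag is introduced below. As a struct, the tag header looks like this: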
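typedef unsigned char byte;
typedef unsigned int uint;
typedef struct {
    byte TagType;      // 8 = audio, 9 = video, 18 = script
    byte DataSize[3];  // size of the tag data, big-endian
    byte Timestamp[3]; // timestamp in milliseconds, big-endian
    uint Reserved;     // TimestampExtended (1 byte) + StreamID (3 bytes)
} TAG_HEADER;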
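The 11 bytes of the tag header can also be decoded field by field from a raw buffer. The sketch below assumes the 4 bytes that the struct calls Reserved are laid out as in the FLV specification: 1 byte of TimestampExtended (the upper 8 bits of the timestamp) followed by a 3-byte StreamID. TagHeaderInfo and flv_parse_tag_header are illustrative names.

```c
/* Decoded view of the 11-byte tag header (illustrative names). */
typedef struct {
    unsigned char tag_type;     /* 8 = audio, 9 = video, 18 = script */
    unsigned long data_size;    /* number of bytes of tag data */
    unsigned long timestamp_ms; /* 24-bit timestamp plus 8-bit extension */
    unsigned long stream_id;    /* 3 bytes, normally 0 */
} TagHeaderInfo;

/* h must point at 11 readable bytes. */
TagHeaderInfo flv_parse_tag_header(const unsigned char *h)
{
    TagHeaderInfo t;
    t.tag_type  = h[0];
    t.data_size = ((unsigned long)h[1] << 16) | ((unsigned long)h[2] << 8) | h[3];
    /* The extension byte h[7] supplies the top 8 bits of the timestamp. */
    t.timestamp_ms = ((unsigned long)h[7] << 24) | ((unsigned long)h[4] << 16)
                   | ((unsigned long)h[5] << 8)  |  (unsigned long)h[6];
    t.stream_id = ((unsigned long)h[8] << 16) | ((unsigned long)h[9] << 8) | h[10];
    return t;
}
```

With the extension byte at 0, the timestamp is simply the 24-bit value, which is what the parser later in this article computes.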
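Script Tag
This tag type is also called the MetaData Tag. It stores metadata about the FLV video and audio, such as duration, width and height. It is usually the first tag in the FLV file, immediately after the file header, and there is only one of it. The structure of its tag data is shown below:
First AMF package: byte 1 is the AMF type, usually 0x02, meaning string. Bytes 2 and 3 give the string length, usually 0x000A. The bytes after that are the string itself (typically "onMetaData", which is 10 characters long).
Second AMF package: byte 1 is the AMF type, usually 0x08, meaning array. Bytes 2 to 5 give the number of array elements, followed by the encoded elements. Each element is a (name, value) pair: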
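duration            duration
width               video width
height              video height
videodatarate       video bitrate
framerate           video frame rate
videocodecid        video codec
audiosamplerate     audio sample rate
audiosamplesize     audio sample size
stereo              whether the audio is stereo
audiocodecid        audio codec
filesize            file size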
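The start of the script tag data can be decoded with two small helpers: one for the string package and one for the array header. This is a sketch under the layout described above (type byte, then a big-endian length or count); it does not decode the array elements themselves, and the function names amf0_read_string and amf0_read_array_count are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Reads an AMF string package: type 0x02, 2-byte big-endian length, then the bytes.
 * Copies the string into out (NUL-terminated).
 * Returns the number of bytes consumed, or 0 on failure. */
size_t amf0_read_string(const unsigned char *p, size_t len, char *out, size_t out_cap)
{
    size_t n;
    if (len < 3 || p[0] != 0x02)
        return 0;
    n = ((size_t)p[1] << 8) | p[2];
    if (len < 3 + n || out_cap < n + 1)
        return 0;
    memcpy(out, p + 3, n);
    out[n] = '\0';
    return 3 + n;
}

/* Reads an AMF array header: type 0x08, 4-byte big-endian element count.
 * Returns 0 on success, -1 on failure. */
int amf0_read_array_count(const unsigned char *p, size_t len, unsigned long *count)
{
    if (len < 5 || p[0] != 0x08)
        return -1;
    *count = ((unsigned long)p[1] << 24) | ((unsigned long)p[2] << 16)
           | ((unsigned long)p[3] << 8)  |  (unsigned long)p[4];
    return 0;
}
```

Calling amf0_read_string first and then amf0_read_array_count at the returned offset yields the package name and the number of metadata entries that follow.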
Parsing FLV
The code below is from 雷神 (Leishen).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//Important!
#pragma pack(1)
#define TAG_TYPE_SCRIPT 18
#define TAG_TYPE_AUDIO 8
#define TAG_TYPE_VIDEO 9
typedef unsigned char byte;
typedef unsigned int uint;
typedef struct {
byte Signature[3];
byte Version;
byte Flags;
uint DataOffset;
} FLV_HEADER;
typedef struct {
byte TagType;
byte DataSize[3];
byte Timestamp[3];
uint Reserved;
} TAG_HEADER;
//reverse_bytes - turn a big-endian byte array into a native integer
uint reverse_bytes(byte *p, char c) {
    uint r = 0;
    int i;
    for (i=0; i<c; i++)
        r |= ((uint)p[i] << (((c-1)*8)-8*i)); //cast first so the shift is done on an unsigned value
    return r;
}
/**
* Analysis FLV file
* @param url Location of input FLV file.
*/
int simplest_flv_parser(char *url){
//whether output audio/video stream
int output_a=1;
int output_v=1;
//-------------
FILE *ifh=NULL,*vfh=NULL, *afh = NULL;
//FILE *myout=fopen("output_log.txt","wb+");
FILE *myout=stdout;
FLV_HEADER flv;
TAG_HEADER tagheader;
uint previoustagsize, previoustagsize_z=0;
uint ts=0, ts_new=0;
ifh = fopen(url, "rb"); //read-only access is enough for parsing
if ( ifh== NULL) {
printf("Failed to open files!\n");
return -1;
}
//FLV file header
fread((char *)&flv,1,sizeof(FLV_HEADER),ifh);
fprintf(myout,"============== FLV Header ==============\n");
fprintf(myout,"Signature: 0x %c %c %c\n",flv.Signature[0],flv.Signature[1],flv.Signature[2]);
fprintf(myout,"Version: 0x %X\n",flv.Version);
fprintf(myout,"Flags : 0x %X\n",flv.Flags);
fprintf(myout,"HeaderSize: 0x %X\n",reverse_bytes((byte *)&flv.DataOffset, sizeof(flv.DataOffset)));
fprintf(myout,"========================================\n");
//move the file pointer to the end of the header
fseek(ifh, reverse_bytes((byte *)&flv.DataOffset, sizeof(flv.DataOffset)), SEEK_SET);
//process each tag
do {
previoustagsize = _getw(ifh); //_getw is MSVC-specific; it reads one int in native byte order
fread((void *)&tagheader,sizeof(TAG_HEADER),1,ifh);
//int temp_datasize1=reverse_bytes((byte *)&tagheader.DataSize, sizeof(tagheader.DataSize));
int tagheader_datasize=tagheader.DataSize[0]*65536+tagheader.DataSize[1]*256+tagheader.DataSize[2];
int tagheader_timestamp=tagheader.Timestamp[0]*65536+tagheader.Timestamp[1]*256+tagheader.Timestamp[2];
char tagtype_str[10];
switch(tagheader.TagType){
case TAG_TYPE_AUDIO:sprintf(tagtype_str,"AUDIO");break;
case TAG_TYPE_VIDEO:sprintf(tagtype_str,"VIDEO");break;
case TAG_TYPE_SCRIPT:sprintf(tagtype_str,"SCRIPT");break;
default:sprintf(tagtype_str,"UNKNOWN");break;
}
//stop before printing if the tag header could not be read (end of file)
if (feof(ifh)) {
break;
}
fprintf(myout,"[%6s] %6d %6d |",tagtype_str,tagheader_datasize,tagheader_timestamp);
//process tag by type
switch (tagheader.TagType) {
case TAG_TYPE_AUDIO:{
char audiotag_str[100]={0};
strcat(audiotag_str,"| ");
char tagdata_first_byte;
tagdata_first_byte=fgetc(ifh);
int x=tagdata_first_byte&0xF0;
x=x>>4;
switch (x)
{
case 0:strcat(audiotag_str,"Linear PCM, platform endian");break;
case 1:strcat(audiotag_str,"ADPCM");break;
case 2:strcat(audiotag_str,"MP3");break;
case 3:strcat(audiotag_str,"Linear PCM, little endian");break;
case 4:strcat(audiotag_str,"Nellymoser 16-kHz mono");break;
case 5:strcat(audiotag_str,"Nellymoser 8-kHz mono");break;
case 6:strcat(audiotag_str,"Nellymoser");break;
case 7:strcat(audiotag_str,"G.711 A-law logarithmic PCM");break;
case 8:strcat(audiotag_str,"G.711 mu-law logarithmic PCM");break;
case 9:strcat(audiotag_str,"reserved");break;
case 10:strcat(audiotag_str,"AAC");break;
case 11:strcat(audiotag_str,"Speex");break;
case 14:strcat(audiotag_str,"MP3 8-Khz");break;
case 15:strcat(audiotag_str,"Device-specific sound");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
strcat(audiotag_str,"| ");
x=tagdata_first_byte&0x0C;
x=x>>2;
switch (x)
{
case 0:strcat(audiotag_str,"5.5-kHz");break;
case 1:strcat(audiotag_str,"11-kHz");break;
case 2:strcat(audiotag_str,"22-kHz");break;
case 3:strcat(audiotag_str,"44-kHz");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
strcat(audiotag_str,"| ");
x=tagdata_first_byte&0x02;
x=x>>1;
switch (x)
{
case 0:strcat(audiotag_str,"8Bit");break;
case 1:strcat(audiotag_str,"16Bit");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
strcat(audiotag_str,"| ");
x=tagdata_first_byte&0x01;
switch (x)
{
case 0:strcat(audiotag_str,"Mono");break;
case 1:strcat(audiotag_str,"Stereo");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
fprintf(myout,"%s",audiotag_str);
//if the output file hasn't been opened, open it.
if(output_a!=0&&afh == NULL){
afh = fopen("output.mp3", "wb");
}
//TagData - First Byte Data
int data_size=reverse_bytes((byte *)&tagheader.DataSize, sizeof(tagheader.DataSize))-1;
if(output_a!=0){
//TagData+1
for (int i=0; i<data_size; i++)
fputc(fgetc(ifh),afh);
}else{
for (int i=0; i<data_size; i++)
fgetc(ifh);
}
break;
}
case TAG_TYPE_VIDEO:{
char videotag_str[100]={0};
strcat(videotag_str,"| ");
char tagdata_first_byte;
tagdata_first_byte=fgetc(ifh);
int x=tagdata_first_byte&0xF0;
x=x>>4;
switch (x)
{
case 1:strcat(videotag_str,"key frame ");break;
case 2:strcat(videotag_str,"inter frame");break;
case 3:strcat(videotag_str,"disposable inter frame");break;
case 4:strcat(videotag_str,"generated keyframe");break;
case 5:strcat(videotag_str,"video info/command frame");break;
default:strcat(videotag_str,"UNKNOWN");break;
}
strcat(videotag_str,"| ");
x=tagdata_first_byte&0x0F;
switch (x)
{
case 1:strcat(videotag_str,"JPEG (currently unused)");break;
case 2:strcat(videotag_str,"Sorenson H.263");break;
case 3:strcat(videotag_str,"Screen video");break;
case 4:strcat(videotag_str,"On2 VP6");break;
case 5:strcat(videotag_str,"On2 VP6 with alpha channel");break;
case 6:strcat(videotag_str,"Screen video version 2");break;
case 7:strcat(videotag_str,"AVC");break;
default:strcat(videotag_str,"UNKNOWN");break;
}
fprintf(myout,"%s",videotag_str);
fseek(ifh, -1, SEEK_CUR);
//if the output file hasn't been opened, open it.
if (vfh == NULL&&output_v!=0) {
//write the flv header (reuse the original file's hdr) and first previoustagsize
vfh = fopen("output.flv", "wb");
fwrite((char *)&flv,1, sizeof(flv),vfh);
fwrite((char *)&previoustagsize_z,1,sizeof(previoustagsize_z),vfh);
}
#if 0
//Change Timestamp
//Get Timestamp
ts = reverse_bytes((byte *)&tagheader.Timestamp, sizeof(tagheader.Timestamp));
ts=ts*2;
//Writeback Timestamp
ts_new = reverse_bytes((byte *)&ts, sizeof(ts));
memcpy(&tagheader.Timestamp, ((char *)&ts_new) + 1, sizeof(tagheader.Timestamp));
#endif
//TagData + Previous Tag Size
int data_size=reverse_bytes((byte *)&tagheader.DataSize, sizeof(tagheader.DataSize))+4;
if(output_v!=0){
//TagHeader
fwrite((char *)&tagheader,1, sizeof(tagheader),vfh);
//TagData
for (int i=0; i<data_size; i++)
fputc(fgetc(ifh),vfh);
}else{
for (int i=0; i<data_size; i++)
fgetc(ifh);
}
//rewind 4 bytes, because we need to read the previoustagsize again for the loop's sake
fseek(ifh, -4, SEEK_CUR);
break;
}
default:
//skip the data of this tag
fseek(ifh, reverse_bytes((byte *)&tagheader.DataSize, sizeof(tagheader.DataSize)), SEEK_CUR);
}
fprintf(myout,"\n");
} while (!feof(ifh));
_fcloseall(); //MSVC-specific: closes all open streams
return 0;
}
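This program separates the video stream and the audio stream contained in an FLV file.
It first defines two headers, FLV_HEADER and TAG_HEADER. Then reverse_bytes() converts a byte array stored in big-endian order into a native integer (little-endian on x86):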
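uint reverse_bytes(byte *p, char c) {
    uint r = 0;
    int i;
    for (i=0; i<c; i++) //comparing i with the char c works because char is a small integer type that is promoted to int
        r |= ((uint)p[i] << (((c-1)*8)-8*i)); //cast first so the shift is done on an unsigned value
    return r;
}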
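It is called like this: reverse_bytes((byte *)&flv.DataOffset, sizeof(flv.DataOffset)). The loop walks the bytes in memory order, shifts each one to its position (the first byte is the most significant), and merges it into the result with |=. The |= operator here is a bitwise OR that accumulates the bytes, not a comparison. The result is the value as a native integer.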
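A worked example may help here. The sketch below is an equivalent way to write the same conversion, not the code from the listing: it shifts the accumulator left by 8 bits before adding each byte, and uses uint32_t so the arithmetic is unsigned throughout. The name be_to_u32 is made up for illustration, and n is assumed to be at most 4.

```c
#include <stdint.h>

/* Converts n big-endian bytes (n <= 4) into an unsigned 32-bit integer. */
uint32_t be_to_u32(const unsigned char *p, int n)
{
    uint32_t r = 0;
    int i;
    for (i = 0; i < n; i++)
        r = (r << 8) | p[i]; /* make room for the next byte, then append it */
    return r;
}

/* Step by step for the 3 bytes 0x01 0x02 0x03:
 *   r = 0x000000
 *   r = 0x000001   after byte 0
 *   r = 0x000102   after byte 1
 *   r = 0x010203   after byte 2 (66051 in decimal)
 */
```

For the DataOffset bytes 00 00 00 09 of a typical FLV header, the result is 9, the size of the file header.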
Next, simplest_flv_parser reads the file header and then the file body. The body is handled in one large do-while loop that runs until the end of the file. Each iteration calls _getw() (which reads one int from the stream) to read the Previous Tag Size, then fread((void *)&tagheader,sizeof(TAG_HEADER),1,ifh); reads the tag header.
The size of the tag data is then computed: int tagheader_datasize=tagheader.DataSize[0]*65536+tagheader.DataSize[1]*256+tagheader.DataSize[2];
and the timestamp: int tagheader_timestamp=tagheader.Timestamp[0]*65536+tagheader.Timestamp[1]*256+tagheader.Timestamp[2];
Then a switch on the tag type byte of the tag header that was just read decides how the tag is handled:
For audio:
case TAG_TYPE_AUDIO:{ //audio
char audiotag_str[100]={0};
strcat(audiotag_str,"| ");
char tagdata_first_byte;
//read one byte: the first byte of the audio tag data, which carries the audio parameters
tagdata_first_byte=fgetc(ifh);
//mask out the high 4 bits, which give the audio format
int x=tagdata_first_byte&0xF0;
//shift right by 4 bits
x=x>>4;
//decide the audio format
switch (x)
{
case 0:strcat(audiotag_str,"Linear PCM, platform endian");break;
case 1:strcat(audiotag_str,"ADPCM");break;
case 2:strcat(audiotag_str,"MP3");break;
case 3:strcat(audiotag_str,"Linear PCM, little endian");break;
case 4:strcat(audiotag_str,"Nellymoser 16-kHz mono");break;
case 5:strcat(audiotag_str,"Nellymoser 8-kHz mono");break;
case 6:strcat(audiotag_str,"Nellymoser");break;
case 7:strcat(audiotag_str,"G.711 A-law logarithmic PCM");break;
case 8:strcat(audiotag_str,"G.711 mu-law logarithmic PCM");break;
case 9:strcat(audiotag_str,"reserved");break;
case 10:strcat(audiotag_str,"AAC");break;
case 11:strcat(audiotag_str,"Speex");break;
case 14:strcat(audiotag_str,"MP3 8-Khz");break;
case 15:strcat(audiotag_str,"Device-specific sound");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
strcat(audiotag_str,"| ");
//bits 5 and 6: sample rate
x=tagdata_first_byte&0x0C;
//shift right by 2 bits
x=x>>2;
//decide the sample rate
switch (x)
{
case 0:strcat(audiotag_str,"5.5-kHz");break;
case 1:strcat(audiotag_str,"11-kHz");break;
case 2:strcat(audiotag_str,"22-kHz");break;
case 3:strcat(audiotag_str,"44-kHz");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
strcat(audiotag_str,"| ");
//bit 7: sample size
x=tagdata_first_byte&0x02;
x=x>>1;
switch (x)
{
case 0:strcat(audiotag_str,"8Bit");break;
case 1:strcat(audiotag_str,"16Bit");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
strcat(audiotag_str,"| ");
//bit 8: channel type (mono or stereo)
x=tagdata_first_byte&0x01;
switch (x)
{
case 0:strcat(audiotag_str,"Mono");break;
case 1:strcat(audiotag_str,"Stereo");break;
default:strcat(audiotag_str,"UNKNOWN");break;
}
fprintf(myout,"%s",audiotag_str);
//if the output file hasn't been opened, open it.
if(output_a!=0&&afh == NULL){
afh = fopen("output.mp3", "wb");
}
//TagData - First Byte Data
//number of bytes of tag data, minus the first byte of the tag data that was already read
int data_size=reverse_bytes((byte *)&tagheader.DataSize, sizeof(tagheader.DataSize))-1;
//copy the bytes to the output file one at a time
if(output_a!=0){
//TagData+1
for (int i=0; i<data_size; i++)
fputc(fgetc(ifh),afh);
}else{
for (int i=0; i<data_size; i++)
fgetc(ifh);
}
break;
}
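The code first reads the parameter byte at the start of the tag data and then writes the rest of the audio data to output.mp3. With the audio tag layout above in mind, this code should be easy to follow. Note that the output file is named output.mp3 whatever the codec is, so it is only a playable MP3 file when the audio stream really is MP3.
For video:
case TAG_TYPE_VIDEO:{ //video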
char videotag_str[100]={0};
strcat(videotag_str,"| ");
//read the first byte of the tag data and take its high 4 bits: the video frame type
char tagdata_first_byte;
tagdata_first_byte=fgetc(ifh);
int x=tagdata_first_byte&0xF0;
x=x>>4;
switch (x)
{
case 1:strcat(videotag_str,"key frame ");break;
case 2:strcat(videotag_str,"inter frame");break;
case 3:strcat(videotag_str,"disposable inter frame");break;
case 4:strcat(videotag_str,"generated keyframe");break;
case 5:strcat(videotag_str,"video info/command frame");break;
default:strcat(videotag_str,"UNKNOWN");break;
}
strcat(videotag_str,"| ");
//take the low 4 bits of the first byte of the tag data: the video codec
x=tagdata_first_byte&0x0F;
switch (x)
{
case 1:strcat(videotag_str,"JPEG (currently unused)");break;
case 2:strcat(videotag_str,"Sorenson H.263");break;
case 3:strcat(videotag_str,"Screen video");break;
case 4:strcat(videotag_str,"On2 VP6");break;
case 5:strcat(videotag_str,"On2 VP6 with alpha channel");break;
case 6:strcat(videotag_str,"Screen video version 2");break;
case 7:strcat(videotag_str,"AVC");break;
default:strcat(videotag_str,"UNKNOWN");break;
}
fprintf(myout,"%s",videotag_str);
fseek(ifh, -1, SEEK_CUR);
//if the output file hasn't been opened, open it.
if (vfh == NULL&&output_v!=0) {
//write the flv header (reuse the original file's hdr) and first previoustagsize
vfh = fopen("output.flv", "wb");
fwrite((char *)&flv,1, sizeof(flv),vfh);
fwrite((char *)&previoustagsize_z,1,sizeof(previoustagsize_z),vfh);
}
#if 0
//Change Timestamp
//Get Timestamp
ts = reverse_bytes((byte *)&tagheader.Timestamp, sizeof(tagheader.Timestamp));
ts=ts*2;
//Writeback Timestamp
ts_new = reverse_bytes((byte *)&ts, sizeof(ts));
memcpy(&tagheader.Timestamp, ((char *)&ts_new) + 1, sizeof(tagheader.Timestamp));
#endif
//TagData + Previous Tag Size
int data_size=reverse_bytes((byte *)&tagheader.DataSize, sizeof(tagheader.DataSize))+4;
if(output_v!=0){
//TagHeader
fwrite((char *)&tagheader,1, sizeof(tagheader),vfh);
//TagData
for (int i=0; i<data_size; i++)
fputc(fgetc(ifh),vfh);
}else{
for (int i=0; i<data_size; i++)
fgetc(ifh);
}
//rewind 4 bytes, because we need to read the previoustagsize again for the loop's sake
fseek(ifh, -4, SEEK_CUR);
break;
}
The result is as follows: