Basic principles
Video is captured through the phone's camera (Camera) and audio through AudioRecord; the video is encoded with x264 and the audio with faac, and the result is packaged and pushed with RTMPDump.
A brief look at video formats
The Android camera delivers frames in nv21, while the format we feed the encoder is i420, so a conversion is required. Both belong to the yuv420 family; they differ only in how the samples are stored.
Using a 4×4 image as an example, here is the difference between the i420, yv12, nv21 and nv12 layouts:
i420 | | | |
---|---|---|---|
y1 | y2 | y3 | y4 |
y5 | y6 | y7 | y8 |
y9 | y10 | y11 | y12 |
y13 | y14 | y15 | y16 |
u1 | u2 | u3 | u4 |
v1 | v2 | v3 | v4 |
yv12 | | | |
---|---|---|---|
y1 | y2 | y3 | y4 |
y5 | y6 | y7 | y8 |
y9 | y10 | y11 | y12 |
y13 | y14 | y15 | y16 |
v1 | v2 | v3 | v4 |
u1 | u2 | u3 | u4 |
nv21 | | | |
---|---|---|---|
y1 | y2 | y3 | y4 |
y5 | y6 | y7 | y8 |
y9 | y10 | y11 | y12 |
y13 | y14 | y15 | y16 |
v1 | u1 | v2 | u2 |
v3 | u3 | v4 | u4 |
nv12 | | | |
---|---|---|---|
y1 | y2 | y3 | y4 |
y5 | y6 | y7 | y8 |
y9 | y10 | y11 | y12 |
y13 | y14 | y15 | y16 |
u1 | v1 | u2 | v2 |
u3 | v3 | u4 | v4 |
All four store the y plane identically; i420 and yv12 keep the u and v planes separate, while nv21 and nv12 interleave the v and u samples.
Within each pair (i420 vs. yv12, nv21 vs. nv12), the only difference is that the positions of the v and u components are swapped.
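To make the layout differences concrete, here is a small illustrative snippet (not from the original post) that prints the byte offset at which each plane starts for a w × h frame:

#include <cstdio>

int main() {
    const int w = 4, h = 4, y_size = w * h;  // 16 y samples
    const int quarter = y_size / 4;          // 4 u samples and 4 v samples
    printf("i420: y@0, u@%d, v@%d\n", y_size, y_size + quarter);
    printf("yv12: y@0, v@%d, u@%d\n", y_size, y_size + quarter);
    printf("nv21: y@0, vu interleaved @%d\n", y_size);
    printf("nv12: y@0, uv interleaved @%d\n", y_size);
    return 0;
}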
Pitfalls of camera capture
Anyone who has done camera development knows that video from an Android camera comes out sideways; you have to set the camera's display orientation with setDisplayOrientation(90) to "straighten" it.
Here is the sample code given in the Camera class documentation:
android.hardware.Camera.CameraInfo info =
new android.hardware.Camera.CameraInfo();
Camera.getCameraInfo(videoParam.getCameraId(), info);
int rotation = activity.getWindowManager().getDefaultDisplay().getRotation();
int degrees = 0;
switch (rotation) {
case Surface.ROTATION_0:
degrees = 0;
pusherNative.setVideoOptions(videoParam.getHeight(), videoParam.getWidth(), videoParam.getBitrate(), videoParam.getFps());
break;
case Surface.ROTATION_90:
degrees = 90;
pusherNative.setVideoOptions(videoParam.getWidth(), videoParam.getHeight(), videoParam.getBitrate(), videoParam.getFps());
break;
case Surface.ROTATION_180:
degrees = 180;
break;
case Surface.ROTATION_270:
degrees = 270;
break;
}
int result;
if (info.facing == Camera.CameraInfo.CAMERA_FACING_FRONT) {
result = (info.orientation + degrees) % 360;
result = (360 - result) % 360; // compensate the mirror
} else { // back-facing
result = (info.orientation - degrees + 360) % 360;
}
camera.setDisplayOrientation(result);
The documentation for setDisplayOrientation notes:
This does not affect the order of byte array passed in [onPreviewFrame(byte[], Camera)](https://developer.android.com/reference/android/hardware/Camera.PreviewCallback.html#onPreviewFrame(byte[], android.hardware.Camera)), JPEG pictures, or recorded video
Although the "corrected" camera now displays properly, the frames delivered through camera.setPreviewCallbackWithBuffer(this) during preview are still sideways, so we need to correct the data once more inside the callback.
The documentation for cameraInfo.orientation says:
The orientation of the camera image. The value is the angle that the camera image needs to be rotated clockwise so it shows correctly on the display in its natural orientation. It should be 0, 90, 180, or 270.
For example, suppose a device has a naturally tall screen. The back-facing camera sensor is mounted in landscape. You are looking at the screen. If the top side of the camera sensor is aligned with the right edge of the screen in natural orientation, the value should be 90. If the top side of a front-facing camera sensor is aligned with the right of the screen, the value should be 270.
Roughly: with the device held upright (portrait), a picture from the back camera must be rotated 90° clockwise, and one from the front camera 270° clockwise, to look correct.
That is, when the device orientation is Surface.ROTATION_0 (portrait), the nv21 data must be rotated 270° (90° counter-clockwise) for the front camera and 90° clockwise for the back camera; at Surface.ROTATION_90 the data needs no processing; and at Surface.ROTATION_270 the data must be rotated 180°.
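The rules above can be condensed into a small helper (the function name is hypothetical, and Surface.ROTATION_180 is not covered by the post):

// clockwise rotation the nv21 buffer needs before encoding
int nv21_rotation_needed(int device_rotation_degrees, bool front_facing) {
    switch (device_rotation_degrees) {
        case 0:   return front_facing ? 270 : 90; // portrait
        case 90:  return 0;                       // data already upright
        case 270: return 180;
        default:  return 0;                       // ROTATION_180: not covered here
    }
}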
Using the 4×4 image above, here is an nv21 frame before and after a 90° clockwise rotation:
before | | | |
---|---|---|---|
y1 | y2 | y3 | y4 |
y5 | y6 | y7 | y8 |
y9 | y10 | y11 | y12 |
y13 | y14 | y15 | y16 |
v1 | u1 | v2 | u2 |
v3 | u3 | v4 | u4 |
after | | | |
---|---|---|---|
y13 | y9 | y5 | y1 |
y14 | y10 | y6 | y2 |
y15 | y11 | y7 | y3 |
y16 | y12 | y8 | y4 |
v3 | u3 | v1 | u1 |
v4 | u4 | v2 | u2 |
In nv21 each pixel occupies 12 bits, so one frame takes w * h * 1.5 bytes of memory:
// start the preview
private void startPreview() {
try {
camera = Camera.open(videoParam.getCameraId());
Camera.Parameters parameter = camera.getParameters();
parameter.setPreviewFormat(ImageFormat.NV21);
setPreviewSize(parameter);
camera.setParameters(parameter);
setDisplayOrientation();
int bitsPerPixel = ImageFormat.getBitsPerPixel(ImageFormat.NV21);
int bufferSize = videoParam.getWidth() * videoParam.getHeight() * bitsPerPixel / 8;
// preview buffer
buffer = new byte[bufferSize];
// buffer for the orientation-corrected frame
rawBuffer = new byte[bufferSize];
camera.addCallbackBuffer(buffer);
camera.setPreviewCallbackWithBuffer(this);
camera.setPreviewDisplay(holder);
camera.startPreview();
} catch (Exception e) {
e.printStackTrace();
}
}
When preview data arrives, call the video push method:
@Override
public void onPreviewFrame(byte[] bytes, Camera camera) {
if (isPushing) {
rotateNV21Data2Raw();
pusherNative.pushVideo(rawBuffer);
}
camera.addCallbackBuffer(bytes);
}
Correct the video orientation so the client receives an upright picture; when rotating, treat each v/u pair as a single unit.
private void rotateNV21Data2Raw() {
int w = videoParam.getWidth(), h = videoParam.getHeight(), y_size = w * h, k = 0;
// orientation = getWindowManager().getDefaultDisplay().getRotation()
switch (orientation) {
case 0: {
if (videoParam.getCameraId() == Camera.CameraInfo.CAMERA_FACING_BACK) {
// rotate the data 90° clockwise
// y
for (int i = 0; i < w; i++) {
for (int j = h - 1; j >= 0; j--) {
rawBuffer[k++] = buffer[j * w + i];
}
}
// u/v
for (int i = 0; i < w; i += 2) {
for (int j = h / 2 - 1; j >= 0; j--) {
// v
rawBuffer[k++] = buffer[y_size + w * j + i];
// u
rawBuffer[k++] = buffer[y_size + w * j + i + 1];
}
}
} else {
// rotate the data 90° counter-clockwise
// y
for (int i = w - 1; i >= 0; i--) {
for (int j = 0; j < h; j++) {
rawBuffer[k++] = buffer[j * w + i];
}
}
// u/v
for (int i = w - 2; i >= 0; i -= 2) {
for (int j = 0; j < h / 2; j++) {
// v
rawBuffer[k++] = buffer[y_size + w * j + i];
// u
rawBuffer[k++] = buffer[y_size + w * j + i + 1];
}
}
}
}
break;
case 90: {
// device rotated 90° counter-clockwise: the data needs no change
rawBuffer = buffer;
}
break;
case 270: {
// device rotated 90° clockwise: rotate the data 180°
// y
for (int i = y_size - 1; i >= 0; i--) {
rawBuffer[k++] = buffer[i];
}
// u/v
for (int i = y_size * 3 / 2 - 2; i >= y_size; i -= 2) {
// v
rawBuffer[k++] = buffer[i];
// u
rawBuffer[k++] = buffer[i + 1];
}
}
break;
}
}
Is the "corrected" data finally ready to use? No, no, no: it still has to be converted to i420 for x264 to compress. Let's hand that part to the C++ layer.
// yuv data
x264_picture_t *x264_pic_in;
// y length = w * h
int y_len;
// length of each chroma plane = y_len / 4
int u_v_len;
// rawBuffer data passed in from the Java layer
jbyte *data = env->GetByteArrayElements(array, NULL);
memcpy(x264_pic_in->img.plane[0], data, y_len);
jbyte *u = (jbyte *) x264_pic_in->img.plane[1];
jbyte *v = (jbyte *) x264_pic_in->img.plane[2];
for (int i = 0; i < u_v_len; i++) {
*(u + i) = *(data + y_len + i * 2 + 1);
*(v + i) = *(data + y_len + i * 2);
}
// release the Java array after copying
env->ReleaseByteArrayElements(array, data, 0);
In the H264 structure, the encoded data for one picture is a frame; a frame consists of one or more slices; a slice consists of one or more macroblocks; and a macroblock is a 16×16 block of yuv samples. The macroblock is the basic unit of H264 encoding. For example, a 1280×720 frame contains (1280/16) × (720/16) = 80 × 45 = 3600 macroblocks.
On the network, H264 travels as NALUs: an encoded frame is split into one or more nals, one or more nals are packed into an RTMPPacket and pushed onto the send queue, and a worker polls the queue and lets rtmp upload the packets to the server.
Two kinds of packets are sent: the sps + pps header information, and frame data.
The video packet layout and the pps/sps packet layout were illustrated with figures in the original post (omitted here).
Sample code:
x264_nal_t *nal = NULL;
int n_nal = -1;
x264_picture_init(x264_pic_out);
if (x264_encoder_encode(x264, &nal, &n_nal, x264_pic_in, x264_pic_out) < 0) {
LOG_E("編碼失敗");
return;
}
x264_pic_in->i_pts++;
// the NAL type is the low 5 bits (mask 0x1f) of the first NALU byte
uint8_t sps[100];
memset(sps, 0, 100);
int sps_len;
uint8_t pps[100];
memset(pps, 0, 100);
int pps_len;
for (int i = 0; i < n_nal; i++) {
if (nal[i].i_type == NAL_SPS) {
// 67 & 1f = NAL_SPS
// 前面4個字節(jié)是分隔符 0000 0001
sps_len = nal[i].i_payload - 4;
memcpy(sps, nal[i].p_payload + 4, sps_len);
} else if (nal[i].i_type == NAL_PPS) {
// 68 & 1f = NAL_PPS
pps_len = nal[i].i_payload - 4;
memcpy(pps, nal[i].p_payload + 4, pps_len);
// send the sps + pps header information
add_264_sequence_header(pps, sps, pps_len, sps_len);
} else {
// ordinary frame data
add_264_body(nal[i].p_payload, nal[i].i_payload);
}
}
void add_264_sequence_header(uint8_t *pps, uint8_t *sps, int pps_len, int sps_len) {
int body_size = pps_len + sps_len + 16;
RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
RTMPPacket_Alloc(packet, body_size);
int i = 0;
unsigned char *body = (unsigned char *) packet->m_body;
// 0x17: frame type 1 (keyframe) + codec id 7 (AVC)
body[i++] = 0x17;
// AVCPacketType = 0 (sequence header)
body[i++] = 0x00;
// composition time = 0 (3 bytes)
body[i++] = 0x00;
body[i++] = 0x00;
body[i++] = 0x00;
// configurationVersion
body[i++] = 0x01;
// AVCProfileIndication
body[i++] = sps[1];
// profile_compatibility
body[i++] = sps[2];
// AVCLevelIndication
body[i++] = sps[3];
// 0xff: reserved bits + NALU length field size (4 bytes)
body[i++] = 0xff;
// 0xe1: reserved bits + number of sps (1)
body[i++] = 0xe1;
// sps length
body[i++] = (sps_len >> 8) & 0xff;
body[i++] = sps_len & 0xff;
// sps data
memcpy(&body[i], sps, sps_len);
i += sps_len;
// number of pps (1)
body[i++] = 0x01;
// pps length
body[i++] = (pps_len >> 8) & 0xff;
body[i++] = pps_len & 0xff;
// pps data
memcpy(&body[i], pps, pps_len);
packet->m_packetType = RTMP_PACKET_TYPE_VIDEO;
packet->m_nBodySize = body_size;
packet->m_nTimeStamp = 0;
packet->m_hasAbsTimestamp = 0;
packet->m_nChannel = 0x04;
packet->m_headerType = RTMP_PACKET_SIZE_MEDIUM;
// put the RTMP packet on the push queue
put(packet);
}
// there are two start-code variants: 00 00 00 01 and 00 00 01
void add_264_body(uint8_t *buf, int buf_len) {
// skip the start code by advancing the pointer
if (buf[2] == 0x00) {
buf += 4;
buf_len -= 4;
} else if (buf[2] == 0x01) {
buf += 3;
buf_len -= 3;
}
int body_size = buf_len + 9;
RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
RTMPPacket_Alloc(packet, body_size);
char *body = packet->m_body;
// keyframe check: e.g. 0x65 & 0x1f == 5 (NAL_SLICE_IDR)
int type = buf[0] & 0x1f;
// 0x27: frame type 2 (inter frame) + codec id 7 (AVC)
body[0] = 0x27;
if (type == NAL_SLICE_IDR) {
// 0x17: frame type 1 (keyframe) + codec id 7 (AVC)
body[0] = 0x17;
}
int i = 1;
// AVCPacketType = 1 (NALU)
body[i++] = 0x01;
// composition time = 0 (3 bytes)
body[i++] = 0x00;
body[i++] = 0x00;
body[i++] = 0x00;
// write the body length (4-byte big-endian NALU length)
body[i++] = (buf_len >> 24) & 0xff;
body[i++] = (buf_len >> 16) & 0xff;
body[i++] = (buf_len >> 8) & 0xff;
body[i++] = buf_len & 0xff;
// write the body content (the NALU payload)
memcpy(&body[i], buf, buf_len);
packet->m_packetType = RTMP_PACKET_TYPE_VIDEO;
packet->m_nBodySize = body_size;
packet->m_nTimeStamp = RTMP_GetTime() - start_time;
packet->m_hasAbsTimestamp = 0;
packet->m_nChannel = 0x04;
packet->m_headerType = RTMP_PACKET_SIZE_LARGE;
// put on the push queue
put(packet);
}
void put(RTMPPacket *packet) {
pthread_mutex_lock(&p_mutex);
if (publishing) {
queue.push(packet);
}
pthread_cond_signal(&p_cond);
pthread_mutex_unlock(&p_mutex);
}
RTMPPacket *get() {
pthread_mutex_lock(&p_mutex);
// loop, not if: pthread_cond_wait can wake spuriously
while (queue.empty()) {
pthread_cond_wait(&p_cond, &p_mutex);
}
RTMPPacket *packet = queue.front();
queue.pop();
pthread_mutex_unlock(&p_mutex);
return packet;
}
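One caveat the post does not show: a consumer blocked in get() sleeps forever once streaming stops. A minimal sketch of a stop routine (stop_publish is a hypothetical name, not from the demo) that wakes it by signalling the condition variable; get() would then also need to re-check publishing after waking:

void stop_publish() {
    pthread_mutex_lock(&p_mutex);
    publishing = 0;
    pthread_cond_signal(&p_cond); // wake a thread blocked in get()
    pthread_mutex_unlock(&p_mutex);
}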
Configuring the audio options
uint channels = _channels;
u_long sampleRate = _sampleRate;
audioHandle = faacEncOpen(sampleRate, channels, &input_samples, &max_output_bytes);
if (!audioHandle) {
// failed to open the encoder
LOG_I("failed to open the audio encoder")
}
faacEncConfigurationPtr configurationPtr = faacEncGetCurrentConfiguration(audioHandle);
configurationPtr->mpegVersion = MPEG4;
configurationPtr->allowMidside = 1;
configurationPtr->aacObjectType = LOW;
// outputFormat 0 = raw AAC stream, without the ADTS header
configurationPtr->outputFormat = 0;
// TNS (temporal noise shaping) suppresses popping and brief transient noise
configurationPtr->useTns = 1;
configurationPtr->useLfe = 0;
configurationPtr->shortctl = SHORTCTL_NORMAL;
configurationPtr->inputFormat = FAAC_INPUT_16BIT;
configurationPtr->quantqual = 100;
configurationPtr->bandWidth = 0; // bandwidth (0 = encoder default)
// apply the configuration above
if (!faacEncSetConfiguration(audioHandle, configurationPtr)) {
LOG_E("faacEncSetConfiguration failed");
}
LOG_I("audio configured successfully");
The ADTS header is 7 bytes and carries the channel count, sample rate, frame length and so on. An RTMP server does not need this information, so it is left out when packaging.
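For reference, a minimal sketch of the 7-byte ADTS header that would otherwise be prepended (the profile, sample-rate index and channel values are assumptions for AAC-LC at 44.1 kHz stereo, and the function name is illustrative, not from the demo):

#include <cstdint>

// builds an ADTS header with protection_absent = 1 (no CRC);
// frame_len must include the 7 header bytes
void make_adts_header(uint8_t header[7], int frame_len) {
    const int profile = 1;            // AAC-LC (audio object type - 1)
    const int sample_rate_index = 4;  // 44100 Hz
    const int channels = 2;           // stereo
    header[0] = 0xFF;                 // syncword, high 8 bits
    header[1] = 0xF1;                 // syncword low 4 bits, MPEG-4, layer 00, no CRC
    header[2] = (uint8_t) ((profile << 6) | (sample_rate_index << 2) | (channels >> 2));
    header[3] = (uint8_t) (((channels & 3) << 6) | ((frame_len >> 11) & 0x03));
    header[4] = (uint8_t) ((frame_len >> 3) & 0xFF);
    header[5] = (uint8_t) (((frame_len & 0x07) << 5) | 0x1F); // buffer fullness, high bits
    header[6] = 0xFC;                 // buffer fullness low bits + 1 raw data block
}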
Recording and pushing the audio:
// Java side: record audio with AudioRecord
class AudioRecordRunnable implements Runnable {
@Override
public void run() {
audioRecord.startRecording();
while (isPushing) {
byte[] buffer = new byte[inputSamples * 2];
int read = audioRecord.read(buffer, 0, buffer.length);
if (read > 0) {
pusherNative.pushAudio(buffer, read);
}
}
audioRecord.stop();
}
}
// pushNative
jbyte *buf = env->GetByteArrayElements(buf_, NULL);
uint8_t *out_buf = (uint8_t *) malloc(max_output_bytes);
// buf_size is the number of recorded bytes; sample count = buf_size / 2, since each 16-bit sample occupies 2 bytes
int out_buf_size = faacEncEncode(audioHandle, (int32_t *) buf, buf_size / 2, out_buf,
max_output_bytes);
if (out_buf_size > 0) {
add_aac_body(out_buf, out_buf_size);
}
env->ReleaseByteArrayElements(buf_, buf, 0);
free(out_buf);
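The add_aac_body function called above is not shown in the post. A minimal sketch, assuming it mirrors add_264_body: an FLV AAC body is 0xAF 0x01 followed by the raw AAC frame, and the timestamp is assumed to reuse the video path's start_time clock:

void add_aac_body(uint8_t *buf, int buf_len) {
    int body_size = buf_len + 2;
    RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
    RTMPPacket_Alloc(packet, body_size);
    unsigned char *body = (unsigned char *) packet->m_body;
    body[0] = 0xaf;  // AAC, 44 kHz, 16-bit, stereo
    body[1] = 0x01;  // AACPacketType = 1 (raw AAC frame)
    memcpy(&body[2], buf, buf_len);
    packet->m_packetType = RTMP_PACKET_TYPE_AUDIO;
    packet->m_nBodySize = body_size;
    packet->m_nTimeStamp = RTMP_GetTime() - start_time;
    packet->m_hasAbsTimestamp = 0;
    packet->m_nChannel = 0x04;
    packet->m_headerType = RTMP_PACKET_SIZE_LARGE;
    put(packet);
}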
After connecting to the RTMP server, the audio sequence header carrying the sample-rate information must be sent first:
void add_aac_sequence_header() {
if (!audioHandle) {
return;
}
unsigned char *buf;
u_long len; /* length of buf, typically 2 */
faacEncGetDecoderSpecificInfo(audioHandle, &buf, &len);
RTMPPacket *packet = (RTMPPacket *) malloc(sizeof(RTMPPacket));
RTMPPacket_Alloc(packet, len + 2);
RTMPPacket_Reset(packet);
unsigned char *body = (unsigned char *) packet->m_body;
/* AF 00 + AAC raw data */
body[0] = 0xaf;
body[1] = 0x00;
memcpy(&body[2], buf, len); /* buf holds the AAC sequence header (AudioSpecificConfig) */
packet->m_packetType = RTMP_PACKET_TYPE_AUDIO;
packet->m_nBodySize = len + 2;
packet->m_nChannel = 0x04;
packet->m_hasAbsTimestamp = 0;
packet->m_nTimeStamp = 0;
packet->m_headerType = RTMP_PACKET_SIZE_MEDIUM;
put(packet);
free(buf);
LOG_I("放入音頻編碼信息")
}
Poll the push queue and send the RTMPPacket data:
RTMP *rtmp = RTMP_Alloc();
RTMP_Init(rtmp);
rtmp->Link.timeout = 5;
RTMP_SetupURL(rtmp, path);
RTMP_EnableWrite(rtmp);
if (!RTMP_Connect(rtmp, NULL)) {
LOG_E("連接失敗");
goto end;
}
// connect the stream
RTMP_ConnectStream(rtmp, 0);
// send the audio sequence header (sample-rate info) first
add_aac_sequence_header();
while (publishing) {
RTMPPacket *packet = get();
// push the packet
packet->m_nInfoField2 = rtmp->m_stream_id;
// queue = 1: use librtmp's upload queue
bool rs = RTMP_SendPacket(rtmp, packet, 1);
LOG_I("send packet -> %d", rs);
RTMPPacket_Free(packet);
free(packet);
}
end:
publishing = 0;
free(path);
RTMP_Close(rtmp);
RTMP_Free(rtmp);
pthread_exit(NULL);
Some phones were found to crash; if that happens, try turning off hardware acceleration.
Push-streaming demo: https://gitee.com/chuanzhi/H264Pusher.git