Besides rendering data captured from the camera, we can also render video files with Metal. The difference is that a video file is encoded and stored in the YUV color space, so in addition to decoding it, we also need a matrix to convert YUV into the RGB color space.
Basic Idea
Use AVFoundation's AVAssetReader to decode the video file and obtain CMSampleBufferRef samples, then use CoreVideo to convert them into (YUV) MTLTexture objects, and finally pass the MTLTextures together with the YUV-to-RGB conversion matrix into Metal to complete the rendering.
Video Decoding
As Apple's official documentation explains, AVAssetReader is a utility class for obtaining media data:
AVAssetReader lets you:
- Read raw un-decoded media samples directly from storage, obtain samples decoded into renderable forms.
- Mix multiple audio tracks of the asset and compose multiple video tracks by using AVAssetReaderAudioMixOutput and AVAssetReaderVideoCompositionOutput.
The AVAssetReader pipelines are multithreaded internally. After you initiate reading with initWithAsset:error:, a reader loads and processes a reasonable amount of sample data ahead of use so that retrieval operations such as copyNextSampleBuffer (AVAssetReaderOutput) can have very low latency. AVAssetReader is not intended for use with real-time sources, and its performance is not guaranteed for real-time operations.
Since this article does not explore audio, we only read the video track data.
#import <AVFoundation/AVFoundation.h>

@implementation LJAssetReader {
    AVAssetReaderTrackOutput *readerVideoTrackOutput;
    AVAssetReader *assetReader;
    NSURL *videoUrl;
    NSLock *lock;
}

- (instancetype)initWithUrl:(NSURL *)url {
    if (self = [super init]) {
        videoUrl = url;
        lock = [[NSLock alloc] init];
        [self setupAsset];
    }
    return self;
}

- (void)setupAsset {
    NSDictionary *inputOption = @{AVURLAssetPreferPreciseDurationAndTimingKey: @(YES)};
    AVURLAsset *inputAsset = [[AVURLAsset alloc] initWithURL:videoUrl options:inputOption];
    __weak typeof(self) weakSelf = self;
    NSString *tracks = @"tracks";
    // Load the "tracks" key asynchronously, then build the reader on a background queue.
    [inputAsset loadValuesAsynchronouslyForKeys:@[tracks] completionHandler:^{
        __strong typeof(self) strongSelf = weakSelf;
        if (!strongSelf) return;
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            NSError *error = nil;
            AVKeyValueStatus trackStatus = [inputAsset statusOfValueForKey:tracks error:&error];
            if (trackStatus != AVKeyValueStatusLoaded) {
                NSLog(@"error:%@", error);
                return;
            }
            [strongSelf processWithAsset:inputAsset];
        });
    }];
}

- (void)processWithAsset:(AVAsset *)asset {
    [lock lock];
    NSLog(@"processWithAsset");
    NSError *error = nil;
    assetReader = [AVAssetReader assetReaderWithAsset:asset error:&error];
    // Request 4:2:0 bi-planar full-range YUV; this must match how the shader samples the textures.
    NSMutableDictionary *outputSettings = [NSMutableDictionary dictionary];
    [outputSettings setObject:@(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange) forKey:(id)kCVPixelBufferPixelFormatTypeKey];
    readerVideoTrackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:[[asset tracksWithMediaType:AVMediaTypeVideo] firstObject] outputSettings:outputSettings];
    readerVideoTrackOutput.alwaysCopiesSampleData = NO;
    [assetReader addOutput:readerVideoTrackOutput];
    if ([assetReader startReading] == NO) {
        NSLog(@"error reading");
    }
    [lock unlock];
}

- (CMSampleBufferRef)readBuffer {
    [lock lock];
    CMSampleBufferRef sampleBuffer = nil;
    if (readerVideoTrackOutput) {
        // The caller owns the returned buffer and must CFRelease it.
        sampleBuffer = [readerVideoTrackOutput copyNextSampleBuffer];
    }
    // Once the file has been read to the end, tear down and start over to loop the video.
    if (assetReader && assetReader.status == AVAssetReaderStatusCompleted) {
        NSLog(@"reading completed, restarting");
        readerVideoTrackOutput = nil;
        assetReader = nil;
        [self setupAsset];
    }
    [lock unlock];
    return sampleBuffer;
}
@end
The steps for using AVAssetReader: create an AVURLAsset as the input source from which AVAssetReader reads the raw video data, then attach an AVAssetReaderTrackOutput as the reader's output port and call copyNextSampleBuffer on it to obtain CMSampleBufferRef samples.
Note the output settings of AVAssetReaderTrackOutput: setting the format to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange means the output is 4:2:0 chroma-subsampled YUV laid out in two planes, one for the Y channel and one for the interleaved UV channels, with full-range color values. This setting is crucial, because it determines how Metal fetches and interprets the texels.
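If you want to confirm what the reader actually delivers, a quick debugging sketch (not part of the original code) on a returned sample buffer looks like this:

// Debugging sketch: confirm the pixel format and plane layout of a decoded frame.
CMSampleBufferRef sampleBuffer = [reader readBuffer];
if (sampleBuffer) {
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    OSType format = CVPixelBufferGetPixelFormatType(pixelBuffer);
    NSLog(@"full-range bi-planar: %d, planes: %zu",
          format == kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, // expect 1
          CVPixelBufferGetPlaneCount(pixelBuffer));                 // expect 2
    CFRelease(sampleBuffer);
}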
Metal Setup
The Metal setup itself has been covered before, so let's go straight to the code.
- (void)setupMetal {
    _mtkView = [[MTKView alloc] initWithFrame:self.view.bounds device:MTLCreateSystemDefaultDevice()];
    if (!_mtkView.device) {
        NSLog(@"no device");
        return;
    }
    [self.view addSubview:_mtkView];
    _mtkView.delegate = self;
    self.viewportSize = (vector_uint2){self.mtkView.drawableSize.width, self.mtkView.drawableSize.height};
}

- (void)setupPipeline {
    id<MTLLibrary> defaultLibrary = [self.mtkView.device newDefaultLibrary];
    id<MTLFunction> vertexFunction = [defaultLibrary newFunctionWithName:@"vertexShader"];
    id<MTLFunction> fragmentFunction = [defaultLibrary newFunctionWithName:@"fragmentShader"];
    MTLRenderPipelineDescriptor *pipelineDesc = [[MTLRenderPipelineDescriptor alloc] init];
    pipelineDesc.label = @"my pipeline desc";
    pipelineDesc.vertexFunction = vertexFunction;
    pipelineDesc.fragmentFunction = fragmentFunction;
    pipelineDesc.colorAttachments[0].pixelFormat = self.mtkView.colorPixelFormat;
    NSError *error = nil;
    _pipeline = [self.mtkView.device newRenderPipelineStateWithDescriptor:pipelineDesc error:&error];
    if (!_pipeline) {
        NSLog(@"pipeline create error: %@", error.localizedDescription);
        return;
    }
    _commandQueue = [self.mtkView.device newCommandQueue];
}
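One thing the snippets above don't show: the self.textureCache used in the texture-mapping code later is never created. A plausible place to set it up is right after the Metal setup (a sketch; _textureCache is assumed to be a CVMetalTextureCacheRef ivar backing the property):

// Sketch: create the CVMetalTextureCache that the rendering code reads from later.
// _textureCache is assumed to be a CVMetalTextureCacheRef ivar/property.
CVReturn status = CVMetalTextureCacheCreate(kCFAllocatorDefault, NULL, self.mtkView.device, NULL, &_textureCache);
if (status != kCVReturnSuccess) {
    NSLog(@"texture cache create error: %d", status);
}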
The Metal fragment shader function takes two textures (the Y-channel texture and the UV-channel texture) plus a conversion matrix. The code follows:
#include <metal_stdlib>
#import "LJShaderTypes.h"

using namespace metal;

typedef struct
{
    float4 clipSpacePosition [[position]];
    float2 textureCoord;
} RasterizerData;

vertex RasterizerData
vertexShader(uint vertexID [[vertex_id]],
             constant LJVertex *vertexArray [[buffer(LJVertexInputIndexVertices)]])
{
    RasterizerData out;
    out.clipSpacePosition = vertexArray[vertexID].position;
    out.textureCoord = vertexArray[vertexID].textureCoord;
    return out;
}

fragment float4 fragmentShader(RasterizerData input [[stage_in]],
                               texture2d<float> textureY [[texture(LJFragmentTextureIndexTextureY)]],
                               texture2d<float> textureUV [[texture(LJFragmentTextureIndexTextureUV)]],
                               constant LJConvertMatrix *convertMatrix [[buffer(LJFragmentBufferIndexMatrix)]])
{
    constexpr sampler textureSampler(mag_filter::linear, min_filter::linear);
    // Y comes from the single-channel plane, UV from the two-channel plane.
    float3 yuv = float3(textureY.sample(textureSampler, input.textureCoord).r,
                        textureUV.sample(textureSampler, input.textureCoord).rg);
    float3 rgb = convertMatrix->matrix * (yuv + convertMatrix->offset);
    return float4(rgb, 1.0);
}
Here is the header shared between the Metal shaders and the app:
#ifndef LJShaderTypes_h
#define LJShaderTypes_h

#include <simd/simd.h>

typedef struct {
    vector_float4 position;
    vector_float2 textureCoord;
} LJVertex;

typedef struct {
    matrix_float3x3 matrix;
    vector_float3 offset;
} LJConvertMatrix;

typedef enum {
    LJVertexInputIndexVertices = 0,
} LJVertexInputIndex;

typedef enum {
    LJFragmentBufferIndexMatrix = 0,
} LJFragmentBufferIndex;

typedef enum {
    LJFragmentTextureIndexTextureY = 0,
    LJFragmentTextureIndexTextureUV = 1,
} LJFragmentTextureIndex;

#endif /* LJShaderTypes_h */
Preparing the Vertices and the Conversion Matrix
- (void)setupVertices {
    // Two triangles covering the full screen; texture coordinates use a top-left origin.
    static const LJVertex quadVertices[] = {
        { {  1.0, -1.0, 0.0, 1.0 }, { 1.f, 1.f } },
        { { -1.0, -1.0, 0.0, 1.0 }, { 0.f, 1.f } },
        { { -1.0,  1.0, 0.0, 1.0 }, { 0.f, 0.f } },
        { {  1.0, -1.0, 0.0, 1.0 }, { 1.f, 1.f } },
        { { -1.0,  1.0, 0.0, 1.0 }, { 0.f, 0.f } },
        { {  1.0,  1.0, 0.0, 1.0 }, { 1.f, 0.f } },
    };
    _vertices = [self.mtkView.device newBufferWithBytes:quadVertices length:sizeof(quadVertices) options:MTLResourceStorageModeShared];
    _numVertices = sizeof(quadVertices) / sizeof(LJVertex);
}
- (void)setupMatrix {
    // 1. Conversion matrices (given as columns; simd matrices are column-major)
    // BT.601 video range, the standard for SDTV.
    matrix_float3x3 kColorConversion601DefaultMatrix = (matrix_float3x3){
        (simd_float3){1.164, 1.164, 1.164},
        (simd_float3){0.0, -0.392, 2.017},
        (simd_float3){1.596, -0.813, 0.0},
    };
    // BT.601 full range
    matrix_float3x3 kColorConversion601FullRangeMatrix = (matrix_float3x3){
        (simd_float3){1.0, 1.0, 1.0},
        (simd_float3){0.0, -0.343, 1.765},
        (simd_float3){1.4, -0.711, 0.0},
    };
    // BT.709 video range, the standard for HDTV.
    matrix_float3x3 kColorConversion709DefaultMatrix = (matrix_float3x3){
        (simd_float3){1.164, 1.164, 1.164},
        (simd_float3){0.0, -0.213, 2.112},
        (simd_float3){1.793, -0.533, 0.0},
    };
    // 2. Offset
    vector_float3 kColorConversion601FullRangeOffset = (vector_float3){ -(16.0/255.0), -0.5, -0.5};
    LJConvertMatrix matrix;
    matrix.matrix = kColorConversion601FullRangeMatrix;
    matrix.offset = kColorConversion601FullRangeOffset;
    _convertMatrix = [self.mtkView.device newBufferWithBytes:&matrix length:sizeof(matrix) options:MTLResourceStorageModeShared];
}
There are three YUV-to-RGB matrices above; this article uses BT.601 full range. (Strictly speaking, the -16/255 luma offset belongs to video-range data; for truly full-range data the Y offset is 0, so this combination can slightly lift the blacks.)
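As a quick sanity check of the full-range matrix, here is a standalone host-side sketch (using a zero luma offset, per the note above): converting full-range white, Y=1 and U=V=0.5, should yield RGB (1, 1, 1).

// Standalone sanity check of the BT.601 full-range matrix (zero Y offset).
#import <Foundation/Foundation.h>
#import <simd/simd.h>

int main(void) {
    matrix_float3x3 m = (matrix_float3x3){
        (simd_float3){1.0, 1.0, 1.0},
        (simd_float3){0.0, -0.343, 1.765},
        (simd_float3){1.4, -0.711, 0.0},
    };
    vector_float3 offset = (vector_float3){0.0, -0.5, -0.5}; // zero luma offset for full range
    vector_float3 white = (vector_float3){1.0, 0.5, 0.5};    // full-range white in YUV
    vector_float3 rgb = matrix_multiply(m, white + offset);
    NSLog(@"white -> (%f, %f, %f)", rgb.x, rgb.y, rgb.z);    // expect (1, 1, 1)
    return 0;
}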
Rendering
- (void)mtkView:(MTKView *)view drawableSizeWillChange:(CGSize)size {
    _viewportSize = (vector_uint2){size.width, size.height};
}

- (void)drawInMTKView:(MTKView *)view {
    id<MTLCommandBuffer> commandBuffer = [self.commandQueue commandBuffer];
    commandBuffer.label = @"my command buffer";
    MTLRenderPassDescriptor *renderPassDesc = view.currentRenderPassDescriptor;
    CMSampleBufferRef sampleBuffer = [self.reader readBuffer];
    // Only encode a pass when both a render target and a decoded frame are available.
    if (renderPassDesc && sampleBuffer) {
        renderPassDesc.colorAttachments[0].clearColor = MTLClearColorMake(0.5, 0.5, 0.5, 1.0);
        id<MTLRenderCommandEncoder> commandEncoder = [commandBuffer renderCommandEncoderWithDescriptor:renderPassDesc];
        [commandEncoder setRenderPipelineState:self.pipeline];
        [commandEncoder setViewport:(MTLViewport){0.0, 0.0, self.viewportSize.x, self.viewportSize.y, -1.0, 1.0}];
        [commandEncoder setVertexBuffer:self.vertices offset:0 atIndex:LJVertexInputIndexVertices];
        [self setupTextureWithEncoder:commandEncoder buffer:sampleBuffer];
        [commandEncoder setFragmentBuffer:self.convertMatrix offset:0 atIndex:LJFragmentBufferIndexMatrix];
        [commandEncoder drawPrimitives:MTLPrimitiveTypeTriangle vertexStart:0 vertexCount:self.numVertices];
        [commandEncoder endEncoding];
        [commandBuffer presentDrawable:view.currentDrawable];
    }
    [commandBuffer commit];
}
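One design note: MTKView drives drawInMTKView: at its own rate (60 fps by default), so one frame of the file is consumed per view tick regardless of the video's frame rate. A possible adjustment (a sketch; run somewhere with access to the loaded asset, here called inputAsset as in the reader code) is to match the view to the track's nominal frame rate:

// Sketch: match the display rate to the video track's nominal frame rate,
// so that a 30 fps file is not consumed at 60 fps.
AVAssetTrack *videoTrack = [[inputAsset tracksWithMediaType:AVMediaTypeVideo] firstObject];
if (videoTrack) {
    self.mtkView.preferredFramesPerSecond = (NSInteger)roundf(videoTrack.nominalFrameRate);
}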
The rendering above is all routine; the key point is setupTextureWithEncoder:buffer:, shown below:
- (void)setupTextureWithEncoder:(id<MTLRenderCommandEncoder>)encoder buffer:(CMSampleBufferRef)sampleBuffer {
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    id<MTLTexture> textureY = nil;
    id<MTLTexture> textureUV = nil;
    // Plane 0: the Y channel, one byte per pixel.
    {
        size_t width = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0);
        size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0);
        MTLPixelFormat pixelFormat = MTLPixelFormatR8Unorm;
        CVMetalTextureRef tmpTexture = NULL;
        CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, self.textureCache, pixelBuffer, NULL, pixelFormat, width, height, 0, &tmpTexture);
        if (status == kCVReturnSuccess) {
            textureY = CVMetalTextureGetTexture(tmpTexture);
            CFRelease(tmpTexture);
        }
    }
    // Plane 1: the interleaved UV channels, two bytes per pixel, at half the Y plane's size.
    {
        size_t width = CVPixelBufferGetWidthOfPlane(pixelBuffer, 1);
        size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 1);
        MTLPixelFormat pixelFormat = MTLPixelFormatRG8Unorm;
        CVMetalTextureRef tmpTexture = NULL;
        CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, self.textureCache, pixelBuffer, NULL, pixelFormat, width, height, 1, &tmpTexture);
        if (status == kCVReturnSuccess) {
            textureUV = CVMetalTextureGetTexture(tmpTexture);
            CFRelease(tmpTexture);
        }
    }
    if (textureY != nil && textureUV != nil) {
        [encoder setFragmentTexture:textureY atIndex:LJFragmentTextureIndexTextureY];
        [encoder setFragmentTexture:textureUV atIndex:LJFragmentTextureIndexTextureUV];
    }
    // Balance the copyNextSampleBuffer in LJAssetReader.
    CFRelease(sampleBuffer);
}
Because we set the video output format to kCVPixelFormatType_420YpCbCr8BiPlanarFullRange earlier, the CVPixelBufferRef contains two planes, and we can obtain a texture for each one by calling CVMetalTextureCacheCreateTextureFromImage with the planeIndex parameter set to 0 or 1. Also, because of the 4:2:0 subsampling, the two planes have different dimensions (the Y plane's width and height are twice those of the UV plane), so we have to query each plane's size with CVPixelBufferGetWidthOfPlane and CVPixelBufferGetHeightOfPlane.
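To make the 2:1 relationship concrete, here is a small logging sketch on a decoded frame; for a 1920x1080 video, expect 1920x1080 for plane 0 and 960x540 for plane 1:

// Sketch: log both plane sizes of a decoded frame to see the 4:2:0 layout.
CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
for (size_t plane = 0; plane < CVPixelBufferGetPlaneCount(pixelBuffer); plane++) {
    NSLog(@"plane %zu: %zu x %zu", plane,
          CVPixelBufferGetWidthOfPlane(pixelBuffer, plane),
          CVPixelBufferGetHeightOfPlane(pixelBuffer, plane));
}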
Finally, the demo code is attached.