Compute with Metal (Part 1): Simple Image Processing

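Besides rendering graphics, we can also take advantage of the GPU's hardware characteristics and hand it computations that are slow on the CPU (for certain kinds of workloads, the GPU is faster by a wide margin). The idea of GPGPU programming (general-purpose GPU programming) has been around for a long time, but when talking to the GPU through OpenGL we could only practice it in roundabout ways, such as embedding the computation in the graphics rendering pipeline. With Metal, we no longer need such detours. Metal provides a dedicated compute pipeline that lets us schedule GPU computations with more direct, readable code. Below, a simple example (adjusting the saturation of an image) shows how to do computation with Metal.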

Some basic concepts in Metal

Before writing any code, let's briefly review the basic classes and concepts in Metal, including:

MTLDevice
MTLCommandQueue
MTLCommandBuffer
MTLCommandEncoder
MTLCommand
MTLComputePipelineState & MTLLibrary & MTLFunction

At first glance that is a lot of concepts, but in practice the way these classes connect is quite intuitive. The diagram below summarizes it.

[Figure: Metal Compute Graph.png]

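In the initialization phase, we need to obtain an MTLDevice instance (think of it as the interface for operating the GPU), and then use the device to create an MTLCommandQueue (every command sent to the GPU must first be placed in a command queue). We also need to create an MTLLibrary object (as I understand it, this contains the compiled shader functions), obtain from the library an MTLFunction object that describes the specific computation, and then use the function object to create an MTLComputePipelineState (something similar to a render pipeline; let's call it the compute pipeline).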

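In the run phase, we first use the CommandQueue to create a CommandBuffer, then use the CommandBuffer to create a CommandEncoder, which writes commands into the CommandBuffer. Once the commands have been written, we call the CommandBuffer's commit method to submit the compute task to the GPU.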

Talk is cheap

Here is the code.

    guard let device = MTLCreateSystemDefaultDevice() else  {
        return nil
    }
    guard let commandQueue = device.makeCommandQueue() else {
        return nil
    }
    
    guard let library = device.makeDefaultLibrary() else {
        return nil
    }
    guard let kernelFunction = library.makeFunction(name: "adjust_saturation") else {
        return nil
    }
    
    // Build the compute pipeline from the kernel function; bail out if that fails.
    let computePipelineState: MTLComputePipelineState
    do {
        computePipelineState = try device.makeComputePipelineState(function: kernelFunction)
    } catch {
        return nil
    }

This code creates, in order, the MTLDevice, MTLCommandQueue, MTLLibrary, MTLFunction, and MTLComputePipelineState objects.

The adjust_saturation name used when creating the MTLFunction instance refers to a shader function defined in a .metal file. Its contents are as follows:

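#include <metal_stdlib>
using namespace metal;

kernel void adjust_saturation(texture2d<float, access::read> inTexture [[texture(0)]],
                              texture2d<float, access::write> outTexture [[texture(1)]],
                              constant float* saturation [[buffer(0)]],
                              uint2 gid [[thread_position_in_grid]]) {
    // Skip threads that fall outside the texture when the grid is larger than the image.
    if (gid.x >= outTexture.get_width() || gid.y >= outTexture.get_height()) {
        return;
    }
    float4 inColor = inTexture.read(gid);
    // Luma-weighted grayscale value of the pixel.
    float value = dot(inColor.rgb, float3(0.299, 0.587, 0.114));
    float4 grayColor(value, value, value, 1.0);
    // saturation = 0 gives grayscale, 1 keeps the original color.
    float4 outColor = mix(grayColor, inColor, *saturation);
    outTexture.write(outColor, gid);
}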

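The function takes two textures (one as input, the other as output), a float parameter used as the saturation factor, and a gid parameter marked with [[thread_position_in_grid]]. For now, think of gid as identifying this invocation's position within the whole compute task.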

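I won't go into much detail about the kernel's implementation. Roughly, it computes a grayscale value from the RGB value of one pixel in the input texture, blends the color value and the grayscale value in a proportion given by the saturation parameter, and writes the saturation-adjusted result to the output texture.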
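To make the per-pixel math concrete, here is a minimal CPU-side sketch of the same calculation in plain Swift. It is only a reference for reasoning about the kernel, not something the original project contains: the `RGBA` type and the `mix` and `adjustSaturation` functions are illustrative names introduced here, with `mix` written to mirror the linear interpolation that the shading language's `mix` performs.

```swift
// CPU reference for the per-pixel math in the adjust_saturation kernel.
// `RGBA`, `mix` and `adjustSaturation` are illustrative names, not part of the original code.
struct RGBA {
    var r: Float
    var g: Float
    var b: Float
    var a: Float
}

/// Linear interpolation: returns x when t = 0 and y when t = 1.
func mix(_ x: Float, _ y: Float, _ t: Float) -> Float {
    return x + (y - x) * t
}

func adjustSaturation(_ c: RGBA, saturation: Float) -> RGBA {
    // Same weights as the shader's dot(inColor.rgb, float3(0.299, 0.587, 0.114)).
    let value = c.r * 0.299 + c.g * 0.587 + c.b * 0.114
    // The shader's gray color has alpha 1.0, so alpha is blended toward 1.0 as well.
    return RGBA(r: mix(value, c.r, saturation),
                g: mix(value, c.g, saturation),
                b: mix(value, c.b, saturation),
                a: mix(1.0, c.a, saturation))
}

let red = RGBA(r: 1, g: 0, b: 0, a: 1)
let unchanged = adjustSaturation(red, saturation: 1.0)  // keeps the original color
let grayscale = adjustSaturation(red, saturation: 0.0)  // r, g and b all become the gray value
```

A saturation of 0 yields a fully gray pixel, 1 leaves the pixel as it was, and values in between (such as the 0.1 used in this article) give a washed-out version of the original.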

Next is the code that performs the computation.

    // prepare input texture
    let cmImage = cmImageFromUIImage(uiImage: image) // custom helper that loads pixel data from a UIImage
    let textureDescriptor = MTLTextureDescriptor()
    textureDescriptor.width = cmImage.width
    textureDescriptor.height = cmImage.height
    // The CGContext in the helper produces bytes in RGBA order, so use an RGBA pixel format.
    textureDescriptor.pixelFormat = MTLPixelFormat.rgba8Unorm
    textureDescriptor.usage = .shaderRead
    let inTexture = device.makeTexture(descriptor: textureDescriptor)!
    let region = MTLRegion(origin: MTLOrigin(x: 0, y: 0, z: 0), size: MTLSize(width: cmImage.width, height: cmImage.height, depth: 1))
    inTexture.replace(region: region, mipmapLevel: 0, withBytes: NSData(data: cmImage.data!).bytes, bytesPerRow: cmImage.width * 4)
    
    // prepare output texture
    let outTextureDescriptor = MTLTextureDescriptor()
    outTextureDescriptor.width = cmImage.width
    outTextureDescriptor.height = cmImage.height
    outTextureDescriptor.pixelFormat = MTLPixelFormat.rgba8Unorm
    outTextureDescriptor.usage = MTLTextureUsage.shaderWrite
    let outTexture = device.makeTexture(descriptor: outTextureDescriptor)!
    
    guard let commandBuffer = commandQueue.makeCommandBuffer() else {
        return nil
    }
    
    guard let commandEncoder = commandBuffer.makeComputeCommandEncoder() else {
        return nil
    }
    
    commandEncoder.setComputePipelineState(computePipelineState)
    commandEncoder.setTexture(inTexture, index: 0)
    commandEncoder.setTexture(outTexture, index: 1)
    var saturation: Float = 0.1
    commandEncoder.setBytes(&saturation, length: MemoryLayout<Float>.size, index: 0)
    
    let width = cmImage.width
    let height = cmImage.height
    
    // Round up so that images whose size is not a multiple of groupSize are fully covered.
    // The kernel should skip threads whose gid falls outside the texture.
    let groupSize = 16
    let groupCountWidth = (width + groupSize - 1) / groupSize
    let groupCountHeight = (height + groupSize - 1) / groupSize
    
    commandEncoder.dispatchThreadgroups(MTLSize(width: groupCountWidth, height: groupCountHeight, depth: 1), threadsPerThreadgroup: MTLSize(width: groupSize, height: groupSize, depth: 1))

    commandEncoder.endEncoding()

    commandBuffer.commit()

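First, prepare two MTLTexture objects to serve as the input and output of the computation.
Then create the CommandBuffer and CommandEncoder objects, and use the CommandEncoder to configure the compute pipeline and the kernel's inputs (inTexture, outTexture, saturation, and so on).
Finally, dispatch the compute task to the GPU with the dispatchThreadgroups method. This introduces three more Metal compute concepts: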

thread
thread group
grid size

First, about grid size:

A compute pass must specify the number of times to execute a kernel function. This number corresponds to the grid size, which is defined in terms of threads and threadgroups.

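That is, the grid size defines the total number of times the shader function needs to run in one GPU compute pass. The grid size is expressed with the MTLSize structure, which has three components; in this example the grid size is (imageWidth, imageHeight, 1). Also, according to the documentation, we do not set the grid size directly; instead we set it indirectly through the thread group size and the thread group count.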

About thread group size / thread group count:

A threadgroup is a 3D group of threads that are executed concurrently by a kernel function.

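The thread group size defines how many computations are executed in parallel at a time. Its maximum depends on the GPU hardware. In this example we use (16, 16, 1), so 256 computations are executed in parallel at a time. From the image resolution we can then compute the thread group count.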
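The relationship between image size, thread group size, and thread group count is plain integer arithmetic, sketched below. The helper name `threadgroupCount(covering:groupSize:)` and the 1000 x 750 image size are assumptions made for illustration. The count is rounded up so that every pixel is covered, which means the dispatched grid can be slightly larger than the image.

```swift
/// Number of threadgroups needed to cover `length` threads along one axis (ceiling division).
/// Illustrative helper, not part of the original code.
func threadgroupCount(covering length: Int, groupSize: Int) -> Int {
    precondition(length >= 0 && groupSize > 0)
    return (length + groupSize - 1) / groupSize
}

let groupSize = 16
let imageWidth = 1000   // hypothetical image size
let imageHeight = 750

let countX = threadgroupCount(covering: imageWidth, groupSize: groupSize)   // 63
let countY = threadgroupCount(covering: imageHeight, groupSize: groupSize)  // 47

// The dispatched grid is always a whole number of threadgroups.
let gridWidth = countX * groupSize            // 1008
let gridHeight = countY * groupSize           // 752
let surplusColumns = gridWidth - imageWidth   // 8 threads per row fall outside the image
```

Because of the surplus threads, a kernel dispatched this way should check gid against the texture size before reading or writing. The hardware limit on threads per group can be read from the pipeline state's maxTotalThreadsPerThreadgroup property rather than hard-coded.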

Finally, once the GPU has finished, we can read the result from outTexture and convert it into a UIImage object.

    commandBuffer.waitUntilCompleted()
    
    // create image from out texture
    let imageBytes = UnsafeMutablePointer<UInt8>.allocate(capacity: cmImage.width * cmImage.height * 4)
    outTexture.getBytes(imageBytes, bytesPerRow: cmImage.width * 4, from: region, mipmapLevel: 0)
    
    let context = CGContext(data: imageBytes, width: cmImage.width, height: cmImage.height, bitsPerComponent: 8, bytesPerRow: cmImage.width * 4, space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)!
    let cgImage = context.makeImage()!
    return UIImage(cgImage: cgImage, scale: 1.0, orientation: .downMirrored)

UIImage --> MTLTexture

The sample code uses a custom function to get pixel data from a UIImage object. The relevant code is included below for reference only.

class CMImage: NSObject {
    var width: Int = 0
    var height: Int = 0
    var data: Data?
}

func cmImageFromUIImage(uiImage: UIImage) -> CMImage {
    let image = CMImage()
    image.width = Int(uiImage.size.width)
    image.height = Int(uiImage.size.height)
    
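    let bytes = UnsafeMutablePointer<UInt8>.allocate(capacity: image.width * image.height * 4)
    bytes.initialize(repeating: 0, count: image.width * image.height * 4) // start from transparent black
    defer { bytes.deallocate() } // Data(bytes:count:) below copies the pixels, so the buffer can be freed on return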
    let context = CGContext(data: bytes, width: image.width, height: image.height, bitsPerComponent: 8, bytesPerRow: image.width * 4, space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
    context?.translateBy(x: 0, y: uiImage.size.height)
    context?.scaleBy(x: 1, y: -1)
    context?.draw(uiImage.cgImage!, in: CGRect(x: 0, y: 0, width: uiImage.size.width, height: uiImage.size.height))
    image.data = Data(bytes: bytes, count: image.width * image.height * 4)
    
    return image
}

Closing thoughts

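For convenience, this example puts the Init Phase and Compute Pass code into a single method. According to Apple's best-practices documentation, however, objects such as the Device, Library, CommandQueue, and ComputePipeline should be created only once during app initialization, not re-created for every computation.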
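One way to follow that advice is to keep the long-lived objects in a small class that is created once and reused for every compute pass. The following is only a sketch under that assumption: `SaturationFilter` and its `kernelName` parameter are hypothetical names, and the `#if canImport(Metal)` guard is there only so the snippet is ignored on platforms without Metal.

```swift
#if canImport(Metal)
import Metal

/// Holds the long-lived Metal objects so they are created once and reused.
/// `SaturationFilter` is an illustrative name, not part of the original code.
final class SaturationFilter {
    let device: MTLDevice
    let commandQueue: MTLCommandQueue
    let pipelineState: MTLComputePipelineState

    /// Returns nil if Metal is unavailable or the kernel cannot be loaded.
    init?(kernelName: String = "adjust_saturation") {
        guard let device = MTLCreateSystemDefaultDevice(),
              let commandQueue = device.makeCommandQueue(),
              let library = device.makeDefaultLibrary(),
              let function = library.makeFunction(name: kernelName),
              let pipelineState = try? device.makeComputePipelineState(function: function)
        else {
            return nil
        }
        self.device = device
        self.commandQueue = commandQueue
        self.pipelineState = pipelineState
    }
}
#endif
```

An app would create one instance at startup and, for each image, only make a new command buffer and encoder from the stored commandQueue and pipelineState.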

All of this amounts to no more than a Hello World for Metal compute. There is much more worth studying in depth; to anyone interested, let's keep at it together!

© Copyright belongs to the author. For reprints or content partnerships, please contact the author.
