CoreML is the machine learning framework Apple introduced at WWDC 2017. But what can it actually do? Can it recognize a vase? Let's find out.
Originally published on my personal blog as iOS-CoreML-初探; please credit the source when reposting.
Model
In CoreML, Apple defines its own model format with the extension .mlmodel. With the CoreML framework plus a model file, you can build machine learning features at the app level.
The official site already offers four models for download.
Demo
The official site provides a Demo, which requires Xcode 9 and iOS 11.
After downloading and running it, I have to say Apple is very developer-friendly: just drag the model file into the project, and Xcode automatically generates an interface file:
import CoreML

class MarsHabitatPricerInput : MLFeatureProvider {
    var solarPanels: Double
    var greenhouses: Double
    var size: Double

    var featureNames: Set<String> {
        get {
            return ["solarPanels", "greenhouses", "size"]
        }
    }

    func featureValue(for featureName: String) -> MLFeatureValue? {
        if (featureName == "solarPanels") {
            return MLFeatureValue(double: solarPanels)
        }
        if (featureName == "greenhouses") {
            return MLFeatureValue(double: greenhouses)
        }
        if (featureName == "size") {
            return MLFeatureValue(double: size)
        }
        return nil
    }

    init(solarPanels: Double, greenhouses: Double, size: Double) {
        self.solarPanels = solarPanels
        self.greenhouses = greenhouses
        self.size = size
    }
}

class MarsHabitatPricerOutput : MLFeatureProvider {
    let price: Double

    var featureNames: Set<String> {
        get {
            return ["price"]
        }
    }

    func featureValue(for featureName: String) -> MLFeatureValue? {
        if (featureName == "price") {
            return MLFeatureValue(double: price)
        }
        return nil
    }

    init(price: Double) {
        self.price = price
    }
}

@objc class MarsHabitatPricer : NSObject {
    var model: MLModel

    init(contentsOf url: URL) throws {
        self.model = try MLModel(contentsOf: url)
    }

    convenience override init() {
        let bundle = Bundle(for: MarsHabitatPricer.self)
        let assetPath = bundle.url(forResource: "MarsHabitatPricer", withExtension: "mlmodelc")
        try! self.init(contentsOf: assetPath!)
    }

    func prediction(input: MarsHabitatPricerInput) throws -> MarsHabitatPricerOutput {
        let outFeatures = try model.prediction(from: input)
        let result = MarsHabitatPricerOutput(price: outFeatures.featureValue(for: "price")!.doubleValue)
        return result
    }

    func prediction(solarPanels: Double, greenhouses: Double, size: Double) throws -> MarsHabitatPricerOutput {
        let input_ = MarsHabitatPricerInput(solarPanels: solarPanels, greenhouses: greenhouses, size: size)
        return try self.prediction(input: input_)
    }
}
As you can see, the generated file mainly defines the input, the output, and the prediction interface. Calling it is also very simple: just pass in the parameters.
However, these interface files do not show up in the file tree on the left side of Xcode.
After a bit of digging, it turns out they are generated under the DerivedData directory, presumably to keep things simpler for developers.
Run the demo, and you can see that its main function is predicting a price.
Which feels slightly underwhelming...
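For example, a call site might look like the following minimal sketch; the feature values passed in here are made up purely for illustration:

```swift
// Minimal usage sketch for the generated class.
// The input values (panels, greenhouses, size) are invented for illustration.
let pricer = MarsHabitatPricer()
do {
    let output = try pricer.prediction(solarPanels: 1.0, greenhouses: 2.0, size: 750.0)
    print("Predicted price: \(output.price)")
} catch {
    print("Prediction failed: \(error)")
}
```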
Resnet50
We haven't used the four official models yet, and of course we want to see what they can do. A quick look suggests they are mainly for object recognition. OK, let's write some code.
First download the Resnet50 model, then create a new Swift project and drag the model in:
From the description, it is a neural-network classifier: the input is a 224 × 224 image, and the output is a classification result.
The auto-generated interface file:
import CoreML

class Resnet50Input : MLFeatureProvider {
    var image: CVPixelBuffer

    var featureNames: Set<String> {
        get {
            return ["image"]
        }
    }

    func featureValue(for featureName: String) -> MLFeatureValue? {
        if (featureName == "image") {
            return MLFeatureValue(pixelBuffer: image)
        }
        return nil
    }

    init(image: CVPixelBuffer) {
        self.image = image
    }
}

class Resnet50Output : MLFeatureProvider {
    let classLabelProbs: [String : Double]
    let classLabel: String

    var featureNames: Set<String> {
        get {
            return ["classLabelProbs", "classLabel"]
        }
    }

    func featureValue(for featureName: String) -> MLFeatureValue? {
        if (featureName == "classLabelProbs") {
            return try! MLFeatureValue(dictionary: classLabelProbs as [NSObject : NSNumber])
        }
        if (featureName == "classLabel") {
            return MLFeatureValue(string: classLabel)
        }
        return nil
    }

    init(classLabelProbs: [String : Double], classLabel: String) {
        self.classLabelProbs = classLabelProbs
        self.classLabel = classLabel
    }
}

@objc class Resnet50 : NSObject {
    var model: MLModel

    init(contentsOf url: URL) throws {
        self.model = try MLModel(contentsOf: url)
    }

    convenience override init() {
        let bundle = Bundle(for: Resnet50.self)
        let assetPath = bundle.url(forResource: "Resnet50", withExtension: "mlmodelc")
        try! self.init(contentsOf: assetPath!)
    }

    func prediction(input: Resnet50Input) throws -> Resnet50Output {
        let outFeatures = try model.prediction(from: input)
        let result = Resnet50Output(classLabelProbs: outFeatures.featureValue(for: "classLabelProbs")!.dictionaryValue as! [String : Double],
                                    classLabel: outFeatures.featureValue(for: "classLabel")!.stringValue)
        return result
    }

    func prediction(image: CVPixelBuffer) throws -> Resnet50Output {
        let input_ = Resnet50Input(image: image)
        return try self.prediction(input: input_)
    }
}
OK, it wants an image, and as a CVPixelBuffer at that.
Picking from the photo library every time is too tedious, so let's use the camera directly. Copy the main classes from Apple's AVCam sample into the project.
Then hide some unnecessary buttons in CameraViewController:
self.recordButton.isHidden = true
self.captureModeControl.isHidden = true
self.livePhotoModeButton.isHidden = true
self.depthDataDeliveryButton.isHidden = true
Since the AVCapturePhotoCaptureDelegate callback for a finished capture is:
func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: Error?)
Looking at the definition of AVCapturePhoto, it happens to have a property of type CVPixelBuffer:
Let's just pass it in and see:
// Predict
if let pixelBuffer = photo.previewPixelBuffer {
    guard let resnet50CategoryOutput = try? model.prediction(image: pixelBuffer) else {
        fatalError("Unexpected runtime error.")
    }
    // resnet50CategoryOutput.classLabel holds the predicted category
}
Everything looks perfect: it compiles and runs. Then I tap the shutter button and... it crashes with this error:
[core] Error Domain=com.apple.CoreML Code=1 "Input image feature image does not match model description" UserInfo={NSLocalizedDescription=Input image feature image does not match model description, NSUnderlyingError=0x1c0643420 {Error Domain=com.apple.CoreML Code=1 "Image is not valid width 224, instead is 852" UserInfo={NSLocalizedDescription=Image is not valid width 224, instead is 852}}}
Oh, I forgot to resize the image. Find the photoSettings and add the width and height:
if !photoSettings.availablePreviewPhotoPixelFormatTypes.isEmpty {
    photoSettings.previewPhotoFormat = [
        kCVPixelBufferPixelFormatTypeKey as String: photoSettings.availablePreviewPhotoPixelFormatTypes.first!,
        kCVPixelBufferWidthKey as String: NSNumber(value: 224),
        kCVPixelBufferHeightKey as String: NSNumber(value: 224)
    ]
}
重新 Run兵睛,WTF,Man窥浪,居然又報(bào)同樣的錯(cuò)祖很,好吧,Google 一下漾脂,貌似寬高的屬性假颇,在 Swift 里面不生效,額符相。拆融。
沒(méi)辦法,那我們只能將 CVPixelBuffer 先轉(zhuǎn)換成 UIImage啊终,然后改下大小镜豹,再轉(zhuǎn)回 CVPixelBuffer,試試:
photoData = photo.fileDataRepresentation()

// Convert Data to UIImage
guard let photoData = photoData else {
    return
}
let image = UIImage(data: photoData)

// Resize to the model's expected 224 x 224 input
let newWidth: CGFloat = 224.0
let newHeight: CGFloat = 224.0
UIGraphicsBeginImageContext(CGSize(width: newWidth, height: newHeight))
image?.draw(in: CGRect(x: 0, y: 0, width: newWidth, height: newHeight))
let newImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
guard let finalImage = newImage else {
    return
}

// Predict
guard let resnet50CategoryOutput = try? model.prediction(image: pixelBufferFromImage(image: finalImage)) else {
    fatalError("Unexpected runtime error.")
}
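The pixelBufferFromImage(image:) helper used above is not shown in the post itself. A common way to write such a UIImage → CVPixelBuffer conversion looks roughly like this; it is a sketch, assuming the UIImage is backed by a CGImage, with the function name chosen to match the snippet above:

```swift
import UIKit
import CoreVideo

// Sketch of a UIImage -> CVPixelBuffer conversion (assumes image.cgImage is non-nil):
// create an ARGB pixel buffer of the image's size and draw the image into it.
func pixelBufferFromImage(image: UIImage) -> CVPixelBuffer {
    let width = Int(image.size.width)
    let height = Int(image.size.height)
    let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue!,
                 kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue!] as CFDictionary
    var pixelBuffer: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                        kCVPixelFormatType_32ARGB, attrs, &pixelBuffer)
    let buffer = pixelBuffer!

    CVPixelBufferLockBaseAddress(buffer, [])
    let context = CGContext(data: CVPixelBufferGetBaseAddress(buffer),
                            width: width, height: height,
                            bitsPerComponent: 8,
                            bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
                            space: CGColorSpaceCreateDeviceRGB(),
                            bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue)
    context?.draw(image.cgImage!, in: CGRect(x: 0, y: 0, width: width, height: height))
    CVPixelBufferUnlockBaseAddress(buffer, [])
    return buffer
}
```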
重新 Run蓝牲,OK趟脂,一切很完美。
最后例衍,為了用戶體驗(yàn)昔期,加上攝像頭流的暫停和重啟已卸,免得在識(shí)別的時(shí)候,攝像頭還一直在動(dòng)硼一,另外累澡,識(shí)別結(jié)果通過(guò)提醒框彈出來(lái),具體參考文末的源碼般贼。
開(kāi)始玩啦愧哟,找支油筆試一下:
識(shí)別成,橡皮擦哼蛆,好吧蕊梧,其實(shí)是有點(diǎn)像。
再拿小綠植試試:
花瓶腮介,Are you kidding me ??
其實(shí)肥矢,效果還是蠻不錯(cuò)的。
剛好下午要去上海 CES Asia叠洗,一路拍過(guò)去玩甘改,想想都有點(diǎn)小激動(dòng)。
最后惕味,源碼奉上楼誓,想玩的同學(xué)直接下載編譯就行了,別忘了 Star~