iOS - Apple's Vision Framework in Practice

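The Vision framework, introduced in iOS 11, has kept gaining new capabilities over the past few years. This year's additions are numerous enough that Apple built a complete app, with accompanying documentation, to showcase the new vision capabilities together. This article is a ~~walkthrough and~~ translation of that demo app and its documentation; the link to the demo app is at the end.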

The Bean Bag Toss Game

This game appears to be very common in the United States. Its main elements are:

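One player and several bean bags
A wooden board with a hole in it (the board is 2 × 4 feet, the hole is 6 inches in diameter)
The rule is that the player stands 25 feet away and throws the bean bags
A bag that lands on the board or goes into the hole scores points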

Action & Vision

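Using bean bag toss as the example, Apple wrote an app called Action & Vision, which uses the vision capabilities of iOS to help users analyze their gameplay and keep score. The app in action looks like this.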

Action & Vision

Workflow and Code

Vision Basics
Every Vision capability is exposed as a class named VNXXXRequest, for example VNDetectHumanBodyPoseRequest. To use a capability, we create a request, then create a VNImageRequestHandler with the input image and hand it an array of the requests we want to run. Finally, we call handler.perform([request]) to run the actual prediction and read the result, an observation, from the request.

// Create the request
private let detectPlayerRequest = VNDetectHumanBodyPoseRequest()
// Feed in the image and perform the request
let visionHandler = VNImageRequestHandler(cmSampleBuffer: buffer, orientation: orientation, options: [:])
try visionHandler.perform([detectPlayerRequest])
// Read the results from the same request that was performed
let results = detectPlayerRequest.results as? [VNHumanBodyPoseObservation]

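Locating and analyzing the setup before the game
1.1 First, the position of the board has to be detected. Apple trained its own model with Create ML to recognize the board.
1.2 Next, VNDetectContoursRequest is used to find the edges of the board, establish the mapping between pixels and the physical width and height of the board, and from that determine where the hole sits on the board and in the image. Because the board has already been detected in 1.1, the actual procedure is to take the board's bounding box from 1.1 and set it as the regionOfInterest of the VNDetectContoursRequest, which reduces noise and improves efficiency.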

        let contoursRequest = VNDetectContoursRequest()
        contoursRequest.regionOfInterest = boardBoundingBox.visionRect
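
As a rough illustration of the pixel-to-physical mapping described in 1.2, the sketch below derives a scale factor from the detected board width and uses it to size the hole in pixels. The helper name `holeDiameterInPixels` and the 240-pixel measurement are hypothetical, and the calculation assumes the camera sees the board roughly face-on so that a single scale applies; it is not taken from the demo.

```swift
// Physical sizes from the game description: a 2 ft (24 in) wide board, 6 in hole.
let boardWidthInches = 24.0
let holeDiameterInches = 6.0

/// Hypothetical helper (not part of the demo): converts the hole diameter
/// to pixels, given the measured width of the board's short edge in pixels.
func holeDiameterInPixels(boardWidthPixels: Double) -> Double {
    // Scale factor between the image and the physical board.
    let pixelsPerInch = boardWidthPixels / boardWidthInches
    return holeDiameterInches * pixelsPerInch
}

// Example: the board's short edge was measured as 240 pixels wide.
let diameter = holeDiameterInPixels(boardWidthPixels: 240)
print(diameter) // 60.0
```

The same scale factor can be reused to convert any other physical distance on the board into pixels.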

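Determining scene stability
The common approach used to rely on the gyroscope: at regular intervals, check whether the difference between gyroscope readings exceeds a threshold of your own choosing, and use that to decide whether the user or the scene is stable. Now you can use a new API from Apple, VNTranslationalImageRegistrationRequest, and judge the stability of the scene from the x and y translation of the affine transform between two frames.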

Use VNTranslationalImageRegistrationRequest to obtain the affine transform

    private func checkSceneStability(_ controller: CameraViewController, _ buffer: CMSampleBuffer, _ orientation: CGImagePropertyOrientation) throws {
        guard let previousBuffer = self.previousSampleBuffer else {
            self.previousSampleBuffer = buffer
            return
        }
        let registrationRequest = VNTranslationalImageRegistrationRequest(targetedCMSampleBuffer: buffer)
        try sceneStabilityRequestHandler.perform([registrationRequest], on: previousBuffer, orientation: orientation)
        self.previousSampleBuffer = buffer
        if let alignmentObservation = registrationRequest.results?.first as? VNImageTranslationAlignmentObservation {
            let transform = alignmentObservation.alignmentTransform
            sceneStabilityHistoryPoints.append(CGPoint(x: transform.tx, y: transform.ty))
        }
    }

Check the x and y of the transform to decide whether the scene is stable

var sceneStability: SceneStabilityResult {
        // Determine if we have enough evidence of stability.
        guard sceneStabilityHistoryPoints.count > sceneStabilityRequiredHistoryLength else {
            return .unknown
        }
        
        // Calculate the moving average by adding up values of stored points
        // returned by VNTranslationalImageRegistrationRequest for both axes
        var movingAverage = CGPoint.zero
        movingAverage.x = sceneStabilityHistoryPoints.map { $0.x }.reduce(.zero, +)
        movingAverage.y = sceneStabilityHistoryPoints.map { $0.y }.reduce(.zero, +)
        // Get the moving distance by adding absolute moving average values of individual axis
        let distance = abs(movingAverage.x) + abs(movingAverage.y)
        // If the distance is not significant enough to affect the game analysis (less than 10 points),
        // we declare the scene being stable
        return (distance < 10 ? .stable : .unstable)
    }
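
The property above depends on a history of offsets that is kept bounded elsewhere in the demo. The following self-contained sketch shows one way such a sliding window could work. `StabilityTracker`, its parameters, and the sample values are hypothetical and simplified, so treat it as an illustration of the idea rather than the demo's implementation.

```swift
enum Stability { case unknown, stable, unstable }

/// Hypothetical sliding-window tracker for per-frame translation offsets.
struct StabilityTracker {
    let requiredHistoryLength: Int
    let threshold: Double
    private var offsets: [(x: Double, y: Double)] = []

    init(requiredHistoryLength: Int, threshold: Double) {
        self.requiredHistoryLength = requiredHistoryLength
        self.threshold = threshold
    }

    /// Stores the newest offset and drops the oldest once the window is full.
    mutating func add(tx: Double, ty: Double) {
        offsets.append((x: tx, y: ty))
        if offsets.count > requiredHistoryLength {
            offsets.removeFirst()
        }
    }

    /// Sums the offsets per axis and compares the combined drift to the threshold.
    var stability: Stability {
        guard offsets.count >= requiredHistoryLength else { return .unknown }
        let sumX = offsets.reduce(0.0) { $0 + $1.x }
        let sumY = offsets.reduce(0.0) { $0 + $1.y }
        return abs(sumX) + abs(sumY) < threshold ? .stable : .unstable
    }
}

var tracker = StabilityTracker(requiredHistoryLength: 3, threshold: 10)
tracker.add(tx: 1, ty: -1)
tracker.add(tx: 0.5, ty: 0)
let early = tracker.stability    // .unknown: only two samples so far
tracker.add(tx: -1, ty: 0.5)
let settled = tracker.stability  // .stable: |0.5| + |-0.5| = 1
tracker.add(tx: 20, ty: 0)
let shaken = tracker.stability   // .unstable: |19.5| + |0.5| = 20
```

Summing signed offsets lets small jitters in opposite directions cancel out, while a sustained drift in one direction quickly crosses the threshold.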

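Detecting the player entering the frame (human body detection)
Use VNDetectHumanBodyPoseRequest to obtain the body and pose data.
3.1 Use observation.confidence to decide whether a person is present, then merge the points in observation.recognizedPoints and return the person's CGRect.
3.2 Convert the format of recognizedPoints, then draw the person's pose from a subset of the points.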
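
Merging the points into a rectangle, as in 3.1, amounts to taking the minimum and maximum of the x and y coordinates. A minimal sketch follows, assuming a simplified `Joint` type as a stand-in for Vision's recognized points; the type, the `boundingBox(of:minConfidence:)` helper, the confidence cutoff, and the sample coordinates are all hypothetical.

```swift
import Foundation

/// Hypothetical stand-in for a recognized body joint (normalized coordinates).
struct Joint {
    let name: String
    let location: CGPoint
    let confidence: Double
}

/// Returns the smallest rectangle containing every sufficiently confident joint,
/// or nil when no joint passes the confidence cutoff.
func boundingBox(of joints: [Joint], minConfidence: Double = 0.1) -> CGRect? {
    let points = joints.filter { $0.confidence > minConfidence }.map { $0.location }
    guard let first = points.first else { return nil }
    var minX = first.x, maxX = first.x
    var minY = first.y, maxY = first.y
    for point in points.dropFirst() {
        minX = min(minX, point.x)
        maxX = max(maxX, point.x)
        minY = min(minY, point.y)
        maxY = max(maxY, point.y)
    }
    return CGRect(x: minX, y: minY, width: maxX - minX, height: maxY - minY)
}

let joints = [
    Joint(name: "left_hand_joint", location: CGPoint(x: 0.25, y: 0.5), confidence: 0.9),
    Joint(name: "right_hand_joint", location: CGPoint(x: 0.5, y: 0.25), confidence: 0.8),
    Joint(name: "head_joint", location: CGPoint(x: 0.375, y: 0.875), confidence: 0.95),
    // Low-confidence outlier that should not stretch the box.
    Joint(name: "left_foot_joint", location: CGPoint(x: 0.9, y: 0.1), confidence: 0.05)
]
let box = boundingBox(of: joints)
// box covers x 0.25...0.5 and y 0.25...0.875
```

Filtering by confidence first keeps a single badly placed point from inflating the player's rectangle.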

Body pose

The actual data in recognizedPoints looks like this:

left_hand_joint
[0.201963; 0.587417]

You then pick the points you want to draw, connect them with a bezier path, draw them, and keep updating their positions.

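3.3 Pose detection. The basic idea is to train a classification model with Create ML. The steps are: take a frame, get the pose of the person in it (that is, obtain an observation through VNDetectHumanBodyPoseRequest), build an MLMultiArray from the pose points in the observation as the input, and run the model with Core ML to classify the pose.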
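
To make the "points to model input" step in 3.3 more concrete, the sketch below flattens a pose into a fixed-order list of numbers, which is the kind of layout that would then be copied into an MLMultiArray. The `PosePoint` type, the `featureVector(from:)` helper, the joint list, and the zero-fill for missing joints are assumptions made for illustration; only `left_hand_joint` and its coordinates come from the sample data above. In a real app, Vision's own `keypointsMultiArray()` on the observation may be the more direct route.

```swift
/// Hypothetical simplified pose point (normalized coordinates plus confidence).
struct PosePoint {
    let x: Double
    let y: Double
    let confidence: Double
}

// Assumed fixed joint order; a classifier needs the same order for every frame.
let jointOrder = ["left_hand_joint", "right_hand_joint", "neck_joint"]

/// Flattens a pose into [x, y, confidence] triples following `jointOrder`.
/// Joints missing from the pose are filled with zeros.
func featureVector(from pose: [String: PosePoint]) -> [Double] {
    var features: [Double] = []
    for name in jointOrder {
        let point = pose[name] ?? PosePoint(x: 0, y: 0, confidence: 0)
        features += [point.x, point.y, point.confidence]
    }
    return features
}

let pose = ["left_hand_joint": PosePoint(x: 0.201963, y: 0.587417, confidence: 0.9)]
let features = featureVector(from: pose)
// features has 9 values: 3 joints x (x, y, confidence)
```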

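Detecting the trajectory of the thrown bean bag
Use VNDetectTrajectoriesRequest to detect parabolic trajectories. Because a parabola is made up of many points, a single VNDetectTrajectoriesRequest instance has to be performed repeatedly over a sequence of video frames until it has gathered enough points, at which point its completion handler is called. The demo connects the trajectory points into a curve and draws it.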

Trajectory

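// Create the request once; it keeps state across frames
private lazy var detectTrajectoryRequest: VNDetectTrajectoriesRequest! =
    VNDetectTrajectoriesRequest(frameAnalysisSpacing: .zero, trajectoryLength: GameConstants.trajectoryLength)
// Perform the same request on every incoming video frame
try visionHandler.perform([self.detectTrajectoryRequest])
let results = self.detectTrajectoryRequest.results as? [VNTrajectoryObservation]
// Connect the detected points of a trajectory into a path
// (the points are normalized; convert them to view coordinates before drawing)
let trajectory = UIBezierPath()
if let points = results?.first?.detectedPoints, let start = points.first {
    trajectory.move(to: start.location)
    for point in points.dropFirst() {
        trajectory.addLine(to: point.location)
    }
}
// Then draw the path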
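
Besides the detected points, a trajectory observation also exposes the coefficients of the fitted parabola y = ax² + bx + c, which can be used to draw a smoother curve than the raw points give. The sketch below samples such a parabola at evenly spaced x values. The `sampleParabola` helper and the coefficient values are hypothetical, chosen only so the arithmetic is easy to follow.

```swift
import Foundation

/// Hypothetical helper: samples y = a*x^2 + b*x + c at `count` evenly spaced
/// x values between `fromX` and `toX` (inclusive).
func sampleParabola(a: Double, b: Double, c: Double,
                    fromX: Double, toX: Double, count: Int) -> [CGPoint] {
    precondition(count >= 2, "need at least two samples")
    return (0..<count).map { index -> CGPoint in
        let x = fromX + (toX - fromX) * Double(index) / Double(count - 1)
        return CGPoint(x: x, y: a * x * x + b * x + c)
    }
}

// Example arc: y = -x^2 + x, which rises from 0 to 0.25 and falls back to 0.
let curve = sampleParabola(a: -1, b: 1, c: 0, fromX: 0, toX: 1, count: 5)
// x: 0, 0.25, 0.5, 0.75, 1   y: 0, 0.1875, 0.25, 0.1875, 0
```

The sampled points can be fed into the same kind of path-building loop as the detected points.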

Building a Feature-Rich App for Sports Analysis
Bean bag toss - Cornhole