Machine Learning from Scratch, Part 4: Teaching Your AI to Recognize a Specific Object (Part 2)

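This article belongs to Shen Qingyang (沈慶陽); please contact the author before reposting.
Before going further, let's look at the GitHub page of the Object Detection API in Google's TensorFlow Models repository.
It mainly covers a quick start, environment setup, running the Object-Detection API, and some extras. Reading these sections helps a great deal in understanding and using the TensorFlow Object-Detection API.
Research with machine learning algorithms is, in essence, a search for a target function. We build a machine learning model (which defines a set of candidate functions), let the training data drive the search, and look for the function that fits the training data and also performs well on the test data. Choosing a suitable model is therefore crucial.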
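The idea of picking one function out of a function set can be illustrated with a toy sketch. The data points and the candidate slopes below are made up for illustration and have nothing to do with the detection model used later; the "model" here is simply the family f(x) = w * x.

```python
# Toy illustration: "learning" as picking the best function from a function set.
# The data and candidate slopes are hypothetical, chosen only for this example.

def mean_squared_error(w, samples):
    """Average squared error of f(x) = w * x on a list of (x, y) samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

# Made-up training and test data that roughly follow y = 2x.
train_data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
test_data = [(4.0, 8.1), (5.0, 9.8)]

# The function set: f_w(x) = w * x for each candidate slope w.
candidate_slopes = [0.0, 1.0, 2.0, 3.0]

# Training: choose the function that best matches the training data.
best_w = min(candidate_slopes, key=lambda w: mean_squared_error(w, train_data))

# Evaluation: check that the chosen function also does well on unseen data.
test_error = mean_squared_error(best_w, test_data)
print(best_w, test_error)
```

A real detector searches a vastly larger function set with gradient-based optimization, but the train-then-evaluate structure is the same.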
On the TensorFlow detection model zoo page, we can find models that the TensorFlow project maintainers have pre-trained on datasets such as COCO, Kitti, and Open Images.

COCO-trained models

Model name	Speed (ms)	COCO mAP	Outputs
ssd_mobilenet_v1_coco	30	21	Boxes
ssd_inception_v2_coco	42	24	Boxes
faster_rcnn_inception_v2_coco	58	28	Boxes
faster_rcnn_resnet50_coco	89	30	Boxes
faster_rcnn_resnet50_lowproposals_coco	64		Boxes
rfcn_resnet101_coco	92	30	Boxes
faster_rcnn_resnet101_coco	106	32	Boxes
faster_rcnn_resnet101_lowproposals_coco	82		Boxes
faster_rcnn_inception_resnet_v2_atrous_coco	620	37	Boxes
faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco	241		Boxes
faster_rcnn_nas	1833	43	Boxes
faster_rcnn_nas_lowproposals_coco	540		Boxes
mask_rcnn_inception_resnet_v2_atrous_coco	771	36	Masks
mask_rcnn_inception_v2_coco	79	25	Masks
mask_rcnn_resnet101_atrous_coco	470	33	Masks
mask_rcnn_resnet50_atrous_coco	343	29	Masks

Since we are after real-time detection speed, we pick the fastest model in the table, ssd_mobilenet_v1_coco.
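The selection above can be sketched in a few lines. The rows are copied from the table; the helper names and the latency budget are assumptions made for illustration only.

```python
# (name, speed in ms, COCO mAP) for a few rows copied from the table above.
MODELS = [
    ("ssd_mobilenet_v1_coco", 30, 21),
    ("ssd_inception_v2_coco", 42, 24),
    ("faster_rcnn_inception_v2_coco", 58, 28),
    ("faster_rcnn_resnet101_coco", 106, 32),
    ("faster_rcnn_nas", 1833, 43),
]

def fastest_model(models):
    """Return the row with the lowest reported speed in milliseconds."""
    return min(models, key=lambda row: row[1])

def most_accurate_within(models, budget_ms):
    """Hypothetical helper: highest-mAP row whose speed fits the budget, or None."""
    eligible = [row for row in models if row[1] <= budget_ms]
    return max(eligible, key=lambda row: row[2]) if eligible else None

print(fastest_model(MODELS)[0])
print(most_accurate_within(MODELS, budget_ms=60)[0])
```

The second helper shows the trade-off the table encodes: with a looser latency budget, a slower model with a higher mAP becomes a reasonable choice.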

Writing the training configuration file

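Use wget or another downloader to fetch the matching config file. If you use wget, download the raw version of the file rather than the GitHub HTML page. The ssd_mobilenet_v1_coco config used here is located at:
https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config
Open ssd_mobilenet_v1_coco.config. After the edits described below, it looks like this: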

# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "training/object-detection.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}

Modify line 9:

num_classes: 1

Modify line 175:

input_path: "data/train.record"

Modify lines 177 and 191:

label_map_path: "training/object-detection.pbtxt"

Modify line 189:

input_path: "data/test.record"

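Line 9 sets the number of object classes to train; since we train only one object here, the value is 1. Line 175 sets the location of the training TFRecord file. Lines 177 and 191 set the location of the label map, which we create below. Line 189 sets the location of the test data. Adjust these settings as needed.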
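A quick way to double-check these edits is to pull the relevant fields back out of the config text. The sketch below uses plain regular expressions on a shortened stand-in for the config; it is not the Object Detection API's own parser, and the function name is an assumption.

```python
import re

# A shortened stand-in for ssd_mobilenet_v1_coco.config (only the edited fields).
SAMPLE_CONFIG = '''
# Search for "PATH_TO_BE_CONFIGURED" to find the fields that should be configured.
model {
  ssd {
    num_classes: 1
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "training/object-detection.pbtxt"
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "training/object-detection.pbtxt"
}
'''

def summarize_config(text):
    """Collect the fields this tutorial edits from pipeline config text."""
    # Drop comment lines so the header comment does not count as a leftover placeholder.
    body = "\n".join(
        line for line in text.splitlines() if not line.strip().startswith("#")
    )
    return {
        "num_classes": int(re.search(r"num_classes:\s*(\d+)", body).group(1)),
        "input_paths": re.findall(r'input_path:\s*"([^"]+)"', body),
        "label_map_paths": re.findall(r'label_map_path:\s*"([^"]+)"', body),
        # Placeholders left over from the stock config mean a path was missed.
        "unconfigured": body.count("PATH_TO_BE_CONFIGURED"),
    }

summary = summarize_config(SAMPLE_CONFIG)
print(summary)
```

To check a real file, read it with open(...).read() and pass the text to summarize_config; a non-zero "unconfigured" count suggests a path still needs editing.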
Go to the training directory, create a new blank file named object-detection.pbtxt, and open it. Enter the following content and save.

item {
  id: 1
  name: 'pen'
}
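For more than one class, the same file can be generated rather than typed by hand. This is a minimal sketch that only formats text in the layout shown above; the function name is hypothetical, and ids start at 1 as in the example.

```python
def build_label_map(class_names):
    """Return label map text with one item per class, ids starting at 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (index, name))
    return "\n".join(items)

label_map_text = build_label_map(["pen"])
print(label_map_text)

# To save it next to the config (path taken from the tutorial):
# with open("training/object-detection.pbtxt", "w") as handle:
#     handle.write(label_map_text)
```

If you add classes this way, remember that num_classes in the config has to match the number of items.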

Start training

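Copy the data folder, the images folder, the ssd_mobilenet_v1_coco_2017_11_17 folder, the training folder, and the ssd_mobilenet_v1_coco.config file from your project directory into models/research/object_detection, and choose to merge when prompted.
Open a command line and go to models/research. Add the Object Detection libraries to the Python search path (PYTHONPATH).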

# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Note: export only takes effect for the current login session, so repeat it in every new terminal.
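For readers less familiar with the shell, here is roughly what that export line computes, written as a Python sketch. The function name and the example directory are assumptions; unlike the shell version, empty entries are dropped.

```python
import os

def extend_pythonpath(current_value, research_dir):
    """Append research_dir and research_dir/slim to a PYTHONPATH-style string."""
    entries = [entry for entry in current_value.split(os.pathsep) if entry]
    entries += [research_dir, os.path.join(research_dir, "slim")]
    return os.pathsep.join(entries)

# Hypothetical location of the checkout; adjust to your own path.
research_dir = os.path.join("tensorflow", "models", "research")
new_value = extend_pythonpath(os.environ.get("PYTHONPATH", ""), research_dir)
print(new_value)
```

As with export, assigning the result to os.environ["PYTHONPATH"] would only affect the current process and the processes it starts.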
In the console, go to models/research/object_detection and start the training program with the following command:

python3 train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config

When your terminal starts printing output like the following, the training program is running normally.

INFO:tensorflow:Restoring parameters from training/model.ckpt-1
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Recording summary at step 1.
INFO:tensorflow:global step 2: loss = 2.2202 (2.397 sec/step)
INFO:tensorflow:global step 3: loss = 2.5926 (1.749 sec/step)
INFO:tensorflow:global step 4: loss = 1.7984 (1.980 sec/step)
INFO:tensorflow:global step 5: loss = 1.5214 (1.734 sec/step)
INFO:tensorflow:global step 6: loss = 1.7882 (1.147 sec/step)
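If you want the loss values as numbers rather than reading them off the terminal, the log lines can be parsed. The sketch below is an assumption-laden helper, not part of the training script: it matches the line format shown in the sample output above and nothing else.

```python
import re

# A few lines copied from the sample output above.
LOG = """INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Recording summary at step 1.
INFO:tensorflow:global step 2: loss = 2.2202 (2.397 sec/step)
INFO:tensorflow:global step 3: loss = 2.5926 (1.749 sec/step)
INFO:tensorflow:global step 4: loss = 1.7984 (1.980 sec/step)
INFO:tensorflow:global step 5: loss = 1.5214 (1.734 sec/step)
INFO:tensorflow:global step 6: loss = 1.7882 (1.147 sec/step)
"""

STEP_PATTERN = re.compile(
    r"global step (\d+): loss = ([\d.]+) \(([\d.]+) sec/step\)"
)

def parse_training_log(text):
    """Return a list of (step, loss, seconds_per_step) tuples found in the log."""
    return [
        (int(step), float(loss), float(seconds))
        for step, loss, seconds in STEP_PATTERN.findall(text)
    ]

records = parse_training_log(LOG)
average_seconds = sum(record[2] for record in records) / len(records)
print(records[-1], round(average_seconds, 3))
```

The average seconds per step gives a rough idea of how long a given number of steps will take on your machine.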

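Visualizing training with TensorBoard

TensorBoard lets you visualize the training process.
The computations involved in training a large deep neural network are often complex and hard to follow.
To make TensorFlow programs easier to understand, debug, and optimize, the TensorFlow team at Google released a suite of visualization tools called TensorBoard. You can use TensorBoard to display your TensorFlow graph, plot quantitative metrics about the execution of the graph, and show additional data.
The TensorBoard interface looks like this:

Figure: The TensorBoard interface

While the TensorFlow program is running, event files appear in your training folder. Open a new terminal and start TensorBoard with the following command.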

jack@jack~/tensorflowProject/object_detection/models/object_detection$ tensorboard --logdir='training'

Once it is running, the following message appears:

TensorBoard 1.7.0a20180302 at http://127.0.0.1:6006 (Press CTRL+C to quit)

Open http://127.0.0.1:6006 in a browser and the TensorBoard interface appears.
Once the TotalLoss shown in TensorBoard is roughly below 1 and has stabilized, you can stop training.
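The "below 1 and stable" rule of thumb can be made concrete. The sketch below is one possible reading of it, with a window size and tolerance that are assumptions rather than values from this tutorial: the last few loss values must all be under the threshold and close to each other.

```python
def loss_has_converged(losses, threshold=1.0, window=5, tolerance=0.2):
    """True when the last `window` losses are all below `threshold`
    and differ from each other by at most `tolerance`."""
    if len(losses) < window:
        return False
    recent = losses[-window:]
    return max(recent) < threshold and (max(recent) - min(recent)) <= tolerance

# Hypothetical loss histories for illustration.
early = [2.22, 2.59, 1.80, 1.52, 1.79]
late = [1.50, 0.90, 0.85, 0.80, 0.82, 0.80]
print(loss_has_converged(early), loss_has_converged(late))
```

In practice the loss curve is noisy, so looking at the smoothed curve in TensorBoard remains the more reliable check.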
Once you stop training, your training directory will contain files like the following.

Figure: Model checkpoints produced by training

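The training program saved model files at steps 375, 772, 970, and so on. For the step you intend to export, make sure the files with the .index, .meta, and .data-00000-of-00001 suffixes all exist.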
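That check can be automated. The following sketch works on a plain list of file names (for example the result of os.listdir("training")) and reports the steps for which all three files are present; the function name and the sample file list are assumptions.

```python
import re

REQUIRED_SUFFIXES = (".index", ".meta", ".data-00000-of-00001")

def complete_checkpoint_steps(filenames):
    """Return the sorted step numbers whose three checkpoint files all exist."""
    names = set(filenames)
    steps = set()
    for name in names:
        match = re.match(r"model\.ckpt-(\d+)\.index$", name)
        if match:
            steps.add(int(match.group(1)))
    return sorted(
        step for step in steps
        if all("model.ckpt-%d%s" % (step, suffix) in names
               for suffix in REQUIRED_SUFFIXES)
    )

# Hypothetical directory listing: step 772 is missing its .meta file.
listing = [
    "checkpoint",
    "model.ckpt-375.index",
    "model.ckpt-375.meta",
    "model.ckpt-375.data-00000-of-00001",
    "model.ckpt-772.index",
    "model.ckpt-772.data-00000-of-00001",
]
print(complete_checkpoint_steps(listing))
```

The last step in the returned list is usually the one to export, since it reflects the most training.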
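Go back to the parent directory of training, open a terminal, and enter the following command, replacing 1167 with the step number of one of your own saved checkpoints.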

python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path training/ssd_mobilenet_v1_coco.config \
    --trained_checkpoint_prefix training/model.ckpt-1167 \
    --output_directory pen_graph

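If the program finishes without errors, the graph has been exported to the pen_graph directory.

Figure: The exported trained graph

Testing the trained model

Go back to the object_detection directory and open object_detection_tutorial.ipynb with Jupyter Notebook.
Modify its line 60: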

# What model to download.
MODEL_NAME = 'pen_graph'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('training', 'object-detection.pbtxt')

NUM_CLASSES = 1

Modify its line 64:

# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(3, 8) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

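Find the test_images folder inside the object_detection folder. Rename your pictures containing pens following the imageN.jpg pattern so that they cover the indices used above; range(3, 8) yields 3 through 7, so the notebook reads image3.jpg to image7.jpg. Go back to Jupyter Notebook, click Cell -> Run All, and scroll to the bottom of the page.
The bottom of the page shows object detection run on those images from the test_images folder, with the pens marked.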
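Because range excludes its upper bound, it is easy to be off by one when listing test images. A small sketch with an inclusive helper (the helper name is hypothetical, the directory and file pattern are the notebook's) makes the covered files explicit:

```python
import os

PATH_TO_TEST_IMAGES_DIR = 'test_images'

def build_image_paths(first, last):
    """Paths for image{first}.jpg through image{last}.jpg, both ends included."""
    return [
        os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i))
        for i in range(first, last + 1)
    ]

# Equivalent to the notebook's range(3, 8): image3.jpg through image7.jpg.
TEST_IMAGE_PATHS = build_image_paths(3, 7)
print([os.path.basename(path) for path in TEST_IMAGE_PATHS])
```

To include image8.jpg as well, call build_image_paths(3, 8), or change the notebook's range to range(3, 9).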

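Figure: Testing the trained model

At this point we have successfully trained the model. You can adapt the code from the previous article to run object detection with a camera.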


Last edited: 2018.05.28 16:31:12
