This article is the property of Shen Qingyang (沈慶陽). Please contact the author before reposting!
Before going any further, let's take a look at the GitHub page of the Object Detection API in the Google TensorFlow Models repository.
Its contents cover a quick start, environment setup, running the Object Detection API, and some extra material. Reading through it helps enormously in understanding and using TensorFlow's Object Detection API.
More generally, applying a machine learning algorithm is in essence a search for a target function. We build a machine learning model (which defines a set of candidate functions) and, driven by the training data, look for a function that fits the training data and also performs well on the test data. Choosing an appropriate machine learning model is therefore crucial.
On the TensorFlow detection model zoo page, we can find models that the TensorFlow project maintainers have pre-trained on datasets such as COCO, Kitti, and Open Images.
COCO-trained models
Model name | Speed (ms) | COCO mAP | Outputs |
---|---|---|---|
ssd_mobilenet_v1_coco | 30 | 21 | Boxes |
ssd_inception_v2_coco | 42 | 24 | Boxes |
faster_rcnn_inception_v2_coco | 58 | 28 | Boxes |
faster_rcnn_resnet50_coco | 89 | 30 | Boxes |
faster_rcnn_resnet50_lowproposals_coco | 64 | | Boxes |
rfcn_resnet101_coco | 92 | 30 | Boxes |
faster_rcnn_resnet101_coco | 106 | 32 | Boxes |
faster_rcnn_resnet101_lowproposals_coco | 82 | | Boxes |
faster_rcnn_inception_resnet_v2_atrous_coco | 620 | 37 | Boxes |
faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco | 241 | | Boxes |
faster_rcnn_nas | 1833 | 43 | Boxes |
faster_rcnn_nas_lowproposals_coco | 540 | | Boxes |
mask_rcnn_inception_resnet_v2_atrous_coco | 771 | 36 | Masks |
mask_rcnn_inception_v2_coco | 79 | 25 | Masks |
mask_rcnn_resnet101_atrous_coco | 470 | 33 | Masks |
mask_rcnn_resnet50_atrous_coco | 343 | 29 | Masks |
Since we are after real-time detection speed, we pick the fastest model here: ssd_mobilenet_v1_coco.
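The training configuration below fine-tunes from the ssd_mobilenet_v1_coco_2017_11_17 checkpoint, so the pre-trained model itself is needed as well. If you have not downloaded it yet, it can be fetched and unpacked like this (the URL follows the model zoo's naming pattern; check the zoo page if the archive has moved):
wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_2017_11_17.tar.gz
tar -xzf ssd_mobilenet_v1_coco_2017_11_17.tar.gz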
Writing the Training Configuration File
Download the corresponding config file with the wget command or another downloader. The ssd_mobilenet_v1_coco config file used here lives at:
https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config
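For example, with wget (note the raw.githubusercontent.com host, which serves the file contents rather than the GitHub HTML page):
wget https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config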
Open the ssd_mobilenet_v1_coco.config file:
# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
model {
ssd {
num_classes: 1
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
}
}
similarity_calculator {
iou_similarity {
}
}
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 1
box_code_size: 4
apply_sigmoid_to_scores: false
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
}
feature_extractor {
type: 'ssd_mobilenet_v1'
min_depth: 16
depth_multiplier: 1.0
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
loss {
classification_loss {
weighted_sigmoid {
}
}
localization_loss {
weighted_smooth_l1 {
}
}
hard_example_miner {
num_hard_examples: 3000
iou_threshold: 0.99
loss_type: CLASSIFICATION
max_negatives_per_positive: 3
min_negatives_per_image: 0
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
batch_size: 24
optimizer {
rms_prop_optimizer: {
learning_rate: {
exponential_decay_learning_rate {
initial_learning_rate: 0.004
decay_steps: 800720
decay_factor: 0.95
}
}
momentum_optimizer_value: 0.9
decay: 0.9
epsilon: 1.0
}
}
fine_tune_checkpoint: "ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"
from_detection_checkpoint: true
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient enough to train the pets dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
num_steps: 200000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
}
train_input_reader: {
tf_record_input_reader {
input_path: "data/train.record"
}
label_map_path: "training/object-detection.pbtxt"
}
eval_config: {
num_examples: 8000
# Note: The below line limits the evaluation process to 10 evaluations.
# Remove the below line to evaluate indefinitely.
max_evals: 10
}
eval_input_reader: {
tf_record_input_reader {
input_path: "data/test.record"
}
label_map_path: "training/object-detection.pbtxt"
shuffle: false
num_readers: 1
}
Modify line 9:
num_classes: 1
Modify line 175:
input_path: "data/train.record"
Modify lines 177 and 191:
label_map_path: "training/object-detection.pbtxt"
Modify line 189:
input_path: "data/test.record"
Line 9 specifies the number of object classes we are training; since we are training only one class here, the value is 1. Line 175 sets the location of the train record file used as training input. Lines 177 and 191 point to the label map, which we will create below. Line 189 sets the location of the test data. Adjust these settings to match your own paths.
Now go to the training directory, create a blank file named object-detection.pbtxt, open it, enter the following content, and save it.
item {
id: 1
name: 'pen'
}
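Before moving on, you can sanity-check the label map by loading it with the utilities bundled with the Object Detection API. A minimal sketch (run it from models/research once the PYTHONPATH is set up as shown below):
from object_detection.utils import label_map_util

# Parse the label map and convert it into the category list that the
# detection notebook uses later on.
label_map = label_map_util.load_labelmap('training/object-detection.pbtxt')
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=1, use_display_name=True)
print(categories)  # expected: [{'id': 1, 'name': 'pen'}]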
Starting Training
Copy the data folder, the images folder, the ssd_mobilenet_v1_coco_2017_11_17 folder, the training folder, and the ssd_mobilenet_v1_coco.config file from the project directory into models/research/object_detection, choosing to merge when prompted.
Open a terminal and change into the models/research directory. Add the Object Detection libraries to the Python path.
# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
Note: the export only lasts for the current login session.
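A quick way to confirm the path took effect is to import the package from the same shell:
python3 -c "import object_detection; print('object_detection found')"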
In the terminal, change into the models/research/object_detection directory and run the training program with the following command:
python3 train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config
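Note: in more recent checkouts of the models repository, train.py has been moved into the object_detection/legacy/ subdirectory. If the command above cannot find train.py, try the same flags against the legacy path:
python3 legacy/train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config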
When your terminal starts printing output like the following, the training program is running correctly:
INFO:tensorflow:Restoring parameters from training/model.ckpt-1
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Recording summary at step 1.
INFO:tensorflow:global step 2: loss = 2.2202 (2.397 sec/step)
INFO:tensorflow:global step 3: loss = 2.5926 (1.749 sec/step)
INFO:tensorflow:global step 4: loss = 1.7984 (1.980 sec/step)
INFO:tensorflow:global step 5: loss = 1.5214 (1.734 sec/step)
INFO:tensorflow:global step 6: loss = 1.7882 (1.147 sec/step)
Visualizing Training with TensorBoard
TensorBoard lets you visualize the training process.
The computations TensorBoard deals with, typically those that arise when training large deep neural networks, are complex and hard to follow.
To make TensorFlow programs easier to understand, debug, and optimize, Google's TensorFlow team released a visualization toolkit called TensorBoard. You can use TensorBoard to display your TensorFlow graph, plot quantitative metrics generated during execution, and show additional data.
(Screenshot: the TensorBoard interface.)
While the TensorFlow program is running, event files appear in your training folder. Open a new terminal and start TensorBoard with the following command.
jack@jack~/tensorflowProject/object_detection/models/object_detection$ tensorboard --logdir='training'
Once it is running, you will see a prompt like this:
TensorBoard 1.7.0a20180302 at http://127.0.0.1:6006 (Press CTRL+C to quit)
Open http://127.0.0.1:6006 in a browser and the TensorBoard interface will come up.
Once the TotalLoss curve in your TensorBoard drops below roughly 1 and stays stable, you can stop training.
At this point your training directory will contain files like the following.
The training program saved model checkpoints at steps such as 375, 772, and 970. Make sure the checkpoint step you pick has files with the .index, .meta, and .data-00000-of-00001 suffixes.
Go back to the directory above training, open a terminal, and enter the following command.
python3 export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path training/ssd_mobilenet_v1_coco.config \
--trained_checkpoint_prefix training/model.ckpt-1167 \
--output_directory pen_graph
If the program finishes without errors, the graph has been exported to the pen_graph directory.
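A quick way to confirm the export is usable is to parse the frozen graph with the standard TF1 API; a minimal sketch:
import tensorflow as tf

# Read the serialized GraphDef written by export_inference_graph.py
# and import it into the default graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile('pen_graph/frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')
print('Loaded frozen graph with %d nodes' % len(graph_def.node))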
Testing the Trained Model
Back in the object_detection directory, open the object_detection_tutorial.ipynb file with Jupyter Notebook.
Modify its line 60:
# What model to download.
MODEL_NAME = 'pen_graph'
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('training', 'object-detection.pbtxt')
NUM_CLASSES = 1
Modify its line 64:
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(3, 8) ]
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
Find the test_images folder under the object_detection folder and rename your photos containing pens to follow the image{N}.jpg pattern; note that the range(3, 8) above picks up image3.jpg through image7.jpg. Back in the Jupyter Notebook, click Cell->Run All and scroll to the bottom of the page.
At the bottom of the page, object detection will have run on those images from the test_images folder, with the pens in them marked.
At this point, we have successfully trained our model. Following the previous article, you can modify the code to run the detector on a camera feed, as sketched below.
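For reference, a minimal sketch of such a camera loop, assuming OpenCV is installed and relying on the standard tensor names that exported Object Detection graphs expose ('image_tensor:0', 'detection_boxes:0', and so on):
import cv2
import numpy as np
import tensorflow as tf

# Load the exported frozen graph.
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('pen_graph/frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

cap = cv2.VideoCapture(0)  # default camera
with detection_graph.as_default(), tf.Session() as sess:
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    scores = detection_graph.get_tensor_by_name('detection_scores:0')
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        # The model expects a batch of RGB images; OpenCV delivers BGR.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        box_vals, score_vals = sess.run(
            [boxes, scores], feed_dict={image_tensor: np.expand_dims(rgb, 0)})
        # Draw the highest-scoring box if it clears a threshold.
        if score_vals[0][0] > 0.5:
            h, w = frame.shape[:2]
            ymin, xmin, ymax, xmax = box_vals[0][0]
            cv2.rectangle(frame, (int(xmin * w), int(ymin * h)),
                          (int(xmax * w), int(ymax * h)), (0, 255, 0), 2)
        cv2.imshow('detection', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()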
If you found this article helpful, please give it a like ~
Thanks for your support!