3: Train with customized models and standard datasets
In this note, you will learn how to train, test, and run inference with your own customized models on standard datasets.
As an example, we use the cityscapes dataset to train a customized Cascade Mask R-CNN R50 model and demonstrate the whole process. The model uses AugFPN to replace the default FPN as the neck, and adds Rotate or Translate as training-time auto augmentation.
The basic steps are as below:
1. Prepare the standard dataset
2. Prepare your own customized model
3. Prepare a config
4. Train, test, and run inference with models on the standard dataset

Prepare the standard dataset
In this note, we use the standard cityscapes dataset as an example.
It is recommended to symlink the dataset root to $MMDETECTION/data. If your folder structure is different, you may need to change the corresponding paths in the config files.
>mmdetection
├── mmdet
├── tools
├── configs
├── data
│ ├── coco
│ │ ├── annotations
│ │ ├── train2017
│ │ ├── val2017
│ │ ├── test2017
│ ├── cityscapes
│ │ ├── annotations
│ │ ├── leftImg8bit
│ │ │ ├── train
│ │ │ ├── val
│ │ ├── gtFine
│ │ │ ├── train
│ │ │ ├── val
│ ├── VOCdevkit
│ │ ├── VOC2007
│ │ ├── VOC2012
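The symlink recommended above can also be created from Python. The following is a minimal sketch, assuming the dataset was downloaded to some other location on a system that supports symlinks; link_dataset is a hypothetical helper and both paths in the usage comment are placeholders, not part of MMDetection.

```python
from pathlib import Path


def link_dataset(dataset_root, mmdet_root, name="cityscapes"):
    """Symlink `dataset_root` to `<mmdet_root>/data/<name>` and return the link.

    `link_dataset` is a hypothetical helper for illustration only.
    """
    data_dir = Path(mmdet_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    link = data_dir / name
    if link.is_symlink() or link.exists():
        # Leave an existing link or directory untouched.
        return link
    link.symlink_to(Path(dataset_root).resolve(), target_is_directory=True)
    return link


# Example (placeholder paths):
# link_dataset("/path/to/cityscapes", "/path/to/mmdetection")
```

Because an existing link or directory is left alone, calling the helper a second time does nothing.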
The cityscapes annotations have to be converted into the COCO format using tools/dataset_converters/cityscapes.py:
>pip install cityscapesscripts
python tools/dataset_converters/cityscapes.py ./data/cityscapes --nproc 8 --out-dir ./data/cityscapes/annotations
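After the conversion, it can be useful to check that the generated files follow the COCO layout. The sketch below counts images and instances per category in a COCO-style dictionary. summarize_coco and summarize_coco_file are hypothetical helpers, and the file name in the usage comment is an assumption about what the converter writes, so adjust it to whatever appears in ./data/cityscapes/annotations.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Return (num_images, {category_name: num_instances}) for a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    # Include categories that have no instances so that gaps are visible.
    per_category = {name: counts.get(name, 0) for name in id_to_name.values()}
    return len(coco["images"]), per_category


def summarize_coco_file(path):
    """Load a COCO-style JSON file and summarize it."""
    with open(path) as f:
        return summarize_coco(json.load(f))


# Example (assumed file name):
# summarize_coco_file(
#     "data/cityscapes/annotations/instancesonly_filtered_gtFine_train.json")
```

For the instance annotations used in this note, one would expect eight categories, matching the num_classes=8 setting in the config.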
Currently, the config files in cityscapes use COCO pre-trained weights for initialization. If the network is unavailable or slow, download the pre-trained models in advance; otherwise, errors will occur at the beginning of training.
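One way to fetch a checkpoint ahead of time is a small download helper. This is only a sketch: local_checkpoint_path, fetch_checkpoint, and the checkpoints directory name are assumptions made for illustration. The URL in the usage comment is the one that the config in this note assigns to load_from.

```python
import os
import urllib.request


def local_checkpoint_path(url, dest_dir="checkpoints"):
    """Map a checkpoint URL to a local file path inside `dest_dir`."""
    return os.path.join(dest_dir, os.path.basename(url))


def fetch_checkpoint(url, dest_dir="checkpoints"):
    """Download `url` into `dest_dir` unless the file is already there."""
    path = local_checkpoint_path(url, dest_dir)
    if not os.path.exists(path):
        os.makedirs(dest_dir, exist_ok=True)
        urllib.request.urlretrieve(url, path)
    return path


# Example:
# fetch_checkpoint(
#     "http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/"
#     "cascade_mask_rcnn_r50_fpn_1x_coco/"
#     "cascade_mask_rcnn_r50_fpn_1x_coco_20200203-9d4dcb24.pth")
```

The returned local path can then be used as the value of load_from in the config instead of the URL.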
Prepare your own customized model
The second step is to use your own module or training setting. Assume that we want to implement a new neck called AugFPN to replace the default FPN in the existing detector Cascade Mask R-CNN R50. The following shows how to implement AugFPN in MMDetection.
1. Define a new neck (e.g. AugFPN)
First, create a new file mmdet/models/necks/augfpn.py.
>import torch.nn as nn

from ..builder import NECKS


@NECKS.register_module()
class AugFPN(nn.Module):

    def __init__(self,
                 in_channels,
                 out_channels,
                 num_outs,
                 start_level=0,
                 end_level=-1,
                 add_extra_convs=False):
        pass

    def forward(self, inputs):
        # implementation is omitted
        pass
2. Import the module
You can either add the following line to mmdet/models/necks/__init__.py,
>from .augfpn import AugFPN
or alternatively add
>custom_imports = dict(
    imports=['mmdet.models.necks.augfpn'],
    allow_failed_imports=False)
to the config file and avoid modifying the original code.
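The entries of imports are dotted module names rather than file paths. The sketch below assumes that custom_imports is resolved with Python's importlib, and uses a standard-library module as a stand-in so that it runs without MMDetection installed. try_import is a hypothetical helper.

```python
import importlib


def try_import(module_name):
    """Return True if `module_name` can be imported as a dotted module path."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# A dotted module path resolves ...
print(try_import("json.decoder"))
# ... while the same path with a file extension appended does not.
print(try_import("json.decoder.py"))
```

Under that assumption, a name ending in .py would be treated as a submodule called py and would fail to import.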
3. Modify the config file
>neck=dict(
    type='AugFPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5)
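To see why registering the class is enough for the type='AugFPN' entry to work, here is a toy registry. It is a simplified stand-in written for this note, not the actual registry implementation used by MMDetection, and the AugFPN class in it is a plain placeholder without any network layers.

```python
class Registry:
    """A toy registry mapping class names to classes (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is not modified
        cls = self._modules[args.pop("type")]
        return cls(**args)


NECKS = Registry("neck")


@NECKS.register_module()
class AugFPN:
    """Placeholder class that only stores its constructor arguments."""

    def __init__(self, in_channels, out_channels, num_outs):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.num_outs = num_outs


neck = NECKS.build(
    dict(type="AugFPN", in_channels=[256, 512, 1024, 2048],
         out_channels=256, num_outs=5))
print(type(neck).__name__, neck.num_outs)
```

The type key selects the class and the remaining keys become constructor arguments. This is why the module has to be imported (step 2) before the config is built: the decorator only runs at import time.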
For more details on customizing your own models (e.g. implementing a new backbone, head, loss, etc.) and runtime training settings (e.g. defining a new optimizer, using gradient clipping, customizing training schedules and hooks, etc.), please refer to the guidelines Customize Models and Customize Runtime Settings, respectively.
Prepare a config
The third step is to prepare a config for your own training setting. Assume that we want to add AugFPN and the Rotate or Translate augmentation to the existing Cascade Mask R-CNN R50 to train on the cityscapes dataset, and that the config is under the directory configs/cityscapes/ and named cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py. The config is as below.
># The new config inherits the base configs to highlight the necessary modification
_base_ = [
    '../_base_/models/cascade_mask_rcnn_r50_fpn.py',
    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py'
]

model = dict(
    # set None to avoid loading the ImageNet pretrained backbone;
    # instead, we set `load_from` to load from a COCO pretrained detector.
    pretrained=None,
    # replace the default `FPN` neck with our newly implemented `AugFPN` module
    neck=dict(
        type='AugFPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    # We also need to change the num_classes in the heads from 80 to 8, to match the
    # cityscapes dataset's annotation. This modification involves `bbox_head` and `mask_head`.
    roi_head=dict(
        bbox_head=[
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # change the number of classes from the COCO default to cityscapes
                num_classes=8,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # change the number of classes from the COCO default to cityscapes
                num_classes=8,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # change the number of classes from the COCO default to cityscapes
                num_classes=8,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.033, 0.033, 0.067, 0.067]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ],
        mask_head=dict(
            type='FCNMaskHead',
            num_convs=4,
            in_channels=256,
            conv_out_channels=256,
            # change the number of classes from the COCO default to cityscapes
            num_classes=8,
            loss_mask=dict(
                type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))

# overwrite `train_pipeline` for the newly added `AutoAugment` training setting
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='AutoAugment',
        policies=[
            [dict(
                 type='Rotate',
                 level=5,
                 img_fill_val=(124, 116, 104),
                 prob=0.5,
                 scale=1)
            ],
            [dict(type='Rotate', level=7, img_fill_val=(124, 116, 104)),
             dict(
                 type='Translate',
                 level=5,
                 prob=0.5,
                 img_fill_val=(124, 116, 104))
            ],
        ]),
    dict(
        type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]

# set the batch size per GPU, and set the new training pipeline
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=3,
    # overwrite `pipeline` with the new training pipeline setting
    train=dict(dataset=dict(pipeline=train_pipeline)))

# Set optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)

# Set customized learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8])
total_epochs = 10

# We can use the COCO pretrained Cascade Mask R-CNN R50 model for a more stable initialization
load_from = 'http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco/cascade_mask_rcnn_r50_fpn_1x_coco_20200203-9d4dcb24.pth'
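The config above only lists what differs from the files in _base_. The sketch below illustrates that idea with a toy merge function. It assumes that nested dicts are merged key by key while lists are replaced as a whole, which would explain why the bbox_head list is spelled out in full whereas mask_head only needs the changed keys. merge_config and the shortened base dictionary are hypothetical and are not the real config loader.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`.

    Dicts are merged key by key; any other value (including lists)
    replaces the base value entirely.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Shortened stand-in for the inherited base model config.
base = dict(
    neck=dict(type="FPN", in_channels=[256, 512, 1024, 2048],
              out_channels=256, num_outs=5),
    roi_head=dict(
        bbox_head=[dict(type="Shared2FCBBoxHead", num_classes=80)
                   for _ in range(3)],
        mask_head=dict(type="FCNMaskHead", num_convs=4, num_classes=80)))

override = dict(
    neck=dict(type="AugFPN"),
    roi_head=dict(
        bbox_head=[dict(num_classes=8) for _ in range(3)],
        mask_head=dict(num_classes=8)))

merged = merge_config(base, override)
print(merged["neck"])                   # dict merged: only `type` changes
print(merged["roi_head"]["mask_head"])  # dict merged: `num_convs` is kept
print(merged["roi_head"]["bbox_head"])  # list replaced: `type` is lost
```

In this toy version, the partial bbox_head list loses the type key of every entry, which is the situation that the full list in the config avoids.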
Train a new model
To train a model with the new config, you can simply run
>python tools/train.py configs/cityscapes/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py
For more detailed usage, please refer to Case 1.
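During such a training run, the learning rate follows the lr_config from the config: a linear warmup over the first 500 iterations, then a step decay at epoch 8. The sketch below reproduces that schedule under a few assumptions: a decay factor of 0.1 (the config does not set one), 0-based epoch counting, and a warmup that interpolates from warmup_ratio times the learning rate up to the full value. learning_rate is a hypothetical helper.

```python
def learning_rate(epoch, iteration, base_lr=0.01, steps=(8,), gamma=0.1,
                  warmup_iters=500, warmup_ratio=0.001):
    """Learning rate for a 0-based `epoch` and global `iteration`."""
    # Step decay: multiply by gamma once for every milestone already reached.
    num_decays = sum(1 for s in steps if epoch >= s)
    regular_lr = base_lr * gamma ** num_decays
    if iteration < warmup_iters:
        # Linear warmup from warmup_ratio * lr up to the regular lr.
        k = (1 - iteration / warmup_iters) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    return regular_lr


for epoch, iteration in [(0, 0), (0, 250), (0, 500), (8, 10000)]:
    print(epoch, iteration, learning_rate(epoch, iteration))
```

With these assumptions, training starts at roughly 1e-5, reaches 0.01 after the warmup, and drops to 0.001 for the last two of the ten epochs.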
Test and inference
To test the trained model, you can simply run
>python tools/test.py configs/cityscapes/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py work_dirs/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes/latest.pth --eval bbox segm
For more detailed usage, please refer to Case 1.
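The checkpoint path in the test command depends on where training wrote its outputs. The sketch below assumes that no explicit work directory was passed to the training script, so the directory under work_dirs is named after the config file without its .py extension. default_work_dir is a hypothetical helper that mirrors this naming.

```python
import os.path as osp


def default_work_dir(config_path, root="./work_dirs"):
    """Derive a work directory from the config file name, without its extension."""
    return osp.join(root, osp.splitext(osp.basename(config_path))[0])


config = "configs/cityscapes/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py"
checkpoint = osp.join(default_work_dir(config), "latest.pth")
print(checkpoint)
```

If a different work directory was used for training, the checkpoint path has to be adjusted accordingly.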