py-faster-rcnn---output_alt_opt

1. output_alt_opt

faster_rcnn_alt_opt.sh
train_faster_rcnn_alt_opt.py

Stage 1 RPN, init from ImageNet model

The RPN training procedure:
In train_rpn:

cfg.TRAIN.PROPOSAL_METHOD = 'gt' is set, so gt_roidb in pascal_voc.py will be called later
cfg.TRAIN.IMS_PER_BATCH = 1
get_roidb prepares the roidb and imdb
train_net trains the RPN

In get_roidb:

imdb = get_imdb(imdb_name) initializes the imdb class, going through factory.py and pascal_voc.py
When training the RPN, rpn_file=None, so only the ground-truth boxes are available
roidb = get_training_roidb(imdb) calls get_training_roidb in train.py to build the roidb

In get_training_roidb:

imdb.append_flipped_images() (in imdb.py): horizontal flipping is used as data augmentation; the roidb[i]['boxes'] produced by gt_roidb (pascal_voc.py) are flipped and the image index list is doubled
rdl_roidb.prepare_roidb(imdb) (in roidb.py) fills in roidb[i]['max_classes'], roidb[i]['max_overlaps'], roidb[i]['image'], roidb[i]['width'] and roidb[i]['height'] (a sketch follows below)
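
A minimal sketch of what prepare_roidb does (based on roi_data_layer/roidb.py; details in the real code may differ slightly):

import PIL.Image

def prepare_roidb(imdb):
    """Sketch: add image path/size to every roidb entry and derive, for each
    box, the best overlap with a ground-truth class and that class index."""
    sizes = [PIL.Image.open(imdb.image_path_at(i)).size
             for i in xrange(imdb.num_images)]
    roidb = imdb.roidb
    for i in xrange(len(imdb.image_index)):
        roidb[i]['image'] = imdb.image_path_at(i)
        roidb[i]['width'] = sizes[i][0]
        roidb[i]['height'] = sizes[i][1]
        gt_overlaps = roidb[i]['gt_overlaps'].toarray()       # sparse -> dense
        roidb[i]['max_overlaps'] = gt_overlaps.max(axis=1)    # best overlap per box
        roidb[i]['max_classes'] = gt_overlaps.argmax(axis=1)  # class achieving it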

In gt_roidb:

Parse the annotation xml files (the ground truth) under /data/VOCdevkit2007/VOC2007/Annotations to build gt_roidb
gt_roidb contains:
{'boxes' : boxes,
'gt_classes': gt_classes,
'gt_overlaps' : overlaps,
'flipped' : False,
'seg_areas' : seg_areas}

The array shapes in gt_roidb are as follows (function _load_pascal_annotation):

boxes = np.zeros((num_objs, 4), dtype=np.uint16)
gt_classes = np.zeros((num_objs), dtype=np.int32)
overlaps = np.zeros((num_objs, self.num_classes), dtype=np.float32)
# "Seg" area for pascal is just the box area
seg_areas = np.zeros((num_objs), dtype=np.float32)
# Load object bounding boxes into a data frame.
for ix, obj in enumerate(objs):
    bbox = obj.find('bndbox')
    # Make pixel indexes 0-based
    x1 = float(bbox.find('xmin').text) - 1
    y1 = float(bbox.find('ymin').text) - 1
    x2 = float(bbox.find('xmax').text) - 1
    y2 = float(bbox.find('ymax').text) - 1
    cls = self._class_to_ind[obj.find('name').text.lower().strip()]
    boxes[ix, :] = [x1, y1, x2, y2]
    gt_classes[ix] = cls
    overlaps[ix, cls] = 1.0
    seg_areas[ix] = (x2 - x1 + 1) * (y2 - y1 + 1)

After get_roidb, train_rpn calls train_net for training:
model_paths = train_net(solver, roidb, output_dir, pretrained_model=init_model, max_iters=max_iters)
In train_net (train.py):

roidb = filter_roidb(roidb) drops images that have neither a foreground nor a background RoI (overlaps: 0-0.5 background, 0.5-1 foreground); during RPN training every entry is ground truth with overlap 1, so this filter removes nothing (see the sketch after this list)
sw = SolverWrapper(solver_prototxt, roidb, output_dir, pretrained_model=pretrained_model) loads the pretrained model, solver_prototxt, and so on
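
A sketch of the filtering rule used by filter_roidb in train.py (assuming the default FG/BG thresholds from fast_rcnn.config):

import numpy as np
from fast_rcnn.config import cfg

def filter_roidb(roidb):
    """Sketch: keep an image only if it has at least one foreground RoI
    (overlap >= FG_THRESH) or at least one background RoI (overlap in
    [BG_THRESH_LO, BG_THRESH_HI))."""
    def is_valid(entry):
        overlaps = entry['max_overlaps']
        fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
        bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
                           (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
        return len(fg_inds) > 0 or len(bg_inds) > 0

    num = len(roidb)
    filtered_roidb = [entry for entry in roidb if is_valid(entry)]
    print 'Filtered {} roidb entries: {} -> {}'.format(
        num - len(filtered_roidb), num, len(filtered_roidb))
    return filtered_roidb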

stage1_rpn_train.pt

name: "ZF"
layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}

This calls the roi_data_layer.layer Python layer, which in turn calls the get_minibatch function in minibatch.py.
get_minibatch

rois_per_image: the maximum number of boxes per image, cfg.TRAIN.BATCH_SIZE / num_images = 128/1 = 128 here // not used when training with the RPN
fg_rois_per_image: 0.25 * rois_per_image, i.e. 32 foreground boxes // not used when training with the RPN
im_blob, im_scales = _get_image_blob(roidb, random_scale_inds) returns the scale factors (ratio between the network input image and the original image) and im_blob, the network input blob
if cfg.TRAIN.HAS_RPN: the RPN-training branch
gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0] the positive samples
gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
blobs['gt_boxes'] = gt_boxes has 5 columns: the first four are coordinates, the last is the class; the number of rows is the number of positive samples
blobs['im_info'] = np.array([[im_blob.shape[2], im_blob.shape[3], im_scales[0]]], dtype=np.float32); im_scale = float(target_size) / float(im_size_min) = 600/min(P,Q); (im_blob.shape[2], im_blob.shape[3]) = (M,N); min(M,N) = 600 unless the MAX_SIZE (1000) cap applies (see the sketch below)
The function returns the blobs
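
The scale factor comes from prep_im_for_blob (utils/blob.py); roughly, it works like this (a sketch, assuming cfg.TRAIN.SCALES = (600,) and cfg.TRAIN.MAX_SIZE = 1000):

import cv2
import numpy as np

def prep_im_for_blob(im, pixel_means, target_size, max_size):
    """Sketch: mean-subtract and rescale so the shorter side becomes
    target_size (600), unless that would push the longer side past
    max_size (1000)."""
    im = im.astype(np.float32, copy=False)
    im -= pixel_means
    im_size_min = np.min(im.shape[0:2])
    im_size_max = np.max(im.shape[0:2])
    im_scale = float(target_size) / float(im_size_min)    # 600 / min(P, Q)
    if np.round(im_scale * im_size_max) > max_size:        # cap the longer side
        im_scale = float(max_size) / float(im_size_max)
    im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,
                    interpolation=cv2.INTER_LINEAR)
    return im, im_scale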

layer {
  name: 'rpn-data'
  type: 'Python'
  bottom: 'rpn_cls_score'
  bottom: 'gt_boxes'
  bottom: 'im_info'
  bottom: 'data'
  top: 'rpn_labels'
  top: 'rpn_bbox_targets'
  top: 'rpn_bbox_inside_weights'
  top: 'rpn_bbox_outside_weights'
  python_param {
    module: 'rpn.anchor_target_layer'
    layer: 'AnchorTargetLayer'
    param_str: "'feat_stride': 16"
  }
}

generate_anchors(base_size=16, ratios=[0.5, 1, 2], scales=2**np.arange(3, 6)):

The sizes of the 9 anchors are defined once at network start-up; these 9 anchors belong to the first cell of the feature map

def generate_anchors(base_size=16, ratios=[0.5, 1, 2],
                     scales=2**np.arange(3, 6)):
    """
    Generate anchor (reference) windows by enumerating aspect ratios X
    scales wrt a reference (0, 0, 15, 15) window.
    """

    base_anchor = np.array([1, 1, base_size, base_size]) - 1  # [0, 0, 15, 15]
    ratio_anchors = _ratio_enum(base_anchor, ratios)
    # [[ -3.5,   2. ,  18.5,  13. ],
    #  [  0. ,   0. ,  15. ,  15. ],
    #  [  2.5,  -3. ,  12.5,  18. ]]
    anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
                         for i in xrange(ratio_anchors.shape[0])])
    # [[ -84.  -40.   99.   55.]
    #  [-176.  -88.  191.  103.]
    #  [-360. -184.  375.  199.]
    #  [ -56.  -56.   71.   71.]
    #  [-120. -120.  135.  135.]
    #  [-248. -248.  263.  263.]
    #  [ -36.  -80.   51.   95.]
    #  [ -80. -168.   95.  183.]
    #  [-168. -344.  183.  359.]]
    return anchors
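
The helpers used above come from the same generate_anchors.py; roughly, they look like this (a sketch; the commented arrays above are the values they produce):

import numpy as np

def _whctrs(anchor):
    """(x1, y1, x2, y2) -> width, height, center x, center y."""
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    x_ctr = anchor[0] + 0.5 * (w - 1)
    y_ctr = anchor[1] + 0.5 * (h - 1)
    return w, h, x_ctr, y_ctr

def _mkanchors(ws, hs, x_ctr, y_ctr):
    """Build (x1, y1, x2, y2) anchors around a common center."""
    ws = ws[:, np.newaxis]
    hs = hs[:, np.newaxis]
    return np.hstack((x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1),
                      x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1)))

def _ratio_enum(anchor, ratios):
    """Keep the area fixed, vary the aspect ratio (0.5, 1, 2)."""
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    size = w * h
    ws = np.round(np.sqrt(size / np.array(ratios)))
    hs = np.round(ws * np.array(ratios))
    return _mkanchors(ws, hs, x_ctr, y_ctr)

def _scale_enum(anchor, scales):
    """Keep the aspect ratio fixed, scale the sides by 8, 16, 32."""
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    ws = w * scales
    hs = h * scales
    return _mkanchors(ws, hs, x_ctr, y_ctr)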

anchor_target_layer.py:

Generates the training targets and labels for each anchor, classifying it as 1 (object), 0 (not object) or -1 (ignore). When label > 0, i.e. the anchor covers an object, box regression is performed.
The forward function: in every cell, generate the 9 anchors, fill in their details, filter out anchors that cross the image boundary, and measure their overlap with the ground truth.
1. Generate proposals: A anchors and K shifts, with A = 9 and K = H * W, where W and H are the feature-map width and height; about 61 x 36 points are taken uniformly over an image, and shift_x, shift_y are the offsets of these points in the image. Adding the offsets to the 9 base anchor coordinates gives every feature-map cell its own 9 anchors. H x feat_stride and W x feat_stride are roughly the size of the rescaled image; feat_stride = 16 (see the sketch after this list)
2. Remove anchors that cross the image boundary (about 2/3 are discarded)
3. Initialize all labels to -1
4. Compute the overlaps between the anchors and gt_boxes
5. labels[max_overlaps < cfg.TRAIN.RPN_NEGATIVE_OVERLAP] = 0  ==> 0.3
labels[max_overlaps >= cfg.TRAIN.RPN_POSITIVE_OVERLAP] = 1  ==> 0.7
6. Subsample the positives and negatives (keeping roughly a 1:1 ratio)
num_fg = int(cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE): num_fg is at most 0.5 * 256

num_bg = cfg.TRAIN.RPN_BATCHSIZE - np.sum(labels == 1): the number of negative samples
7. Build bbox_targets (shape (len(inds_inside), 4), all zeros for negatives), bbox_inside_weights (shape (len(inds_inside), 4), the four entries of each positive set to cfg.TRAIN.RPN_BBOX_INSIDE_WEIGHTS = 1) and bbox_outside_weights (shape (len(inds_inside), 4)) as follows:

    if cfg.TRAIN.RPN_POSITIVE_WEIGHT < 0:#(-1)
        # uniform weighting of examples (given non-uniform sampling)
        num_examples = np.sum(labels >= 0)
        positive_weights = np.ones((1, 4)) * 1.0 / num_examples
        negative_weights = np.ones((1, 4)) * 1.0 / num_examples
    bbox_outside_weights[labels == 1, :] = positive_weights
    bbox_outside_weights[labels == 0, :] = negative_weights

8哀澈、_unmap:all_anchors裁減掉了2/3左右牌借,僅僅保留在圖像內(nèi)的anchor,這里就是將其復(fù)原作為下一層的輸入了割按,并reshape成相應(yīng)的格式


Stage 1 RPN, generate proposals

In rpn_generate:

cfg.TEST.RPN_PRE_NMS_TOP_N = -1  # no pre-NMS filtering
cfg.TEST.RPN_POST_NMS_TOP_N = 2000: at most 2000 proposals are kept per image
rpn_net = caffe.Net(rpn_test_prototxt, rpn_model_path, caffe.TEST) loads the RPN model trained above and runs it with the rpn_test.pt prototxt to produce the proposals (see the sketch below)
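
rpn_generate then runs this network over every image of the imdb (via rpn.generate). A minimal per-image sketch, assuming im_blob and im_info have already been built as in get_minibatch above:

import numpy as np

def rpn_proposals_for_image(net, im_blob, im_info):
    """Sketch: forward one preprocessed image through rpn_test.pt and return
    the proposals in original-image coordinates plus their scores."""
    net.blobs['data'].reshape(*im_blob.shape)
    net.blobs['im_info'].reshape(*im_info.shape)
    out = net.forward(data=im_blob.astype(np.float32, copy=False),
                      im_info=im_info.astype(np.float32, copy=False))
    # 'rois' come out of the ProposalLayer as (batch_ind, x1, y1, x2, y2)
    # in rescaled (M x N) coordinates; divide by im_scale to map them back
    # to the original (P x Q) image before writing the proposal file.
    boxes = out['rois'][:, 1:].copy() / im_info[0, 2]
    scores = out['scores'].copy()
    return boxes, scores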

In rpn_test.pt:

layer {
  name: 'proposal'
  type: 'Python'
  bottom: 'rpn_cls_prob_reshape'
  bottom: 'rpn_bbox_pred'
  bottom: 'im_info'
  top: 'rois'
  top: 'scores'
  python_param {
    module: 'rpn.proposal_layer'
    layer: 'ProposalLayer'
    param_str: "'feat_stride': 16"
  }
}

rpn.proposal_layer --> proposal_layer.py: this layer converts the RPN outputs into object proposals. The author adds a ProposalLayer class, which implements the setup and forward functions.

forward:
Generate the anchor boxes and fill in the box parameters for every anchor
Clip the predicted boxes to the image and remove boxes whose width or height is below the threshold (16 * im_info[2])
Sort all (proposal, score) pairs by score
Take the top pre_nms_topN proposals (skipped here, because cfg.TEST.RPN_PRE_NMS_TOP_N = -1)
Apply NMS with threshold 0.7
Take the top after_nms_topN proposals; here cfg.TEST.RPN_POST_NMS_TOP_N = 2000, so the first 2000 are kept (if fewer than 2000 remain after NMS, just those are kept); see the sketch after this list
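
A condensed sketch of that forward pass (assuming the anchors have already been shifted to all positions and the scores/deltas flattened to one row per anchor; the real ProposalLayer also prepends the batch index and reads its thresholds from cfg):

import numpy as np
from fast_rcnn.bbox_transform import bbox_transform_inv, clip_boxes
from fast_rcnn.nms_wrapper import nms

def proposal_forward_sketch(anchors, scores, bbox_deltas, im_info,
                            post_nms_topN=2000, nms_thresh=0.7, min_size=16):
    """Sketch: turn per-anchor objectness scores and box deltas into the
    final list of proposals."""
    proposals = bbox_transform_inv(anchors, bbox_deltas)  # apply deltas
    proposals = clip_boxes(proposals, im_info[:2])        # clip to the image
    # drop boxes smaller than min_size at the rescaled-image resolution
    min_size = min_size * im_info[2]
    ws = proposals[:, 2] - proposals[:, 0] + 1
    hs = proposals[:, 3] - proposals[:, 1] + 1
    keep = np.where((ws >= min_size) & (hs >= min_size))[0]
    proposals, scores = proposals[keep, :], scores[keep]
    # sort by score, NMS at 0.7, then keep the top post_nms_topN
    order = scores.ravel().argsort()[::-1]
    proposals, scores = proposals[order, :], scores[order]
    keep = nms(np.hstack((proposals, scores.reshape(-1, 1))).astype(np.float32),
               nms_thresh)
    keep = keep[:post_nms_topN]
    return proposals[keep, :], scores[keep]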


Stage 1 Fast R-CNN using RPN proposals, init from ImageNet model

In train_fast_rcnn:

cfg.TRAIN.PROPOSAL_METHOD = 'rpn', so rpn_roidb in pascal_voc.py will be called
cfg.TRAIN.IMS_PER_BATCH = 2
get_roidb prepares the roidb and imdb, then train_net trains Fast R-CNN

In get_roidb:

imdb = get_imdb(imdb_name) initializes the imdb class, going through factory.py and pascal_voc.py
This time rpn_file is the proposal file produced by the RPN stage above, so the roidb contains the RPN proposals in addition to the ground-truth boxes
roidb = get_training_roidb(imdb) calls get_training_roidb in train.py to build the roidb

In get_training_roidb:

imdb.append_flipped_images() (in imdb.py): horizontal flipping is used as data augmentation; the roidb[i]['boxes'] produced by rpn_roidb (pascal_voc.py) are flipped and the image index list is doubled
rdl_roidb.prepare_roidb(imdb) (in roidb.py) fills in roidb[i]['max_classes'], roidb[i]['max_overlaps'], roidb[i]['image'], roidb[i]['width'] and roidb[i]['height']

In rpn_roidb:

gt_roidb = self.gt_roidb() first builds gt_roidb, the ground truth
rpn_roidb = self._load_rpn_roidb(gt_roidb) loads rpn_roidb from rpn_file; its 'gt_overlaps' holds each proposal's maximum overlap with the ground truth, so only one of the num_classes columns is non-zero
roidb = imdb.merge_roidbs(gt_roidb, rpn_roidb) merges the two into one roidb (see the sketch below)
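
merge_roidbs just stacks the two box sets per image; a sketch of the static method in imdb.py:

import numpy as np
import scipy.sparse

def merge_roidbs(a, b):
    """Sketch: per image, append b's boxes/classes/overlaps/areas to a's."""
    assert len(a) == len(b)
    for i in xrange(len(a)):
        a[i]['boxes'] = np.vstack((a[i]['boxes'], b[i]['boxes']))
        a[i]['gt_classes'] = np.hstack((a[i]['gt_classes'],
                                        b[i]['gt_classes']))
        a[i]['gt_overlaps'] = scipy.sparse.vstack([a[i]['gt_overlaps'],
                                                   b[i]['gt_overlaps']])
        a[i]['seg_areas'] = np.hstack((a[i]['seg_areas'],
                                       b[i]['seg_areas']))
    return a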

After get_roidb, train_fast_rcnn calls train_net for training:
model_paths = train_net(solver, roidb, output_dir, pretrained_model=init_model, max_iters=max_iters)
In train_net (train.py):

roidb = filter_roidb(roidb) drops images that have neither a foreground nor a background RoI (overlaps: 0-0.5 background, 0.5-1 foreground)
sw = SolverWrapper(solver_prototxt, roidb, output_dir, pretrained_model=pretrained_model) loads the pretrained model and also computes the bounding-box regression targets via rdl_roidb.add_bbox_regression_targets(roidb)

add_bbox_regression_targets in roidb.py:

roidb[im_i]['bbox_targets'] = _compute_targets(rois, max_overlaps, max_classes)
The targets are then normalized with per-class means and standard deviations (the mean/variance computation; see the sketch below)
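
A sketch of the mean/variance step, assuming cfg.TRAIN.BBOX_NORMALIZE_TARGETS is enabled and the statistics are computed from the data (the real code can also take them from BBOX_NORMALIZE_MEANS/STDS):

import numpy as np

def normalize_bbox_targets(roidb, num_classes):
    """Sketch: compute per-class mean and std of the 4 regression targets
    over all images, normalize the targets in place, and return the stats
    so predictions can be un-normalized at test time."""
    sums = np.zeros((num_classes, 4))
    squared_sums = np.zeros((num_classes, 4))
    class_counts = np.zeros((num_classes, 1)) + 1e-14
    for entry in roidb:
        targets = entry['bbox_targets']        # N x 5: label, dx, dy, dw, dh
        for cls in xrange(1, num_classes):
            cls_inds = np.where(targets[:, 0] == cls)[0]
            if cls_inds.size > 0:
                class_counts[cls] += cls_inds.size
                sums[cls, :] += targets[cls_inds, 1:].sum(axis=0)
                squared_sums[cls, :] += (targets[cls_inds, 1:] ** 2).sum(axis=0)
    means = sums / class_counts
    stds = np.sqrt(squared_sums / class_counts - means ** 2)
    for entry in roidb:
        targets = entry['bbox_targets']
        for cls in xrange(1, num_classes):
            cls_inds = np.where(targets[:, 0] == cls)[0]
            targets[cls_inds, 1:] -= means[cls, :]
            targets[cls_inds, 1:] /= stds[cls, :]
    return means.ravel(), stds.ravel()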

_compute_targets:

gt_inds: indices of the ground-truth RoIs
ex_inds: indices of the foreground RoIs (overlap >= the 0.5 threshold, so this set contains gt_inds, since ground-truth RoIs have overlap 1)
ex_gt_overlaps: the overlaps between each ex RoI and each gt RoI, a num_ex x num_gt matrix
gt_assignment: for each ex RoI, the index of the gt RoI it overlaps most
gt_rois, ex_rois: the boxes corresponding to gt_inds and ex_inds

targets: a rois.shape[0] x 5 array; the first column holds the labels, written only at ex_inds (the other rows stay 0); the last four columns are the 4 regression offsets, also written only at ex_inds. As noted, ex_inds contains gt_inds, but for the ground-truth boxes the offsets are 0, so no regression is effectively performed on them.

def _compute_targets(rois, overlaps, labels):
    """Compute bounding-box regression targets for an image."""
    # Indices of ground-truth ROIs
    gt_inds = np.where(overlaps == 1)[0]
    if len(gt_inds) == 0:
        # Bail if the image has no ground-truth ROIs
        return np.zeros((rois.shape[0], 5), dtype=np.float32)
    # Indices of examples for which we try to make predictions
    ex_inds = np.where(overlaps >= cfg.TRAIN.BBOX_THRESH)[0]

    # Get IoU overlap between each ex ROI and gt ROI
    ex_gt_overlaps = bbox_overlaps(
        np.ascontiguousarray(rois[ex_inds, :], dtype=np.float),
        np.ascontiguousarray(rois[gt_inds, :], dtype=np.float))

    # Find which gt ROI each ex ROI has max overlap with:
    # this will be the ex ROI's gt target
    gt_assignment = ex_gt_overlaps.argmax(axis=1)
    gt_rois = rois[gt_inds[gt_assignment], :]
    ex_rois = rois[ex_inds, :]

    targets = np.zeros((rois.shape[0], 5), dtype=np.float32)
    targets[ex_inds, 0] = labels[ex_inds]
    targets[ex_inds, 1:] = bbox_transform(ex_rois, gt_rois)
    return targets
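
The bbox_transform call above lives in fast_rcnn/bbox_transform.py; roughly, it computes the standard (dx, dy, dw, dh) parameterization:

import numpy as np

def bbox_transform(ex_rois, gt_rois):
    """Sketch: regression targets = center offsets normalized by the ex box
    size, plus log width/height ratios."""
    ex_widths = ex_rois[:, 2] - ex_rois[:, 0] + 1.0
    ex_heights = ex_rois[:, 3] - ex_rois[:, 1] + 1.0
    ex_ctr_x = ex_rois[:, 0] + 0.5 * ex_widths
    ex_ctr_y = ex_rois[:, 1] + 0.5 * ex_heights

    gt_widths = gt_rois[:, 2] - gt_rois[:, 0] + 1.0
    gt_heights = gt_rois[:, 3] - gt_rois[:, 1] + 1.0
    gt_ctr_x = gt_rois[:, 0] + 0.5 * gt_widths
    gt_ctr_y = gt_rois[:, 1] + 0.5 * gt_heights

    targets_dx = (gt_ctr_x - ex_ctr_x) / ex_widths
    targets_dy = (gt_ctr_y - ex_ctr_y) / ex_heights
    targets_dw = np.log(gt_widths / ex_widths)
    targets_dh = np.log(gt_heights / ex_heights)

    return np.vstack((targets_dx, targets_dy,
                      targets_dw, targets_dh)).transpose()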

In stage1_fast_rcnn_train.pt:

name: "ZF"
layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 21"
  }
}

This calls the roi_data_layer.layer Python layer, which in turn calls get_minibatch in minibatch.py. (In the end2end training scheme the ProposalTargetLayer plays the same role.)
get_minibatch

rois_per_image: the maximum number of boxes per image, cfg.TRAIN.BATCH_SIZE / num_images = 128/2 = 64 here
fg_rois_per_image: 0.25 * rois_per_image, i.e. 16 foreground boxes (foreground:background ratio 1:3)
im_blob, im_scales = _get_image_blob(roidb, random_scale_inds) returns the scale factors (ratio between the network input image and the original image) and im_blob, the network input blob
if cfg.TRAIN.HAS_RPN: the RPN-training branch
else: the Fast R-CNN training branch
The main work is done by _sample_rois, which returns labels, overlaps, im_rois, bbox_targets, bbox_inside_weights; the coordinates in the roidb refer to the original image (P*Q), and so does im_rois
rois = _project_im_rois(im_rois, im_scales[im_i]) scales the RoIs to the network input size (M*N)
rois_blob: a two-dimensional array with 5 columns; the first column is the index of the image within the batch, the last four are the coordinates; the rois of all images in the batch are stacked together and assigned to blobs['rois']
labels_blob, bbox_targets_blob and bbox_inside_blob are likewise stacked over the batch and assigned to blobs['labels'], blobs['bbox_targets'], blobs['bbox_inside_weights'], blobs['bbox_outside_weights']
blobs['bbox_outside_weights'] = np.array(bbox_inside_blob > 0).astype(np.float32) is how bbox_outside_weights is computed (it serves as the per-element weight in the SmoothL1 loss)
The function returns the blobs

_sample_rois

fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]: overlap >= 0.5 counts as foreground (note that this threshold also selects the ground-truth boxes, whose overlap is 1, so the random sampling of positives below may pick gt boxes; for those the regression offsets are 0)
fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size): the smaller of 16 and the number of foreground RoIs
fg_inds = npr.choice(fg_inds, size=fg_rois_per_this_image, replace=False): randomly pick fg_rois_per_this_image foreground RoIs
bg_rois_per_this_image: the number of backgrounds, the smaller of 64 - fg_rois_per_this_image and the number of RoIs whose overlap is below 0.5
bg_inds: the randomly chosen background indices
keep_inds = np.append(fg_inds, bg_inds) concatenates the foreground and background indices
labels = labels[keep_inds] gathers the labels
labels[fg_rois_per_this_image:] = 0 sets the background labels to 0
overlaps = overlaps[keep_inds] gathers the overlaps
rois = rois[keep_inds] gathers the rois
bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(roidb['bbox_targets'][keep_inds, :], num_classes): bbox_targets has shape len(keep_inds) x 84 (84 = 4 * num_classes); the compact 5-column form is expanded to 84 columns, writing the 4 offsets of each foreground row into the columns of its class and leaving all other columns 0; bbox_inside_weights has the same shape and is set to 1 exactly where bbox_targets holds the four offsets

"""Compute minibatch blobs for training a Fast R-CNN network."""
def get_minibatch(roidb, num_classes):
    """Given a roidb, construct a minibatch sampled from it."""
    num_images = len(roidb)
    # Sample random scales to use for each image in this batch
    random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
                                    size=num_images)
    assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
        'num_images ({}) must divide BATCH_SIZE ({})'. \
        format(num_images, cfg.TRAIN.BATCH_SIZE)
    rois_per_image = cfg.TRAIN.BATCH_SIZE / num_images
    # fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
    fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)

    # Get the input image blob, formatted for caffe
    im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)

    blobs = {'data': im_blob}

    if cfg.TRAIN.HAS_RPN:
        assert len(im_scales) == 1, "Single batch only"
        assert len(roidb) == 1, "Single batch only"
        # gt boxes: (x1, y1, x2, y2, cls)
        gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
        gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
        gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
        gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
        blobs['gt_boxes'] = gt_boxes
        blobs['im_info'] = np.array(
            [[im_blob.shape[2], im_blob.shape[3], im_scales[0]]],
            dtype=np.float32)
    else: # not using RPN
        # Now, build the region of interest and label blobs
        rois_blob = np.zeros((0, 5), dtype=np.float32)
        labels_blob = np.zeros((0), dtype=np.float32)
        bbox_targets_blob = np.zeros((0, 4 * num_classes), dtype=np.float32)
        bbox_inside_blob = np.zeros(bbox_targets_blob.shape, dtype=np.float32)
        # all_overlaps = []
        for im_i in xrange(num_images):
            labels, overlaps, im_rois, bbox_targets, bbox_inside_weights \
                = _sample_rois(roidb[im_i], fg_rois_per_image, rois_per_image,
                               num_classes)

            # Add to RoIs blob
            rois = _project_im_rois(im_rois, im_scales[im_i])
            batch_ind = im_i * np.ones((rois.shape[0], 1))
            rois_blob_this_image = np.hstack((batch_ind, rois))
            rois_blob = np.vstack((rois_blob, rois_blob_this_image))

            # Add to labels, bbox targets, and bbox loss blobs
            labels_blob = np.hstack((labels_blob, labels))
            bbox_targets_blob = np.vstack((bbox_targets_blob, bbox_targets))
            bbox_inside_blob = np.vstack((bbox_inside_blob, bbox_inside_weights))
            # all_overlaps = np.hstack((all_overlaps, overlaps))

        # For debug visualizations
        # _vis_minibatch(im_blob, rois_blob, labels_blob, all_overlaps)

        blobs['rois'] = rois_blob
        blobs['labels'] = labels_blob

        if cfg.TRAIN.BBOX_REG:
            blobs['bbox_targets'] = bbox_targets_blob
            blobs['bbox_inside_weights'] = bbox_inside_blob
            blobs['bbox_outside_weights'] = \
                np.array(bbox_inside_blob > 0).astype(np.float32)

    return blobs

def _sample_rois(roidb, fg_rois_per_image, rois_per_image, num_classes):
    """Generate a random sample of RoIs comprising foreground and background
    examples.
    """
    # label = class RoI has max overlap with
    labels = roidb['max_classes']
    overlaps = roidb['max_overlaps']
    rois = roidb['boxes']

    # Select foreground RoIs as those with >= FG_THRESH overlap
    fg_inds = np.where(overlaps >= cfg.TRAIN.FG_THRESH)[0]
    # Guard against the case when an image has fewer than fg_rois_per_image
    # foreground RoIs
    fg_rois_per_this_image = np.minimum(fg_rois_per_image, fg_inds.size)
    # Sample foreground regions without replacement
    if fg_inds.size > 0:
        fg_inds = npr.choice(
                fg_inds, size=fg_rois_per_this_image, replace=False)

    # Select background RoIs as those within [BG_THRESH_LO, BG_THRESH_HI)
    bg_inds = np.where((overlaps < cfg.TRAIN.BG_THRESH_HI) &
                       (overlaps >= cfg.TRAIN.BG_THRESH_LO))[0]
    # Compute number of background RoIs to take from this image (guarding
    # against there being fewer than desired)
    bg_rois_per_this_image = rois_per_image - fg_rois_per_this_image
    bg_rois_per_this_image = np.minimum(bg_rois_per_this_image,
                                        bg_inds.size)
    # Sample background regions without replacement
    if bg_inds.size > 0:
        bg_inds = npr.choice(
                bg_inds, size=bg_rois_per_this_image, replace=False)

    # The indices that we're selecting (both fg and bg)
    keep_inds = np.append(fg_inds, bg_inds)
    # Select sampled values from various arrays:
    labels = labels[keep_inds]
    # Clamp labels for the background RoIs to 0
    labels[fg_rois_per_this_image:] = 0
    overlaps = overlaps[keep_inds]
    rois = rois[keep_inds]

    bbox_targets, bbox_inside_weights = _get_bbox_regression_labels(
            roidb['bbox_targets'][keep_inds, :], num_classes)

    return labels, overlaps, rois, bbox_targets, bbox_inside_weights

def _get_image_blob(roidb, scale_inds):
    """Builds an input blob from the images in the roidb at the specified
    scales.
    """
    num_images = len(roidb)
    processed_ims = []
    im_scales = []
    for i in xrange(num_images):
        im = cv2.imread(roidb[i]['image'])
        if roidb[i]['flipped']:
            im = im[:, ::-1, :]
        target_size = cfg.TRAIN.SCALES[scale_inds[i]]
        im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
                                        cfg.TRAIN.MAX_SIZE)
        im_scales.append(im_scale)
        processed_ims.append(im)

    # Create a blob to hold the input images
    blob = im_list_to_blob(processed_ims)

    return blob, im_scales

def _project_im_rois(im_rois, im_scale_factor):
    """Project image RoIs into the rescaled training image."""
    rois = im_rois * im_scale_factor
    return rois

def _get_bbox_regression_labels(bbox_target_data, num_classes):
    """Bounding-box regression targets are stored in a compact form in the
    roidb.

    This function expands those targets into the 4-of-4*K representation used
    by the network (i.e. only one class has non-zero targets). The loss weights
    are similarly expanded.

    Returns:
        bbox_target_data (ndarray): N x 4K blob of regression targets
        bbox_inside_weights (ndarray): N x 4K blob of loss weights
    """
    clss = bbox_target_data[:, 0]
    bbox_targets = np.zeros((clss.size, 4 * num_classes), dtype=np.float32)
    bbox_inside_weights = np.zeros(bbox_targets.shape, dtype=np.float32)
    inds = np.where(clss > 0)[0]
    # for ind in inds:
    #     cls = clss[ind]
    #     start = 4 * cls
    #     end = start + 4
    #     bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
    #     bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    # return bbox_targets, bbox_inside_weights
    for ind in inds:
        ind = int(ind)
        cls = clss[ind]
        start = int(4 * cls)
        end = int(start + 4)
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights