object_detection API source-code reading notes (7: the FasterRCNNMetaArch class in detail), repost

FasterRCNNMetaArch in detail:

As mentioned in the previous post, the init function simply stores its arguments, as follows:

  • init()
def __init__(self,
               is_training,
               num_classes,
               image_resizer_fn,
               feature_extractor,
               first_stage_only,
               first_stage_anchor_generator,
               first_stage_atrous_rate,
               first_stage_box_predictor_arg_scope,
               first_stage_box_predictor_kernel_size,
               first_stage_box_predictor_depth,
               first_stage_minibatch_size,
               first_stage_positive_balance_fraction,
               first_stage_nms_score_threshold,
               first_stage_nms_iou_threshold,
               first_stage_max_proposals,
               first_stage_localization_loss_weight,
               first_stage_objectness_loss_weight,
               initial_crop_size,
               maxpool_kernel_size,
               maxpool_stride,
               second_stage_mask_rcnn_box_predictor,
               second_stage_batch_size,
               second_stage_balance_fraction,
               second_stage_non_max_suppression_fn,
               second_stage_score_conversion_fn,
               second_stage_localization_loss_weight,
               second_stage_classification_loss_weight,
               second_stage_classification_loss,
               second_stage_mask_prediction_loss_weight=1.0,
               hard_example_miner=None,
               parallel_iterations=16):
    super(FasterRCNNMetaArch, self).__init__(num_classes=num_classes)
    
    
    # Validate the arguments
    if is_training and second_stage_batch_size > first_stage_max_proposals:
      raise ValueError('second_stage_batch_size should be no greater than '
                       'first_stage_max_proposals.')
    if not isinstance(first_stage_anchor_generator,
                      grid_anchor_generator.GridAnchorGenerator):
      raise ValueError('first_stage_anchor_generator must be of type '
                       'grid_anchor_generator.GridAnchorGenerator.')
    
    # Store the arguments; these are all configuration settings
    self._is_training = is_training
    self._image_resizer_fn = image_resizer_fn  # image resize function
    self._feature_extractor = feature_extractor  # feature extractor, introduced earlier
    self._first_stage_only = first_stage_only  # whether to run only the region proposal stage

    # The first class is reserved as background.
    unmatched_cls_target = tf.constant(
        [1] + self._num_classes * [0], dtype=tf.float32)
    # create_target_assigner builds the target assigners for the proposal and detection stages
    self._proposal_target_assigner = target_assigner.create_target_assigner(
        'FasterRCNN', 'proposal')
    self._detector_target_assigner = target_assigner.create_target_assigner(
        'FasterRCNN', 'detection', unmatched_cls_target=unmatched_cls_target)
    # Both proposal and detector target assigners use the same box coder
    self._box_coder = self._proposal_target_assigner.box_coder

    # (First stage) Region proposal network parameters
    # Store the first-stage anchor generator
    self._first_stage_anchor_generator = first_stage_anchor_generator
    self._first_stage_atrous_rate = first_stage_atrous_rate
    self._first_stage_box_predictor_arg_scope = (
        first_stage_box_predictor_arg_scope)
    self._first_stage_box_predictor_kernel_size = (
        first_stage_box_predictor_kernel_size)
    self._first_stage_box_predictor_depth = first_stage_box_predictor_depth
    self._first_stage_minibatch_size = first_stage_minibatch_size
    # Positive and negative examples are sampled here
    self._first_stage_sampler = sampler.BalancedPositiveNegativeSampler(
        positive_fraction=first_stage_positive_balance_fraction)
    self._first_stage_box_predictor = box_predictor.ConvolutionalBoxPredictor(
        self._is_training, num_classes=1,
        conv_hyperparams=self._first_stage_box_predictor_arg_scope,
        min_depth=0, max_depth=0, num_layers_before_predictor=0,
        use_dropout=False, dropout_keep_prob=1.0, kernel_size=1,
        box_code_size=self._box_coder.code_size)

    # First-stage NMS score threshold, IoU threshold, and maximum number of proposals
    self._first_stage_nms_score_threshold = first_stage_nms_score_threshold
    self._first_stage_nms_iou_threshold = first_stage_nms_iou_threshold
    self._first_stage_max_proposals = first_stage_max_proposals

    # Create the WeightedSmoothL1LocalizationLoss and WeightedSoftmaxClassificationLoss
    self._first_stage_localization_loss = (
        losses.WeightedSmoothL1LocalizationLoss(anchorwise_output=True))
    self._first_stage_objectness_loss = (
        losses.WeightedSoftmaxClassificationLoss(anchorwise_output=True))
    self._first_stage_loc_loss_weight = first_stage_localization_loss_weight
    self._first_stage_obj_loss_weight = first_stage_objectness_loss_weight

    # Per-region cropping parameters
    # Set the ROI crop size and the max-pool applied to each crop
    self._initial_crop_size = initial_crop_size
    self._maxpool_kernel_size = maxpool_kernel_size
    self._maxpool_stride = maxpool_stride

    self._mask_rcnn_box_predictor = second_stage_mask_rcnn_box_predictor
    
    # Store the second-stage parameters
    self._second_stage_batch_size = second_stage_batch_size
    self._second_stage_sampler = sampler.BalancedPositiveNegativeSampler(
        positive_fraction=second_stage_balance_fraction)

    # Second-stage non-max suppression function and score conversion function
    self._second_stage_nms_fn = second_stage_non_max_suppression_fn
    self._second_stage_score_conversion_fn = second_stage_score_conversion_fn

    # Second-stage losses
    self._second_stage_localization_loss = (
        losses.WeightedSmoothL1LocalizationLoss(anchorwise_output=True))
    self._second_stage_classification_loss = second_stage_classification_loss
    self._second_stage_mask_loss = (
        losses.WeightedSigmoidClassificationLoss(anchorwise_output=True))
    self._second_stage_loc_loss_weight = second_stage_localization_loss_weight
    self._second_stage_cls_loss_weight = second_stage_classification_loss_weight
    self._second_stage_mask_loss_weight = (
        second_stage_mask_prediction_loss_weight)
    self._hard_example_miner = hard_example_miner
    self._parallel_iterations = parallel_iterations
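
The constructor builds a BalancedPositiveNegativeSampler for each stage. As a rough illustration of what balanced sampling means, here is a minimal NumPy sketch. It is not the library's implementation: `balanced_sample`, its arguments, and the example numbers are all hypothetical, and it assumes positives are capped at `positive_fraction * batch_size` with negatives filling the rest.

```python
import numpy as np

def balanced_sample(labels, batch_size, positive_fraction, rng):
    """Pick up to `batch_size` indices, capping positives at the given fraction."""
    labels = np.asarray(labels, dtype=bool)
    pos_idx = np.flatnonzero(labels)
    neg_idx = np.flatnonzero(~labels)
    max_pos = int(positive_fraction * batch_size)
    num_pos = min(max_pos, pos_idx.size)
    # Negatives fill whatever room the positives leave.
    num_neg = min(batch_size - num_pos, neg_idx.size)
    chosen_pos = rng.choice(pos_idx, size=num_pos, replace=False)
    chosen_neg = rng.choice(neg_idx, size=num_neg, replace=False)
    return np.sort(np.concatenate([chosen_pos, chosen_neg]))

rng = np.random.default_rng(0)
labels = [True] * 10 + [False] * 500  # 10 positive anchors, 500 negative ones
picked = balanced_sample(labels, batch_size=256, positive_fraction=0.5, rng=rng)
num_pos = int(np.sum(np.asarray(labels)[picked]))
print(len(picked), num_pos)
```

With only 10 positives available, the minibatch is topped up with negatives, which is the situation the notes describe for training.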
  • Internal properties of FasterRCNNMetaArch
  @property
  def first_stage_feature_extractor_scope(self):
    return 'FirstStageFeatureExtractor'

  @property
  def second_stage_feature_extractor_scope(self):
    return 'SecondStageFeatureExtractor'

  @property
  def first_stage_box_predictor_scope(self):
    return 'FirstStageBoxPredictor'

  @property
  def second_stage_box_predictor_scope(self):
    return 'SecondStageBoxPredictor'

  @property
  def max_num_proposals(self):
    if self._is_training and not self._hard_example_miner:
      return self._second_stage_batch_size
    return self._first_stage_max_proposals

About max_num_proposals():
it is the maximum number of proposals per image in the batch.
During training, if no hard example miner is set, second_stage_batch_size is used; otherwise first_stage_max_proposals is used. During inference, first_stage_max_proposals is always used.
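
The same rule can be tried out as a standalone function. This is only a sketch that mirrors the property above with plain arguments instead of instance attributes; the numbers 64 and 300 are made-up values.

```python
def max_num_proposals(is_training, hard_example_miner,
                      second_stage_batch_size, first_stage_max_proposals):
    """Standalone copy of the property's logic, for experimentation only."""
    if is_training and not hard_example_miner:
        return second_stage_batch_size
    return first_stage_max_proposals

# Training without a miner: the second-stage batch size is used.
train_no_miner = max_num_proposals(True, None, 64, 300)
# Training with a miner (any truthy object here): all first-stage proposals are kept.
train_with_miner = max_num_proposals(True, object(), 64, 300)
# Inference: always the first-stage maximum.
inference = max_num_proposals(False, None, 64, 300)
print(train_no_miner, train_with_miner, inference)
```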

  • preprocess(self, inputs)
def preprocess(self, inputs):
    if inputs.dtype is not tf.float32:
      raise ValueError('`preprocess` expects a tf.float32 tensor')
    with tf.name_scope('Preprocessor'):
      resized_inputs = tf.map_fn(self._image_resizer_fn,
                                 elems=inputs,
                                 dtype=tf.float32,
                                 parallel_iterations=self._parallel_iterations)
      return self._feature_extractor.preprocess(resized_inputs)

This calls FasterRCNNFeatureExtractor.preprocess() to do any extra preprocessing (for example, scaling pixel values into [-1, 1]). Nothing special here. See object_detection API source-code reading notes (8: faster_rcnn_inception_resnet_v2_feature_extractor.py).
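
As an example of the kind of extra preprocessing mentioned here, the following NumPy sketch scales pixel values from [0, 255] into [-1, 1]. The function name `scale_to_unit_range` is hypothetical, and whether a given feature extractor uses exactly this formula is an assumption.

```python
import numpy as np

def scale_to_unit_range(images):
    """Map float pixel values from [0, 255] to [-1, 1]."""
    return (2.0 / 255.0) * images - 1.0

# A tiny float32 batch laid out as [batch, height, width, channels].
batch = np.array([[[[0.0, 127.5, 255.0]]]], dtype=np.float32)
scaled = scale_to_unit_range(batch)
print(scaled.ravel())
```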

  • predict(self, preprocessed_inputs)
def predict(self, preprocessed_inputs):
    (rpn_box_predictor_features, rpn_features_to_crop, anchors_boxlist,
     image_shape) = self._extract_rpn_feature_maps(preprocessed_inputs)
    (rpn_box_encodings, rpn_objectness_predictions_with_background
    ) = self._predict_rpn_proposals(rpn_box_predictor_features)

    # The Faster R-CNN paper recommends pruning anchors that venture outside
    # the image window at training time and clipping at inference time.
    clip_window = tf.to_float(tf.stack([0, 0, image_shape[1], image_shape[2]]))
    if self._is_training:
      (rpn_box_encodings, rpn_objectness_predictions_with_background,
       anchors_boxlist) = self._remove_invalid_anchors_and_predictions(
           rpn_box_encodings, rpn_objectness_predictions_with_background,
           anchors_boxlist, clip_window)
    else:
      anchors_boxlist = box_list_ops.clip_to_window(
          anchors_boxlist, clip_window)

    anchors = anchors_boxlist.get()
    prediction_dict = {
        'rpn_box_predictor_features': rpn_box_predictor_features,
        'rpn_features_to_crop': rpn_features_to_crop,
        'image_shape': image_shape,
        'rpn_box_encodings': rpn_box_encodings,
        'rpn_objectness_predictions_with_background':
        rpn_objectness_predictions_with_background,
        'anchors': anchors
    }

    if not self._first_stage_only:
      prediction_dict.update(self._predict_second_stage(
          rpn_box_encodings,
          rpn_objectness_predictions_with_background,
          rpn_features_to_crop,
          anchors, image_shape))
    return prediction_dict

This function runs the forward pass on the preprocessed images (preprocessed_inputs) and produces the raw predictions. If first_stage_only is set to True, it outputs only the RPN predictions (un-postprocessed). Otherwise it outputs both the first stage RPN predictions and the second stage box classifier predictions.
Other things to note:
+ Anchor pruning vs. clipping: as the Faster R-CNN paper recommends, anchors that extend beyond the image boundary are removed at training time, whereas at inference time they are only clipped to the image window.
+ Proposal padding: the proposals of every image are padded to self._max_num_proposals, so each image in the batch contributes the same number of proposals. (During training there are usually not enough positive samples, so negative samples fill the remainder; if self._max_num_proposals == 128, positives plus negatives must add up to that number.)
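
The difference between pruning and clipping is easy to see on two toy anchors. This is a minimal NumPy sketch, not the box_list_ops code: both helper functions are hypothetical, and boxes are assumed to be [ymin, xmin, ymax, xmax] in absolute coordinates.

```python
import numpy as np

def prune_outside_window(boxes, window):
    """Keep only boxes that lie entirely inside the window (training-time behaviour)."""
    ymin, xmin, ymax, xmax = window
    keep = ((boxes[:, 0] >= ymin) & (boxes[:, 1] >= xmin) &
            (boxes[:, 2] <= ymax) & (boxes[:, 3] <= xmax))
    return boxes[keep], keep

def clip_to_window(boxes, window):
    """Clamp box coordinates to the window (inference-time behaviour)."""
    ymin, xmin, ymax, xmax = window
    clipped = boxes.copy()
    clipped[:, [0, 2]] = np.clip(boxes[:, [0, 2]], ymin, ymax)
    clipped[:, [1, 3]] = np.clip(boxes[:, [1, 3]], xmin, xmax)
    return clipped

anchors = np.array([[10., 10., 50., 50.],     # fully inside the image
                    [-5., 20., 30., 120.]])   # sticks out of the image
window = (0., 0., 100., 100.)

pruned, keep = prune_outside_window(anchors, window)
clipped = clip_to_window(anchors, window)
print(pruned)
print(clipped)
```

Pruning changes the number of anchors, while clipping keeps the count and only moves coordinates.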

Args:
      preprocessed_inputs: a batch of preprocessed images with shape=[batch, height, width, channels].
Returns:
      prediction_dict: a dictionary holding "raw" prediction tensors:
        1) rpn_box_predictor_features: shape = [batch_size, height, width, depth]. The feature map used to predict the proposal boxes and the corresponding objectness scores (foreground vs. background).
        2) rpn_features_to_crop: shape=[batch_size, height, width, depth]. The feature map that is cropped using the proposal boxes predicted by the RPN (each crop, whatever its size, is turned into a fixed-size feature map).
        3) image_shape: a 1-D tensor representing the input image shape.
        4) rpn_box_encodings: shape=[batch_size, num_anchors, self._box_coder.code_size], the predicted box encodings.
        5) rpn_objectness_predictions_with_background: shape=[batch_size, num_anchors, 2], the class predictions (logits) for each anchor, with the background prediction at class index 0.
        6) anchors: shape = [num_anchors, 4], the anchors of the first stage RPN in absolute coordinates. `num_anchors` can differ between training and inference.
      --------------------------------------------------------------------------
      The following are returned only when the second stage runs.
        7) refined_box_encodings: shape=[total_num_proposals, num_classes, 4], the refined box encodings, where total_num_proposals=batch_size*self._max_num_proposals.
        8) class_predictions_with_background: shape=[total_num_proposals, num_classes + 1], the class predictions for each box, including the background class (at class index 0), where total_num_proposals=batch_size*self._max_num_proposals.
        9) num_proposals: shape=[batch_size], the number of proposals the RPN generated for each image. Since proposals are always padded to `self.max_num_proposals` per image, this tells the real entries from the zero padding.
        10) proposal_boxes: shape=[batch_size, self.max_num_proposals, 4], the decoded proposal boxes in absolute coordinates.
        11) mask_predictions: (optional) shape=[total_num_padded_proposals, num_classes, mask_height, mask_width], the predicted object masks.
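
To make the shapes above concrete, here is a small NumPy sketch of padding per-image proposals to a common length and then flattening them. `pad_proposals` and the toy numbers are hypothetical; the sketch only illustrates how [batch_size, max_num_proposals, 4] relates to total_num_proposals.

```python
import numpy as np

def pad_proposals(boxes, max_num_proposals):
    """Zero-pad a [num_boxes, 4] array to [max_num_proposals, 4]."""
    num_valid = min(len(boxes), max_num_proposals)
    padded = np.zeros((max_num_proposals, 4), dtype=np.float32)
    padded[:num_valid] = boxes[:num_valid]
    return padded, num_valid

# Two images with 3 and 5 proposals respectively.
per_image = [np.ones((3, 4), dtype=np.float32),
             np.ones((5, 4), dtype=np.float32)]
max_num_proposals = 8

padded, num_proposals = zip(*(pad_proposals(b, max_num_proposals) for b in per_image))
proposal_boxes = np.stack(padded)          # [batch_size, max_num_proposals, 4]
flattened = proposal_boxes.reshape(-1, 4)  # [total_num_proposals, 4]
print(proposal_boxes.shape, flattened.shape, num_proposals)
```

Here `num_proposals` plays the role of entry 9): it records how many rows of each image are real proposals rather than padding.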

  • postprocess(self, prediction_dict)
def postprocess(self, prediction_dict):
    with tf.name_scope('FirstStagePostprocessor'):
      image_shape = prediction_dict['image_shape']
      if self._first_stage_only:
        proposal_boxes, proposal_scores, num_proposals = self._postprocess_rpn(
            prediction_dict['rpn_box_encodings'],
            prediction_dict['rpn_objectness_predictions_with_background'],
            prediction_dict['anchors'],
            image_shape)
        return {
            'detection_boxes': proposal_boxes,
            'detection_scores': proposal_scores,
            'num_detections': tf.to_float(num_proposals)
        }
    with tf.name_scope('SecondStagePostprocessor'):
      mask_predictions = prediction_dict.get(box_predictor.MASK_PREDICTIONS)
      detections_dict = self._postprocess_box_classifier(
          prediction_dict['refined_box_encodings'],
          prediction_dict['class_predictions_with_background'],
          prediction_dict['proposal_boxes'],
          prediction_dict['num_proposals'],
          image_shape,
          mask_predictions=mask_predictions)
      return detections_dict

This function converts the raw prediction outputs into the final detection results. By default the scores are logits. When first_stage_only=True, the returned results come from the first stage RPN (self.max_num_proposals regions per image); otherwise they come from the full two-stage model (self._max_detections regions per image) and are converted into multiclass detections.
Args:
prediction_dict: a dictionary holding all the prediction results. When first_stage_only=True, it contains (rpn_box_encodings, rpn_objectness_predictions_with_background, rpn_features_to_crop, image_shape, anchors); otherwise it also contains (refined_box_encodings, class_predictions_with_background, num_proposals, proposal_boxes, and optionally mask_predictions).
Returns:
detections: a dictionary containing the following fields
detection_boxes: [batch, max_detections, 4], the coordinates of the detection boxes
detection_scores: [batch, max_detections], the scores of the detection boxes
detection_classes: [batch, max_detections], the classes of the detection boxes; only created when rpn_mode=False.
num_detections: [batch]
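
A rough picture of what happens to the class logits during second-stage postprocessing: they are reshaped per image, passed through a score conversion function, and the background column is dropped. The NumPy sketch below assumes softmax as the conversion function and leaves out box decoding and non-max suppression; all names and sizes are illustrative.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

batch_size, max_num_proposals, num_classes = 2, 3, 4
rng = np.random.default_rng(0)
# Raw logits laid out as [total_num_proposals, num_classes + 1].
class_predictions_with_background = rng.normal(
    size=(batch_size * max_num_proposals, num_classes + 1))

# Back to one row of proposals per image.
batched = class_predictions_with_background.reshape(
    batch_size, max_num_proposals, num_classes + 1)
scores_with_background = softmax(batched)
# Background sits at class index 0, so slice it off.
class_scores = scores_with_background[:, :, 1:]
best_class = class_scores.argmax(axis=-1)
best_score = class_scores.max(axis=-1)
print(class_scores.shape, best_class.shape)
```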

  • loss(self, prediction_dict, scope=None)
def loss(self, prediction_dict, scope=None):
 
    with tf.name_scope(scope, 'Loss', prediction_dict.values()):
      (groundtruth_boxlists, groundtruth_classes_with_background_list,
       groundtruth_masks_list
      ) = self._format_groundtruth_data(prediction_dict['image_shape'])
      loss_dict = self._loss_rpn(
          prediction_dict['rpn_box_encodings'],
          prediction_dict['rpn_objectness_predictions_with_background'],
          prediction_dict['anchors'],
          groundtruth_boxlists,
          groundtruth_classes_with_background_list)
      if not self._first_stage_only:
        loss_dict.update(
            self._loss_box_classifier(
                prediction_dict['refined_box_encodings'],
                prediction_dict['class_predictions_with_background'],
                prediction_dict['proposal_boxes'],
                prediction_dict['num_proposals'],
                groundtruth_boxlists,
                groundtruth_classes_with_background_list,
                prediction_dict['image_shape'],
                prediction_dict.get('mask_predictions'),
                groundtruth_masks_list,
            ))
    return loss_dict

If first_stage_only=True, only the RPN losses (rpn_localization_loss and rpn_objectness_loss) are computed. Otherwise all the losses are computed.
Args:
prediction_dict: a dictionary holding all the prediction results. When first_stage_only=True, it contains (rpn_box_encodings, rpn_objectness_predictions_with_background, rpn_features_to_crop, image_shape, anchors); otherwise it also contains (refined_box_encodings, class_predictions_with_background, num_proposals, proposal_boxes, and optionally mask_predictions).
scope: optional name scope for the loss ops
Returns:
a dictionary containing (first_stage_localization_loss,
first_stage_objectness_loss, second_stage_localization_loss,
second_stage_classification_loss)
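
For intuition about the localization term, here is a NumPy sketch of a smooth L1 loss and of summing a dictionary of weighted losses. It assumes the common smooth L1 form with the switch point at 1.0; the weights and the 0.3 objectness value are made-up numbers, not anything read from the library.

```python
import numpy as np

def smooth_l1(prediction, target):
    """Smooth L1 per box: quadratic for small errors, linear for large ones."""
    diff = np.abs(prediction - target)
    per_coord = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return per_coord.sum(axis=-1)

predicted_encodings = np.array([[0.5, 0.0, 2.0, 0.0]])
target_encodings = np.zeros((1, 4))
loc_loss = float(smooth_l1(predicted_encodings, target_encodings).mean())

# Hypothetical weights, standing in for the *_loss_weight constructor arguments.
loc_weight, obj_weight = 2.0, 1.0
loss_dict = {
    'first_stage_localization_loss': loc_weight * loc_loss,
    'first_stage_objectness_loss': obj_weight * 0.3,  # placeholder value
}
total = sum(loss_dict.values())
print(loc_loss, total)
```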

  • restore_map(self, from_detection_checkpoint=True)
def restore_map(self, from_detection_checkpoint=True):
    
    if not from_detection_checkpoint:
      return self._feature_extractor.restore_from_classification_checkpoint_fn(
          self.first_stage_feature_extractor_scope,
          self.second_stage_feature_extractor_scope)

    variables_to_restore = tf.global_variables()
    variables_to_restore.append(slim.get_or_create_global_step())
    # Only load feature extractor variables to be consistent with loading from
    # a classification checkpoint.
    feature_extractor_variables = tf.contrib.framework.filter_variables(
        variables_to_restore,
        include_patterns=[self.first_stage_feature_extractor_scope,
                          self.second_stage_feature_extractor_scope])
    return {var.op.name: var for var in feature_extractor_variables}

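Builds the map of variables to load from an external checkpoint.
Args:
from_detection_checkpoint: whether to restore from a full detection model checkpoint, or from a classification model checkpoint to initialize from pretraining
Returns:
a dictionary mapping variable names to the variables to restore from the checkpoint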
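
The filtering step can be imitated without TensorFlow. The sketch below assumes that include patterns are matched against variable names with a regex search, and it uses plain strings and placeholder objects instead of real variables; the variable names are invented for illustration.

```python
import re

def filter_names(names, include_patterns):
    """Keep names that match at least one include pattern."""
    return [n for n in names
            if any(re.search(p, n) for p in include_patterns)]

# Placeholder "variables": in the real model these would be tf.Variable objects.
variables = {
    'FirstStageFeatureExtractor/conv1/weights': object(),
    'SecondStageFeatureExtractor/block4/weights': object(),
    'FirstStageBoxPredictor/weights': object(),
    'global_step': object(),
}
scopes = ['FirstStageFeatureExtractor', 'SecondStageFeatureExtractor']

restore_map = {name: variables[name]
               for name in filter_names(variables, scopes)}
print(sorted(restore_map))
```

Only the two feature extractor entries survive, which matches the comment in the code about staying consistent with loading from a classification checkpoint.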
