object_detection API Source-Reading Notes (13: Slightly Modifying the feature_extractor)

Writing this as a study record, and to record some mixed feelings. I never finished reading the whole project, and I've actually already forgotten many of its workflows, but...

An extremely simple modification worked.

Use vgg_a from vgg.py as the feature extractor for the whole model.

The code is as follows:

import tensorflow as tf

slim = tf.contrib.slim


def vgg_a(inputs,
          num_classes=1000,
          is_training=True,
          dropout_keep_prob=0.5,
          spatial_squeeze=True,
          scope='vgg_a',
          fc_conv_padding='VALID',
          global_pool=False,
          stride=8):
  # The classification head of the original slim vgg_a has been stripped;
  # num_classes, dropout_keep_prob, spatial_squeeze, fc_conv_padding and
  # global_pool are kept only so the signature stays compatible.
  with tf.variable_scope(scope, 'vgg_a', [inputs]) as sc:
    end_points_collection = sc.original_name_scope + '_end_points'
    # Collect outputs for conv2d, fully_connected and max_pool2d.
    with slim.arg_scope([slim.conv2d, slim.max_pool2d],
                        outputs_collections=end_points_collection):
      # The collection is converted before any op has been added to it, so
      # only the pool endpoints assigned by hand below end up in the dict.
      end_points = slim.utils.convert_collection_to_dict(end_points_collection)
      net = slim.repeat(inputs, 1, slim.conv2d, 64, [3, 3], scope='conv1')
      net = slim.max_pool2d(net, [2, 2], scope='pool1')
      end_points['pool1'] = net
      net = slim.repeat(net, 1, slim.conv2d, 128, [3, 3], scope='conv2')
      net = slim.max_pool2d(net, [2, 2], scope='pool2')
      end_points['pool2'] = net
      net = slim.repeat(net, 2, slim.conv2d, 256, [3, 3], scope='conv3')
      net = slim.max_pool2d(net, [2, 2], scope='pool3')
      end_points['pool3'] = net
      # Return early so the output stride matches the requested
      # first-stage stride: pool3 is stride 8, pool4 is stride 16.
      if stride == 8:
        return net, end_points
      net = slim.repeat(net, 2, slim.conv2d, 512, [3, 3], scope='conv4')
      net = slim.max_pool2d(net, [2, 2], scope='pool4')
      end_points['pool4'] = net
      if stride == 16:
        return net, end_points
      net = slim.repeat(net, 2, slim.conv2d, 512, [3, 3], scope='conv5')
      net = slim.max_pool2d(net, [2, 2], scope='pool5')
      end_points['pool5'] = net
      return net, end_points

The return values here are the output of the last layer that was built and the end_points dict.
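
A quick sanity check of the stride-based early return (a minimal sketch, assuming TF 1.x with tf.contrib.slim; the 224x224 input size is only illustrative):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [1, 224, 224, 3])
net, end_points = vgg_a(inputs, stride=16)
# With stride=16 the function returns right after pool4,
# so 224 / 2**4 = 14 and net is [1, 14, 14, 512].
print(net.get_shape().as_list())   # [1, 14, 14, 512]
print(sorted(end_points.keys()))   # ['pool1', 'pool2', 'pool3', 'pool4']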

Modifying the Resnet V1 Faster R-CNN implementation

I didn't dare modify anything else. I just made a simple change to the Resnet V1 Faster R-CNN implementation, following its test file; everything else I left untouched.



"""Resnet V1 Faster R-CNN implementation.

See "Deep Residual Learning for Image Recognition" by He et al., 2015.
https://arxiv.org/abs/1512.03385

Note: this implementation assumes that the classification checkpoint used
to finetune this model is trained using the same configuration as that of
the MSRA provided checkpoints
(see https://github.com/KaimingHe/deep-residual-networks), e.g., with
same preprocessing, batch norm scaling, etc.
"""
import tensorflow as tf

from object_detection.meta_architectures import faster_rcnn_meta_arch
from nets import resnet_utils
from nets import resnet_v1
from nets import vgg
slim = tf.contrib.slim


class FasterRCNNResnetV1FeatureExtractor(
    faster_rcnn_meta_arch.FasterRCNNFeatureExtractor):
  """Faster R-CNN Resnet V1 feature extractor implementation."""

  def __init__(self,
               architecture,
               resnet_model,
               is_training,
               first_stage_features_stride,
               batch_norm_trainable=False,
               reuse_weights=None,
               weight_decay=0.0):
    """Constructor.

    Args:
      architecture: Architecture name of the Resnet V1 model.
      resnet_model: Definition of the Resnet V1 model.
      is_training: See base class.
      first_stage_features_stride: See base class.
      batch_norm_trainable: See base class.
      reuse_weights: See base class.
      weight_decay: See base class.

    Raises:
      ValueError: If `first_stage_features_stride` is not 8 or 16.
    """
    if first_stage_features_stride != 8 and first_stage_features_stride != 16:
      raise ValueError('`first_stage_features_stride` must be 8 or 16.')
    self._architecture = architecture
    self._resnet_model = resnet_model
    super(FasterRCNNResnetV1FeatureExtractor, self).__init__(
        is_training, first_stage_features_stride, batch_norm_trainable,
        reuse_weights, weight_decay)

  def preprocess(self, resized_inputs):
    """Faster R-CNN Resnet V1 preprocessing.

    VGG style channel mean subtraction as described here:
    https://gist.github.com/ksimonyan/211839e770f7b538e2d8#file-readme-md

    Args:
      resized_inputs: A [batch, height_in, width_in, channels] float32 tensor
        representing a batch of images with values between 0 and 255.0.

    Returns:
      preprocessed_inputs: A [batch, height_out, width_out, channels] float32
        tensor representing a batch of images.

    """
    channel_means = [123.68, 116.779, 103.939]
    return resized_inputs - [[channel_means]]

  def _extract_proposal_features(self, preprocessed_inputs, scope):
    """Extracts first stage RPN features.

    Args:
      preprocessed_inputs: A [batch, height, width, channels] float32 tensor
        representing a batch of images.
      scope: A scope name.

    Returns:
      rpn_feature_map: A tensor with shape [batch, height, width, depth]
      activations: A dictionary mapping feature extractor tensor names to
        tensors

    Raises:
      InvalidArgumentError: If the spatial size of `preprocessed_inputs`
        (height or width) is less than 33.
      ValueError: If the created network is missing the required activation.
    """
    if len(preprocessed_inputs.get_shape().as_list()) != 4:
      raise ValueError('`preprocessed_inputs` must be 4 dimensional, got a '
                       'tensor of shape %s' % preprocessed_inputs.get_shape())
    shape_assert = tf.Assert(
        tf.logical_and(
            tf.greater_equal(tf.shape(preprocessed_inputs)[1], 33),
            tf.greater_equal(tf.shape(preprocessed_inputs)[2], 33)),
        ['image size must at least be 33 in both height and width.'])

    with tf.control_dependencies([shape_assert]):
      # Disables batchnorm for fine-tuning with smaller batch sizes.
      # TODO(chensun): Figure out if it is needed when image
      # batch size is bigger.
      with slim.arg_scope(vgg.vgg_arg_scope(weight_decay=self._weight_decay)):
        net, end_points = vgg.vgg_a(preprocessed_inputs,
                                    num_classes=1000,
                                    is_training=True,
                                    dropout_keep_prob=0.5,
                                    spatial_squeeze=True,
                                    scope='vgg_a',
                                    fc_conv_padding='VALID',
                                    global_pool=False,
                                    stride=self._first_stage_features_stride)

    #   with slim.arg_scope(
    #       resnet_utils.resnet_arg_scope(
    #           batch_norm_epsilon=1e-5,
    #           batch_norm_scale=True,
    #           weight_decay=self._weight_decay)):
    #     with tf.variable_scope(
    #         self._architecture, reuse=self._reuse_weights) as var_scope:
    #       _, activations = self._resnet_model(
    #           preprocessed_inputs,
    #           num_classes=None,
    #           is_training=self._train_batch_norm,
    #           global_pool=False,
    #           output_stride=self._first_stage_features_stride,
    #           spatial_squeeze=False,
    #           scope=var_scope)
    #
    # handle = scope + '/%s/block1' % self._architecture
    return net, end_points

  def _extract_box_classifier_features(self, proposal_feature_maps, scope):
    """Extracts second stage box classifier features.

    Args:
      proposal_feature_maps: A 4-D float tensor with shape
        [batch_size * self.max_num_proposals, crop_height, crop_width, depth]
        representing the feature map cropped to each proposal.
      scope: A scope name (unused).

    Returns:
      proposal_classifier_features: A 4-D float tensor with shape
        [batch_size * self.max_num_proposals, height, width, depth]
        representing box classifier features for each proposal.
    """
    with slim.arg_scope(vgg.vgg_arg_scope(weight_decay=self._weight_decay)):
      proposal_classifier_features = slim.conv2d(
          proposal_feature_maps, num_outputs=2048, kernel_size=3,
          scope='conv8')
    # with tf.variable_scope(self._architecture, reuse=self._reuse_weights):
    #   with slim.arg_scope(
    #       resnet_utils.resnet_arg_scope(
    #           batch_norm_epsilon=1e-5,
    #           batch_norm_scale=True,
    #           weight_decay=self._weight_decay)):
    #     with slim.arg_scope([slim.batch_norm],
    #                         is_training=self._train_batch_norm):
    #       blocks = [
    #           resnet_utils.Block('block2', resnet_v1.bottleneck, [{
    #               'depth': 2048,
    #               'depth_bottleneck': 512,
    #               'stride': 1
    #           }] * 3)
    #       ]
    #       proposal_classifier_features = resnet_utils.stack_blocks_dense(
    #           proposal_feature_maps, blocks)
    return proposal_classifier_features


class FasterRCNNResnet101FeatureExtractor(FasterRCNNResnetV1FeatureExtractor):
  """Faster R-CNN Resnet 101 feature extractor implementation."""

  def __init__(self,
               is_training,
               first_stage_features_stride,
               batch_norm_trainable=False,
               reuse_weights=None,
               weight_decay=0.0):
    """Constructor.

    Args:
      is_training: See base class.
      first_stage_features_stride: See base class.
      batch_norm_trainable: See base class.
      reuse_weights: See base class.
      weight_decay: See base class.

    Raises:
      ValueError: If `first_stage_features_stride` is not 8 or 16,
        or if `architecture` is not supported.
    """
    super(FasterRCNNResnet101FeatureExtractor, self).__init__(
        'resnet_v1_101', resnet_v1.resnet_v1_101, is_training,
        first_stage_features_stride, batch_norm_trainable,
        reuse_weights, weight_decay)

The config file I used is the Resnet 101 one, so the other feature extractor classes can be removed.

The modification worked:

(Screenshots: the model graph, the training loss, the exported pb file, and a sample detection.)

This was only a quick change: in the test file I noticed that, surprisingly, the second stage of the Resnet V1 Faster R-CNN implementation does not change the spatial dimensions of its input, so I simply modified mine by following that test file.
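
In the spirit of those test files, a minimal smoke test of the modified extractor looks roughly like this (a sketch, not the API's actual test; it assumes TF 1.x and the modified file above):

import tensorflow as tf

from object_detection.models import faster_rcnn_resnet_v1_feature_extractor as frcnn

extractor = frcnn.FasterRCNNResnet101FeatureExtractor(
    is_training=False, first_stage_features_stride=16)
preprocessed = tf.placeholder(tf.float32, [1, 224, 224, 3])
# extract_proposal_features wraps _extract_proposal_features in a
# variable scope and returns (rpn_feature_map, activations).
rpn_features, _ = extractor.extract_proposal_features(
    preprocessed, scope='FirstStageFeatureExtractor')
# vgg_a with stride=16 returns the pool4 output: [1, 14, 14, 512].
print(rpn_features.get_shape().as_list())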

Update

Today I reworked the change again.
Create faster_rcnn_vgg_feature_extractor.py:

# Copyright 2017 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import tensorflow as tf

from object_detection.meta_architectures import faster_rcnn_meta_arch
from nets import vgg
slim = tf.contrib.slim


class FasterRCNNVGGFeatureExtractor(
    faster_rcnn_meta_arch.FasterRCNNFeatureExtractor):
  """Faster R-CNN Resnet V1 feature extractor implementation."""
  def __init__(self,
               is_training,
               first_stage_features_stride,
               batch_norm_trainable=False,
               reuse_weights=None,
               weight_decay=0.0):
    print('first_stage_features_stride', first_stage_features_stride)
    if first_stage_features_stride != 8 and first_stage_features_stride != 16:
      raise ValueError('`first_stage_features_stride` must be 8 or 16.')

    super(FasterRCNNVGGFeatureExtractor, self).__init__(
        is_training, first_stage_features_stride, batch_norm_trainable,
        reuse_weights, weight_decay)

  def preprocess(self, resized_inputs):
    # VGG-style per-channel (RGB) mean subtraction.
    channel_means = [123.68, 116.779, 103.939]
    return resized_inputs - [[channel_means]]

  def _extract_proposal_features(self, preprocessed_inputs, scope):

    if len(preprocessed_inputs.get_shape().as_list()) != 4:
      raise ValueError('`preprocessed_inputs` must be 4 dimensional, got a '
                       'tensor of shape %s' % preprocessed_inputs.get_shape())
    shape_assert = tf.Assert(
        tf.logical_and(
            tf.greater_equal(tf.shape(preprocessed_inputs)[1], 33),
            tf.greater_equal(tf.shape(preprocessed_inputs)[2], 33)),
        ['image size must at least be 33 in both height and width.'])

    with tf.control_dependencies([shape_assert]):
      # Disables batchnorm for fine-tuning with smaller batch sizes.
      # TODO(chensun): Figure out if it is needed when image
      # batch size is bigger.
      with slim.arg_scope(vgg.vgg_arg_scope(weight_decay=self._weight_decay)):
        net, end_points = vgg.vgg_a(preprocessed_inputs,
                                    num_classes=1000,
                                    is_training=True,
                                    dropout_keep_prob=0.5,
                                    spatial_squeeze=True,
                                    scope='vgg_a',
                                    fc_conv_padding='VALID',
                                    global_pool=False,
                                    stride=self._first_stage_features_stride)

    return net, end_points

  def _extract_box_classifier_features(self, proposal_feature_maps, scope):

    with slim.arg_scope(vgg.vgg_arg_scope(weight_decay=self._weight_decay)):
      net = slim.repeat(proposal_feature_maps, 2, slim.conv2d, 2048, [3, 3],
                        scope='conv8')
      net = slim.max_pool2d(net, [2, 2], scope='pool8')
      # Chain from `net`: the earlier draft passed proposal_feature_maps
      # into conv9 again, which silently discarded conv8/pool8.
      proposal_classifier_features = slim.repeat(net, 3, slim.conv2d, 64,
                                                 [3, 3], scope='conv9')

    return proposal_classifier_features
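
For reference, a rough shape walkthrough of this second stage under the config below (the batch size and proposal count are illustrative):

# Assuming batch_size=1 and 300 proposals, cropped from stride-16 features:
# ROI crops:          [300, 14, 14, 512]   initial_crop_size = 14
# meta-arch max pool: [300, 7, 7, 512]     maxpool_kernel_size/stride = 2
# conv8 (repeat 2):   [300, 7, 7, 2048]
# pool8:              [300, 3, 3, 2048]    2x2 'VALID' max pool, stride 2
# conv9 (repeat 3):   [300, 3, 3, 64]      -> proposal_classifier_features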

Register faster_rcnn_vgg in model_builder.py:

from object_detection.models import faster_rcnn_vgg_feature_extractor as frcnn_vgg


# Register it in the feature extractor class map.
FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP = {
    'faster_rcnn_nas':
    frcnn_nas.FasterRCNNNASFeatureExtractor,
    'faster_rcnn_vgg':
    frcnn_vgg.FasterRCNNVGGFeatureExtractor,
    'faster_rcnn_pnas':
    frcnn_pnas.FasterRCNNPNASFeatureExtractor,
    'faster_rcnn_inception_resnet_v2':
    frcnn_inc_res.FasterRCNNInceptionResnetV2FeatureExtractor,
    'faster_rcnn_inception_v2':
    frcnn_inc_v2.FasterRCNNInceptionV2FeatureExtractor,
    'faster_rcnn_resnet50':
    frcnn_resnet_v1.FasterRCNNResnet50FeatureExtractor,
    'faster_rcnn_resnet101':
    frcnn_resnet_v1.FasterRCNNResnet101FeatureExtractor,
    'faster_rcnn_resnet152':
    frcnn_resnet_v1.FasterRCNNResnet152FeatureExtractor,
}
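
The builder then resolves the `type` string from the config through this map, roughly as follows (a simplified sketch of what model_builder does, not its exact code):

def _build_frcnn_feature_extractor(feature_extractor_config, is_training):
  feature_type = feature_extractor_config.type
  if feature_type not in FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP:
    raise ValueError(
        'Unknown Faster R-CNN feature_extractor: {}'.format(feature_type))
  extractor_class = FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP[feature_type]
  return extractor_class(
      is_training, feature_extractor_config.first_stage_features_stride)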

Modify the config file:

model {
  faster_rcnn {
    num_classes: 20
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_vgg'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  #fine_tune_checkpoint: "faster_rcnn_vgg_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: false
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/voc2012_trian.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
}

eval_config: {
  num_examples: 100
  # Note: The below line limits the evaluation process to 100 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 100
  eval_interval_secs: 5
  #metrics_set:"pascal_voc_detection_metrics"
  metrics_set: "coco_detection_metrics"
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/voc2012_val.record"
  }
  label_map_path: "data/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1

}
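
To check that the config parses and that the new type string resolves, something like this works (assuming the file is saved as faster_rcnn_vgg.config; that filename is my own):

import tensorflow as tf
from google.protobuf import text_format
from object_detection.protos import pipeline_pb2

pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with tf.gfile.GFile('faster_rcnn_vgg.config', 'r') as f:
  text_format.Merge(f.read(), pipeline_config)
print(pipeline_config.model.faster_rcnn.feature_extractor.type)
# -> 'faster_rcnn_vgg'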

How to tune the parameters is something I'll have to work out slowly...
