Building the Augmented PASCAL VOC Semantic Segmentation Dataset (For PyTorch)

1. Dataset Overview

The augmented PASCAL VOC semantic segmentation dataset combines two parts: the PASCAL VOC 2012 dataset and the Semantic Boundaries Dataset (SBD). SBD provides annotations for 11,355 images taken from the PASCAL VOC 2011 dataset; its label files are in .mat format and its classes are the same as in PASCAL VOC:

  • person
  • bird, cat, cow, dog, horse, sheep
  • aeroplane, bicycle, boat, bus, car, motorbike, train
  • bottle, chair, dining table, potted plant, sofa, tv/monitor

PASCAL VOC 2012 directory structure:

  • Annotations: XML files with labels for detection, classification, and other tasks
  • ImageSets: definitions of the training/validation/test splits
  • JPEGImages: original images
  • SegmentationClass: semantic segmentation labels (RGB)
  • SegmentationObject: instance segmentation labels (RGB)

Semantic Boundaries Dataset directory structure:

  • img: original images
  • cls: semantic segmentation labels (.mat)
  • inst: instance segmentation labels (.mat)
  • train.txt: indices of the 8498 training images
  • val.txt: indices of the 2857 validation images

PS: This article covers only the semantic segmentation part.

2. Downloading the Datasets

  • PASCAL VOC 2012 download page: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html
    The semantic segmentation labels it ships are RGB images and need an extra conversion to grayscale (class-index) images.

  • Semantic Boundaries Dataset download page:
    http://home.bharathh.info/pubs/codes/SBD/download.html
    Its segmentation labels are .mat files and need to be converted into the same grayscale format as PASCAL VOC (a quick way to inspect one label is sketched below). Since all of the source images it uses are already included in PASCAL VOC 2012, only the label part is needed.

數(shù)據(jù)集下載完成后解壓如下:

"""
VOCdevkit
    ├─VOC2012
    |   ├─Annotations
    |   ├─ImageSets
    |   ├─JPEGImages
    |   ├─SegmentationClass
    |   ├─SegmentationObject
    |   └─SemanticBoundaries
    |       ├─cls
    |       ├─inst
    |       ├─train.txt
    |       └─val.txt
    └─generate_aug_data.py
"""

3. Dataset Labels

labels = [
    #           class name            id    trainId         color
    Label(  'background'            ,  0 ,        0 , (   0,   0,   0) ),
    Label(  'aeroplane'             ,  1 ,        1 , ( 128,   0,   0) ),
    Label(  'bicycle'               ,  2 ,        2 , (   0, 128,   0) ),
    Label(  'bird'                  ,  3 ,        3 , ( 128, 128,   0) ),
    Label(  'boat'                  ,  4 ,        4 , (   0,   0, 128) ),
    Label(  'bottle'                ,  5 ,        5 , ( 128,   0, 128) ),
    Label(  'bus'                   ,  6 ,        6 , (   0, 128, 128) ),
    Label(  'car'                   ,  7 ,        7 , ( 128, 128, 128) ),
    Label(  'cat'                   ,  8 ,        8 , (  64,   0,   0) ),
    Label(  'chair'                 ,  9 ,        9 , ( 192,   0,   0) ),
    Label(  'cow'                   , 10 ,       10 , (  64, 128,   0) ),
    Label(  'dining table'          , 11 ,       11 , ( 192, 128,   0) ),
    Label(  'dog'                   , 12 ,       12 , (  64,   0, 128) ),
    Label(  'horse'                 , 13 ,       13 , ( 192,   0, 128) ),
    Label(  'motorbike'             , 14 ,       14 , (  64, 128, 128) ),
    Label(  'person'                , 15 ,       15 , ( 192, 128, 128) ),
    Label(  'potted plant'          , 16 ,       16 , (   0,  64,   0) ),
    Label(  'sheep'                 , 17 ,       17 , ( 128,  64,   0) ),
    Label(  'sofa'                  , 18 ,       18 , (   0, 192,   0) ),
    Label(  'train'                 , 19 ,       19 , ( 128, 192,   0) ),
    Label(  'tv monitor'            , 20 ,       20 , (   0,  64, 128) ),
    Label(  'bordering region'      , 255,       21 , ( 224, 224, 192) ),
]

PS: The PASCAL VOC segmentation labels mark object boundaries as a bordering region (id 255), meaning those pixels may belong to any class; they are ignored when computing accuracy. The Semantic Boundaries Dataset contains no bordering region.
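
Because pixels with trainId 21 must not contribute to training or evaluation, a PyTorch loss can simply skip them. A minimal sketch (the tensors and shapes below are illustrative, not part of the generation script):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(ignore_index=21)  # skip the bordering region
logits = torch.randn(2, 21, 256, 256)             # N x C x H x W, 21 predicted classes (ids 0-20)
target = torch.randint(0, 22, (2, 256, 256))      # masks may contain 21 on object borders
loss = criterion(logits, target)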

4. Generating the Dataset

PS: The training/validation/test split follows DeepLab, i.e.
train = (sbd_train | sbd_val | voc_train) - voc_val; the validation and test sets are the same as in PASCAL VOC 2012.
The generated data list files have the format:
2007_000032
2007_000039
2007_000063
2007_000068
2007_000121
2007_000170
...
Images live in the JPEGImages directory, e.g. 2007_000032.jpg;
masks live in the SegmentationClassAug directory, e.g. 2007_000032_trainIds.png.

import os
import sys
import re
import shutil
import numpy as np
from PIL import Image
import scipy.io
from collections import namedtuple


Label = namedtuple( 'Label' , [
    'name'        , # The identifier of this label, e.g. 'car', 'person', ... .
                    # We use them to uniquely name a class

    'id'          , # An integer ID that is associated with this label.
                    # The IDs are used to represent the label in ground truth images
                    # An ID of -1 means that this label does not have an ID and thus
                    # is ignored when creating ground truth images (e.g. license plate).
                    # Do not modify these IDs, since exactly these IDs are expected by the
                    # evaluation server.

    'trainId'     , # Feel free to modify these IDs as suitable for your method. Then create
                    # ground truth images with train IDs, using the tools provided in the
                    # 'preparation' folder. However, make sure to validate or submit results
                    # to our evaluation server using the regular IDs above!
                    # For trainIds, multiple labels might have the same ID. Then, these labels
                    # are mapped to the same class in the ground truth images. For the inverse
                    # mapping, we use the label that is defined first in the list below.
                    # For example, mapping all void-type classes to the same ID in training,
                    # might make sense for some approaches.
                    # Max value is 255!

    'color'       , # The color of this label
    ] )
labels = [
    #       name                     id    trainId   color
    Label(  'background'            ,  0 ,        0 , (   0,   0,   0) ),
    Label(  'aeroplane'             ,  1 ,        1 , ( 128,   0,   0) ),
    Label(  'bicycle'               ,  2 ,        2 , (   0, 128,   0) ),
    Label(  'bird'                  ,  3 ,        3 , ( 128, 128,   0) ),
    Label(  'boat'                  ,  4 ,        4 , (   0,   0, 128) ),
    Label(  'bottle'                ,  5 ,        5 , ( 128,   0, 128) ),
    Label(  'bus'                   ,  6 ,        6 , (   0, 128, 128) ),
    Label(  'car'                   ,  7 ,        7 , ( 128, 128, 128) ),
    Label(  'cat'                   ,  8 ,        8 , (  64,   0,   0) ),
    Label(  'chair'                 ,  9 ,        9 , ( 192,   0,   0) ),
    Label(  'cow'                   , 10 ,       10 , (  64, 128,   0) ),
    Label(  'dining table'          , 11 ,       11 , ( 192, 128,   0) ),
    Label(  'dog'                   , 12 ,       12 , (  64,   0, 128) ),
    Label(  'horse'                 , 13 ,       13 , ( 192,   0, 128) ),
    Label(  'motorbike'             , 14 ,       14 , (  64, 128, 128) ),
    Label(  'person'                , 15 ,       15 , ( 192, 128, 128) ),
    Label(  'potted plant'          , 16 ,       16 , (   0,  64,   0) ),
    Label(  'sheep'                 , 17 ,       17 , ( 128,  64,   0) ),
    Label(  'sofa'                  , 18 ,       18 , (   0, 192,   0) ),
    Label(  'train'                 , 19 ,       19 , ( 128, 192,   0) ),
    Label(  'tv monitor'            , 20 ,       20 , (   0,  64, 128) ),
    Label(  'bordering region'      , 255,       21 , ( 224, 224, 192) ),
]


####################################################################################
num_classes = 22
unspecified_id = num_classes - 1
train_id = list()
valid_labels = dict()
color_palette = list()
id_key = list()
id_mapping = list()
for label in labels:
    train_id.append(label.trainId)
    valid_labels[label.name] = label.id
    color_palette += list(label.color)
    # encoder: r<<16 + g<<8 + b
    id_key.append(label.trainId)
    encoder = (label.color[0] << 16) + (label.color[1] << 8) + label.color[2]
    id_mapping.append(encoder)
assert list(train_id) == sorted(train_id) and len(train_id) == num_classes
assert len(color_palette) == (num_classes * 3)
temp = list(zip(id_mapping, id_key))
temp.sort()
temp = list(zip(*temp))
id_key = np.array(temp[1], dtype='int')
id_mapping = np.array(temp[0], dtype='int')
print('valid class: ', valid_labels)
print('train_id: ', train_id)
print('unspecified_id: ', unspecified_id)
print('color_palette: ', color_palette)
print('id_key: ', id_key)
print('id_mapping: ', id_mapping)
"""
valid class:  {'background': 0, 'aeroplane': 1, 'bicycle': 2, 'bird': 3, 'boat': 4, 'bottle': 5, 'bus': 6, 'car': 7, 'cat': 8, 'chair': 9, 'cow': 10, 'dining table': 11, 'dog': 12, 'horse': 13, 'motorbike': 14, 'person': 15, 'potted plant': 16, 'sheep': 17, 'sofa': 18, 'train': 19, 'tv monitor': 20, 'bordering region': 255}
train_id:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
unspecified_id:  21
color_palette:  [0, 0, 0, 128, 0, 0, 0, 128, 0, 128, 128, 0, 0, 0, 128, 128, 0, 128, 0, 128, 128, 128, 128, 128, 64, 0, 0, 192, 0, 0, 64, 128, 0, 192, 128, 0, 64, 0, 128, 192, 0, 128, 64, 128, 128, 192, 128, 128, 0, 64, 0, 128, 64, 0, 0, 192, 0, 128, 192, 0, 0, 64, 128, 224, 224, 192]
id_key:  [ 0  4 16 20  2  6 18  8 12 10 14  1  5 17  3  7 19  9 13 11 15 21]
id_mapping:  [       0      128    16384    16512    32768    32896    49152  4194304
  4194432  4227072  4227200  8388608  8388736  8404992  8421376  8421504
  8437760 12582912 12583040 12615680 12615808 14737600]
"""


####################################################################################
# Path of PASCAL VOC 2012 Dataset + Semantic Boundaries Dataset
"""
VOCdevkit
    ├─VOC2012
    |   ├─ImageSets
    |   ├─JPEGImages
    |   ├─SegmentationClass
    |   └─SemanticBoundaries
    |       ├─cls
    |       ├─img
    |       ├─inst
    |       ├─train.txt
    |       └─val.txt
    └─generate_aug_data.py

data_list_file:
img_name0
img_name1
img_name2
...
"""
data_dir = os.path.abspath(os.path.dirname(__file__))
img_dir = os.path.join(data_dir, 'VOC2012/JPEGImages')
mat_dir = os.path.join(data_dir, 'VOC2012/SemanticBoundaries/cls')
voc_img_sets_dir = os.path.join(data_dir, 'VOC2012/ImageSets/Segmentation')
sbd_img_sets_dir = os.path.join(data_dir, 'VOC2012/SemanticBoundaries')
voc_mask_dir = os.path.join(data_dir, 'VOC2012/SegmentationClass')
aug_mask_dir = os.path.join(data_dir, 'VOC2012/SegmentationClassAug')
if not os.path.exists(aug_mask_dir):
    os.mkdir(aug_mask_dir)


####################################################################################
# convert .mat to .png
print()
i = 0
for mat_file in os.listdir(mat_dir):
    match = re.match(r'^(\d+_\d+).mat$', mat_file)
    if match:
        img = match.groups()[0]
        mat = scipy.io.loadmat(os.path.join(mat_dir, mat_file), mat_dtype=True, squeeze_me=True, struct_as_record=False)
        assert np.max(mat['GTcls'].Segmentation) < unspecified_id # no bordering region
        mask = Image.fromarray(mat['GTcls'].Segmentation)
        mask.save(os.path.join(aug_mask_dir, img + '_trainIds.png'))
        mask.putpalette(color_palette)
        mask.save(os.path.join(aug_mask_dir, img + '.png'))
        i += 1
        print('\rConverting .mat to .png: %d' % i, end='')
        sys.stdout.flush()

# copy voc to aug
print()
i = 0
for mask_file in os.listdir(voc_mask_dir):
    match = re.match(r'^(\d+_\d+).png$', mask_file)
    if match:
        img = match.groups()[0]
        # copy voc to aug
        shutil.copyfile(os.path.join(voc_mask_dir, mask_file), os.path.join(aug_mask_dir, mask_file))
        mask = np.array(Image.open(os.path.join(aug_mask_dir, mask_file)).convert('RGB'), dtype=np.uint32)
        # encoder: r<<16 + g<<8 + b
        encoder = np.left_shift(mask[:, :, 0], 16) + np.left_shift(mask[:, :, 1], 8) + mask[:, :, 2]
        index = np.digitize(encoder.ravel(), id_mapping, right=True)
        new_mask = id_key[index].reshape(encoder.shape).astype('uint8')
        new_mask = Image.fromarray(new_mask)
        new_mask.save(os.path.join(aug_mask_dir, img + '_trainIds.png'))
        i += 1
        print('\rCopying voc to aug: %d' % i, end=' ')
        sys.stdout.flush()


####################################################################################
print()
with open(os.path.join(voc_img_sets_dir, 'train.txt')) as f:
    img_sets = f.readlines()
    voc_train = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(voc_train)
with open(os.path.join(voc_img_sets_dir, 'val.txt')) as f:
    img_sets = f.readlines()
    voc_val = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(voc_val)
with open(os.path.join(sbd_img_sets_dir, 'train.txt')) as f:
    img_sets = f.readlines()
    sbd_train = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(sbd_train)
with open(os.path.join(sbd_img_sets_dir, 'val.txt')) as f:
    img_sets = f.readlines()
    sbd_val = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(sbd_val)

aug_train = (sbd_train | sbd_val | voc_train) - voc_val
aug_trainval = aug_train | voc_val
# check
for item in aug_trainval:
    img = os.path.join(img_dir, item + '.jpg')
    mask = os.path.join(aug_mask_dir, item + '_trainIds.png')
    assert os.path.exists(img) and os.path.exists(mask)

# create data list
with open(os.path.join(data_dir, 'train_aug.txt'), 'w') as train:
    for line in aug_train:
        train.write(str(line) + '\n')
    print('Created train data list ({}) in {}.'.format(len(aug_train), data_dir))
with open(os.path.join(data_dir, 'trainval_aug.txt'), 'w') as trainval:
    for line in aug_trainval:
        trainval.write(str(line) + '\n')
    print('Created trainval data list ({}) in {}.'.format(len(aug_trainval), data_dir))


####################################################################################
# compute class weights
print()
class_count = np.zeros(num_classes, dtype='int64')
# Get the total number of pixels in all train masks for each class
for i, img in enumerate(aug_train, 1):
    mask = np.array(Image.open(os.path.join(aug_mask_dir, img + '_trainIds.png')))
    class_count += np.histogram(mask, bins=np.arange(num_classes + 1))[0]
    print('\rComputing class weight: %d' % i, end=' ')
    sys.stdout.flush()

# including unspecified_id
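# Inverse log-frequency weighting: w_c = 1 / ln(1.02 + p_c), the scheme used in ENet.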
class_p_unspecified = class_count / np.sum(class_count.astype(np.int64))
class_weight_unspecified = 1 / np.log(1.02 + class_p_unspecified)
# excluding unspecified_id
class_p = class_count[:-1] / np.sum(class_count[:-1].astype(np.int64))
class_weight = 1 / np.log(1.02 + class_p)

def array2string(array, format='%.6f'):
    return ', '.join([format % i for i in array])

print()
with open(os.path.join(data_dir, 'args_aug.txt'), 'w') as f:
    # valid_labels
    f.writelines('valid class:\n')
    f.writelines('{}\n\n'.format(valid_labels))
    # unspecified_id
    f.writelines('unspecified_id: {}\n\n'.format(unspecified_id))
    # train_id
    f.writelines('train_id:\n')
    f.writelines(array2string(train_id, '%d') + '\n\n')
    # class_count
    f.writelines('pixel counts for each class:\n')
    f.writelines(array2string(class_count, '%d') + '\n\n')
    # class_p_unspecified
    f.writelines('class probability including unspecified_id:\n')
    f.writelines(array2string(class_p_unspecified) + '\n\n')
    # class_weight_unspecified
    f.writelines('class weight including unspecified_id:\n')
    f.writelines(array2string(class_weight_unspecified) + '\n\n')
    # class_p
    f.writelines('class probability excluding unspecified_id:\n')
    f.writelines(array2string(class_p) + '\n\n')
    # class_weight
    f.writelines('class weight excluding unspecified_id:\n')
    f.writelines(array2string(class_weight) + '\n\n')
print('Generated class weight in {}.'.format(os.path.join(data_dir, 'args_aug.txt')))
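
With the data lists, the *_trainIds.png masks, and args_aug.txt in place, the data can be consumed from PyTorch. A minimal Dataset sketch (the class name, normalization, and lack of augmentation are illustrative choices, not part of the script above):

import os
import numpy as np
from PIL import Image
import torch
from torch.utils.data import Dataset

class VOCAugSegmentation(Dataset):
    """Loads (image, trainId mask) pairs produced by generate_aug_data.py."""
    def __init__(self, root, list_file='train_aug.txt'):
        with open(os.path.join(root, list_file)) as f:
            self.names = [line.strip() for line in f if line.strip()]
        self.img_dir = os.path.join(root, 'VOC2012/JPEGImages')
        self.mask_dir = os.path.join(root, 'VOC2012/SegmentationClassAug')

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.img_dir, name + '.jpg')).convert('RGB')
        mask = Image.open(os.path.join(self.mask_dir, name + '_trainIds.png'))
        image = torch.from_numpy(np.array(image, dtype=np.float32).transpose(2, 0, 1)) / 255.0
        mask = torch.from_numpy(np.array(mask, dtype=np.int64))
        return image, mask

# Usage sketch: pair the Dataset with a loss that ignores the bordering region and,
# optionally, weights classes with the values written to args_aug.txt:
#   dataset = VOCAugSegmentation('VOCdevkit')
#   criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor(class_weight, dtype=torch.float32),
#                                         ignore_index=21)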
最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
  • 序言:七十年代末东羹,一起剝皮案震驚了整個濱河市,隨后出現(xiàn)的幾起案子忠烛,更是在濱河造成了極大的恐慌属提,老刑警劉巖,帶你破解...
    沈念sama閱讀 218,284評論 6 506
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件美尸,死亡現(xiàn)場離奇詭異冤议,居然都是意外死亡,警方通過查閱死者的電腦和手機师坎,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 93,115評論 3 395
  • 文/潘曉璐 我一進店門恕酸,熙熙樓的掌柜王于貴愁眉苦臉地迎上來,“玉大人胯陋,你說我怎么就攤上這事蕊温。” “怎么了遏乔?”我有些...
    開封第一講書人閱讀 164,614評論 0 354
  • 文/不壞的土叔 我叫張陵义矛,是天一觀的道長。 經(jīng)常有香客問我盟萨,道長凉翻,這世上最難降的妖魔是什么? 我笑而不...
    開封第一講書人閱讀 58,671評論 1 293
  • 正文 為了忘掉前任捻激,我火速辦了婚禮制轰,結(jié)果婚禮上,老公的妹妹穿的比我還像新娘胞谭。我一直安慰自己垃杖,他們只是感情好,可當(dāng)我...
    茶點故事閱讀 67,699評論 6 392
  • 文/花漫 我一把揭開白布丈屹。 她就那樣靜靜地躺著缩滨,像睡著了一般。 火紅的嫁衣襯著肌膚如雪。 梳的紋絲不亂的頭發(fā)上脉漏,一...
    開封第一講書人閱讀 51,562評論 1 305
  • 那天,我揣著相機與錄音袖牙,去河邊找鬼侧巨。 笑死,一個胖子當(dāng)著我的面吹牛鞭达,可吹牛的內(nèi)容都是我干的司忱。 我是一名探鬼主播,決...
    沈念sama閱讀 40,309評論 3 418
  • 文/蒼蘭香墨 我猛地睜開眼畴蹭,長吁一口氣:“原來是場噩夢啊……” “哼坦仍!你這毒婦竟也來了?” 一聲冷哼從身側(cè)響起叨襟,我...
    開封第一講書人閱讀 39,223評論 0 276
  • 序言:老撾萬榮一對情侶失蹤繁扎,失蹤者是張志新(化名)和其女友劉穎,沒想到半個月后糊闽,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體梳玫,經(jīng)...
    沈念sama閱讀 45,668評論 1 314
  • 正文 獨居荒郊野嶺守林人離奇死亡,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點故事閱讀 37,859評論 3 336
  • 正文 我和宋清朗相戀三年右犹,在試婚紗的時候發(fā)現(xiàn)自己被綠了提澎。 大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
    茶點故事閱讀 39,981評論 1 348
  • 序言:一個原本活蹦亂跳的男人離奇死亡念链,死狀恐怖盼忌,靈堂內(nèi)的尸體忽然破棺而出,到底是詐尸還是另有隱情掂墓,我是刑警寧澤谦纱,帶...
    沈念sama閱讀 35,705評論 5 347
  • 正文 年R本政府宣布,位于F島的核電站梆暮,受9級特大地震影響服协,放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜啦粹,卻給世界環(huán)境...
    茶點故事閱讀 41,310評論 3 330
  • 文/蒙蒙 一偿荷、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧唠椭,春花似錦跳纳、人聲如沸。這莊子的主人今日做“春日...
    開封第一講書人閱讀 31,904評論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽。三九已至,卻和暖如春斗塘,著一層夾襖步出監(jiān)牢的瞬間赢织,已是汗流浹背。 一陣腳步聲響...
    開封第一講書人閱讀 33,023評論 1 270
  • 我被黑心中介騙來泰國打工馍盟, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留于置,地道東北人。 一個月前我還...
    沈念sama閱讀 48,146評論 3 370
  • 正文 我出身青樓贞岭,卻偏偏與公主長得像八毯,于是被迫代替她去往敵國和親。 傳聞我的和親對象是個殘疾皇子瞄桨,可洞房花燭夜當(dāng)晚...
    茶點故事閱讀 44,933評論 2 355

推薦閱讀更多精彩內(nèi)容