Building the Augmented PASCAL VOC Semantic Segmentation Dataset (For PyTorch)

1. Dataset Overview

The augmented PASCAL VOC semantic segmentation dataset combines two parts: the PASCAL VOC 2012 dataset and the Semantic Boundaries Dataset (SBD). SBD provides annotations for 11,355 images taken from the PASCAL VOC 2011 dataset; its label files are in .mat format and its classes are the same as in PASCAL VOC:

  • person
  • bird, cat, cow, dog, horse, sheep
  • aeroplane, bicycle, boat, bus, car, motorbike, train
  • bottle, chair, dining table, potted plant, sofa, tv/monitor

PASCAL VOC 2012 directory structure:

  • Annotations: XML files with labels for detection, classification, and other tasks
  • ImageSets: definitions of the training/validation/test splits
  • JPEGImages: original images
  • SegmentationClass: semantic segmentation labels (RGB)
  • SegmentationObject: instance segmentation labels (RGB)

Semantic Boundaries Dataset directory structure:

  • img: original images
  • cls: semantic segmentation labels (.mat)
  • inst: instance segmentation labels (.mat)
  • train.txt: indices of the 8498 training images
  • val.txt: indices of the 2857 validation images

PS: This article covers only the semantic segmentation part.

2. Downloading the Datasets

  • PASCAL VOC 2012 download page: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html
    The semantic segmentation labels it ships are RGB images and need an extra conversion to grayscale (class-index) images.

  • Semantic Boundaries Dataset download page:
    http://home.bharathh.info/pubs/codes/SBD/download.html
    Its segmentation labels are .mat files and need to be converted into the same grayscale format as PASCAL VOC (a quick way to inspect one label is sketched below). Since all of the source images it uses are already included in PASCAL VOC 2012, only the label part is needed.

數(shù)據(jù)集下載完成后解壓如下:

"""
VOCdevkit
    ├─VOC2012
    |   ├─Annotations
    |   ├─ImageSets
    |   ├─JPEGImages
    |   ├─SegmentationClass
    |   ├─SegmentationObject
    |   └─SemanticBoundaries
    |       ├─cls
    |       ├─inst
    |       ├─train.txt
    |       └─val.txt
    └─generate_aug_data.py
"""

3. Dataset Labels

labels = [
    #           class name            id    trainId         color
    Label(  'background'            ,  0 ,        0 , (   0,   0,   0) ),
    Label(  'aeroplane'             ,  1 ,        1 , ( 128,   0,   0) ),
    Label(  'bicycle'               ,  2 ,        2 , (   0, 128,   0) ),
    Label(  'bird'                  ,  3 ,        3 , ( 128, 128,   0) ),
    Label(  'boat'                  ,  4 ,        4 , (   0,   0, 128) ),
    Label(  'bottle'                ,  5 ,        5 , ( 128,   0, 128) ),
    Label(  'bus'                   ,  6 ,        6 , (   0, 128, 128) ),
    Label(  'car'                   ,  7 ,        7 , ( 128, 128, 128) ),
    Label(  'cat'                   ,  8 ,        8 , (  64,   0,   0) ),
    Label(  'chair'                 ,  9 ,        9 , ( 192,   0,   0) ),
    Label(  'cow'                   , 10 ,       10 , (  64, 128,   0) ),
    Label(  'dining table'          , 11 ,       11 , ( 192, 128,   0) ),
    Label(  'dog'                   , 12 ,       12 , (  64,   0, 128) ),
    Label(  'horse'                 , 13 ,       13 , ( 192,   0, 128) ),
    Label(  'motorbike'             , 14 ,       14 , (  64, 128, 128) ),
    Label(  'person'                , 15 ,       15 , ( 192, 128, 128) ),
    Label(  'potted plant'          , 16 ,       16 , (   0,  64,   0) ),
    Label(  'sheep'                 , 17 ,       17 , ( 128,  64,   0) ),
    Label(  'sofa'                  , 18 ,       18 , (   0, 192,   0) ),
    Label(  'train'                 , 19 ,       19 , ( 128, 192,   0) ),
    Label(  'tv monitor'            , 20 ,       20 , (   0,  64, 128) ),
    Label(  'bordering region'      , 255,       21 , ( 224, 224, 192) ),
]

PS: The PASCAL VOC segmentation labels mark object boundaries as a bordering region (id 255), meaning those pixels may belong to any class; they are ignored when computing accuracy. The Semantic Boundaries Dataset contains no bordering region.
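
Because pixels with trainId 21 must not contribute to training or evaluation, a PyTorch loss can simply skip them. A minimal sketch (the tensors and shapes below are illustrative, not part of the generation script):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(ignore_index=21)  # skip the bordering region
logits = torch.randn(2, 21, 256, 256)             # N x C x H x W, 21 predicted classes (ids 0-20)
target = torch.randint(0, 22, (2, 256, 256))      # masks may contain 21 on object borders
loss = criterion(logits, target)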

4. Generating the Dataset

PS: The training/validation/test split follows DeepLab, i.e.
train = (sbd_train | sbd_val | voc_train) - voc_val; the validation and test sets are the same as in PASCAL VOC 2012.
The generated data list files have the format:
2007_000032
2007_000039
2007_000063
2007_000068
2007_000121
2007_000170
...
Images live in the JPEGImages directory, e.g. 2007_000032.jpg;
masks live in the SegmentationClassAug directory, e.g. 2007_000032_trainIds.png.

import os
import sys
import re
import shutil
import numpy as np
from PIL import Image
import scipy.io
from collections import namedtuple


Label = namedtuple( 'Label' , [
    'name'        , # The identifier of this label, e.g. 'car', 'person', ... .
                    # We use them to uniquely name a class

    'id'          , # An integer ID that is associated with this label.
                    # The IDs are used to represent the label in ground truth images
                    # An ID of -1 means that this label does not have an ID and thus
                    # is ignored when creating ground truth images (e.g. license plate).
                    # Do not modify these IDs, since exactly these IDs are expected by the
                    # evaluation server.

    'trainId'     , # Feel free to modify these IDs as suitable for your method. Then create
                    # ground truth images with train IDs, using the tools provided in the
                    # 'preparation' folder. However, make sure to validate or submit results
                    # to our evaluation server using the regular IDs above!
                    # For trainIds, multiple labels might have the same ID. Then, these labels
                    # are mapped to the same class in the ground truth images. For the inverse
                    # mapping, we use the label that is defined first in the list below.
                    # For example, mapping all void-type classes to the same ID in training,
                    # might make sense for some approaches.
                    # Max value is 255!

    'color'       , # The color of this label
    ] )
labels = [
    #       name                     id    trainId   color
    Label(  'background'            ,  0 ,        0 , (   0,   0,   0) ),
    Label(  'aeroplane'             ,  1 ,        1 , ( 128,   0,   0) ),
    Label(  'bicycle'               ,  2 ,        2 , (   0, 128,   0) ),
    Label(  'bird'                  ,  3 ,        3 , ( 128, 128,   0) ),
    Label(  'boat'                  ,  4 ,        4 , (   0,   0, 128) ),
    Label(  'bottle'                ,  5 ,        5 , ( 128,   0, 128) ),
    Label(  'bus'                   ,  6 ,        6 , (   0, 128, 128) ),
    Label(  'car'                   ,  7 ,        7 , ( 128, 128, 128) ),
    Label(  'cat'                   ,  8 ,        8 , (  64,   0,   0) ),
    Label(  'chair'                 ,  9 ,        9 , ( 192,   0,   0) ),
    Label(  'cow'                   , 10 ,       10 , (  64, 128,   0) ),
    Label(  'dining table'          , 11 ,       11 , ( 192, 128,   0) ),
    Label(  'dog'                   , 12 ,       12 , (  64,   0, 128) ),
    Label(  'horse'                 , 13 ,       13 , ( 192,   0, 128) ),
    Label(  'motorbike'             , 14 ,       14 , (  64, 128, 128) ),
    Label(  'person'                , 15 ,       15 , ( 192, 128, 128) ),
    Label(  'potted plant'          , 16 ,       16 , (   0,  64,   0) ),
    Label(  'sheep'                 , 17 ,       17 , ( 128,  64,   0) ),
    Label(  'sofa'                  , 18 ,       18 , (   0, 192,   0) ),
    Label(  'train'                 , 19 ,       19 , ( 128, 192,   0) ),
    Label(  'tv monitor'            , 20 ,       20 , (   0,  64, 128) ),
    Label(  'bordering region'      , 255,       21 , ( 224, 224, 192) ),
]


####################################################################################
num_classes = 22
unspecified_id = num_classes - 1
train_id = list()
valid_labels = dict()
color_palette = list()
id_key = list()
id_mapping = list()
for label in labels:
    train_id.append(label.trainId)
    valid_labels[label.name] = label.id
    color_palette += list(label.color)
    # encoder: r<<16 + g<<8 + b
    id_key.append(label.trainId)
    encoder = (label.color[0] << 16) + (label.color[1] << 8) + label.color[2]
    id_mapping.append(encoder)
assert list(train_id) == sorted(train_id) and len(train_id) == num_classes
assert len(color_palette) == (num_classes * 3)
temp = list(zip(id_mapping, id_key))
temp.sort()
temp = list(zip(*temp))
id_key = np.array(temp[1], dtype='int')
id_mapping = np.array(temp[0], dtype='int')
print('valid class: ', valid_labels)
print('train_id: ', train_id)
print('unspecified_id: ', unspecified_id)
print('color_palette: ', color_palette)
print('id_key: ', id_key)
print('id_mapping: ', id_mapping)
"""
valid class:  {'background': 0, 'aeroplane': 1, 'bicycle': 2, 'bird': 3, 'boat': 4, 'bottle': 5, 'bus': 6, 'car': 7, 'cat': 8, 'chair': 9, 'cow': 10, 'dining table': 11, 'dog': 12, 'horse': 13, 'motorbike': 14, 'person': 15, 'potted plant': 16, 'sheep': 17, 'sofa': 18, 'train': 19, 'tv monitor': 20, 'bordering region': 255}
train_id:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]
unspecified_id:  21
color_palette:  [0, 0, 0, 128, 0, 0, 0, 128, 0, 128, 128, 0, 0, 0, 128, 128, 0, 128, 0, 128, 128, 128, 128, 128, 64, 0, 0, 192, 0, 0, 64, 128, 0, 192, 128, 0, 64, 0, 128, 192, 0, 128, 64, 128, 128, 192, 128, 128, 0, 64, 0, 128, 64, 0, 0, 192, 0, 128, 192, 0, 0, 64, 128, 224, 224, 192]
id_key:  [ 0  4 16 20  2  6 18  8 12 10 14  1  5 17  3  7 19  9 13 11 15 21]
id_mapping:  [       0      128    16384    16512    32768    32896    49152  4194304
  4194432  4227072  4227200  8388608  8388736  8404992  8421376  8421504
  8437760 12582912 12583040 12615680 12615808 14737600]
"""


####################################################################################
# Path of PASCAL VOC 2012 Dataset + Semantic Boundaries Dataset
"""
VOCdevkit
    ├─VOC2012
    |   ├─ImageSets
    |   ├─JPEGImages
    |   ├─SegmentationClass
    |   └─SemanticBoundaries
    |       ├─cls
    |       ├─img
    |       ├─inst
    |       ├─train.txt
    |       └─val.txt
    └─generate_aug_data.py

data_list_file:
img_name0
img_name1
img_name2
...
"""
data_dir = os.path.abspath(os.path.dirname(__file__))
img_dir = os.path.join(data_dir, 'VOC2012/JPEGImages')
mat_dir = os.path.join(data_dir, 'VOC2012/SemanticBoundaries/cls')
voc_img_sets_dir = os.path.join(data_dir, 'VOC2012/ImageSets/Segmentation')
sbd_img_sets_dir = os.path.join(data_dir, 'VOC2012/SemanticBoundaries')
voc_mask_dir = os.path.join(data_dir, 'VOC2012/SegmentationClass')
aug_mask_dir = os.path.join(data_dir, 'VOC2012/SegmentationClassAug')
if not os.path.exists(aug_mask_dir):
    os.mkdir(aug_mask_dir)


####################################################################################
# convert .mat to .png
print()
i = 0
for mat_file in os.listdir(mat_dir):
    match = re.match(r'^(\d+_\d+).mat$', mat_file)
    if match:
        img = match.groups()[0]
        mat = scipy.io.loadmat(os.path.join(mat_dir, mat_file), mat_dtype=True, squeeze_me=True, struct_as_record=False)
        assert np.max(mat['GTcls'].Segmentation) < unspecified_id # no bordering region
        mask = Image.fromarray(mat['GTcls'].Segmentation)
        mask.save(os.path.join(aug_mask_dir, img + '_trainIds.png'))
        mask.putpalette(color_palette)
        mask.save(os.path.join(aug_mask_dir, img + '.png'))
        i += 1
        print('\rConverting .mat to .png: %d' % i, end='')
        sys.stdout.flush()

# copy voc to aug
print()
i = 0
for mask_file in os.listdir(voc_mask_dir):
    match = re.match(r'^(\d+_\d+).png$', mask_file)
    if match:
        img = match.groups()[0]
        # copy voc to aug
        shutil.copyfile(os.path.join(voc_mask_dir, mask_file), os.path.join(aug_mask_dir, mask_file))
        mask = np.array(Image.open(os.path.join(aug_mask_dir, mask_file)).convert('RGB'), dtype=np.uint32)
        # encoder: r<<16 + g<<8 + b
        encoder = np.left_shift(mask[:, :, 0], 16) + np.left_shift(mask[:, :, 1], 8) + mask[:, :, 2]
        index = np.digitize(encoder.ravel(), id_mapping, right=True)
        new_mask = id_key[index].reshape(encoder.shape).astype('uint8')
        new_mask = Image.fromarray(new_mask)
        new_mask.save(os.path.join(aug_mask_dir, img + '_trainIds.png'))
        i += 1
        print('\rCopying voc to aug: %d' % i, end=' ')
        sys.stdout.flush()


####################################################################################
print()
with open(os.path.join(voc_img_sets_dir, 'train.txt')) as f:
    img_sets = f.readlines()
    voc_train = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(voc_train)
with open(os.path.join(voc_img_sets_dir, 'val.txt')) as f:
    img_sets = f.readlines()
    voc_val = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(voc_val)
with open(os.path.join(sbd_img_sets_dir, 'train.txt')) as f:
    img_sets = f.readlines()
    sbd_train = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(sbd_train)
with open(os.path.join(sbd_img_sets_dir, 'val.txt')) as f:
    img_sets = f.readlines()
    sbd_val = set([i.split()[0] for i in img_sets])
    assert len(img_sets) == len(sbd_val)

aug_train = (sbd_train | sbd_val | voc_train) - voc_val
aug_trainval = aug_train | voc_val
# check
for item in aug_trainval:
    img = os.path.join(img_dir, item + '.jpg')
    mask = os.path.join(aug_mask_dir, item + '_trainIds.png')
    assert os.path.exists(img) and os.path.exists(mask)

# create data list
with open(os.path.join(data_dir, 'train_aug.txt'), 'w') as train:
    for line in aug_train:
        train.write(str(line) + '\n')
    print('Created train data list ({}) in {}.'.format(len(aug_train), data_dir))
with open(os.path.join(data_dir, 'trainval_aug.txt'), 'w') as trainval:
    for line in aug_trainval:
        trainval.write(str(line) + '\n')
    print('Created trainval data list ({}) in {}.'.format(len(aug_trainval), data_dir))


####################################################################################
# compute class weights
print()
class_count = np.zeros(num_classes, dtype='int64')
# Get the total number of pixels in all train masks for each class
for i, img in enumerate(aug_train, 1):
    mask = np.array(Image.open(os.path.join(aug_mask_dir, img + '_trainIds.png')))
    class_count += np.histogram(mask, bins=np.arange(num_classes + 1))[0]
    print('\rComputing class weight: %d' % i, end=' ')
    sys.stdout.flush()

# including unspecified_id
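# Inverse log-frequency weighting: w_c = 1 / ln(1.02 + p_c), the scheme used in ENet.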
class_p_unspecified = class_count / np.sum(class_count.astype(np.int64))
class_weight_unspecified = 1 / np.log(1.02 + class_p_unspecified)
# excluding unspecified_id
class_p = class_count[:-1] / np.sum(class_count[:-1].astype(np.int64))
class_weight = 1 / np.log(1.02 + class_p)

def array2string(array, format='%.6f'):
    return ', '.join([format % i for i in array])

print()
with open(os.path.join(data_dir, 'args_aug.txt'), 'w') as f:
    # valid_labels
    f.writelines('valid class:\n')
    f.writelines('{}\n\n'.format(valid_labels))
    # unspecified_id
    f.writelines('unspecified_id: {}\n\n'.format(unspecified_id))
    # train_id
    f.writelines('train_id:\n')
    f.writelines(array2string(train_id, '%d') + '\n\n')
    # class_count
    f.writelines('pixel counts for each class:\n')
    f.writelines(array2string(class_count, '%d') + '\n\n')
    # class_p_unspecified
    f.writelines('class probability including unspecified_id:\n')
    f.writelines(array2string(class_p_unspecified) + '\n\n')
    # class_weight_unspecified
    f.writelines('class weight including unspecified_id:\n')
    f.writelines(array2string(class_weight_unspecified) + '\n\n')
    # class_p
    f.writelines('class probability excluding unspecified_id:\n')
    f.writelines(array2string(class_p) + '\n\n')
    # class_weight
    f.writelines('class weight excluding unspecified_id:\n')
    f.writelines(array2string(class_weight) + '\n\n')
print('Generated class weight in {}.'.format(os.path.join(data_dir, 'args_aug.txt')))
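
With the data lists, the *_trainIds.png masks, and args_aug.txt in place, the data can be consumed from PyTorch. A minimal Dataset sketch (the class name, normalization, and lack of augmentation are illustrative choices, not part of the script above):

import os
import numpy as np
from PIL import Image
import torch
from torch.utils.data import Dataset

class VOCAugSegmentation(Dataset):
    """Loads (image, trainId mask) pairs produced by generate_aug_data.py."""
    def __init__(self, root, list_file='train_aug.txt'):
        with open(os.path.join(root, list_file)) as f:
            self.names = [line.strip() for line in f if line.strip()]
        self.img_dir = os.path.join(root, 'VOC2012/JPEGImages')
        self.mask_dir = os.path.join(root, 'VOC2012/SegmentationClassAug')

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.img_dir, name + '.jpg')).convert('RGB')
        mask = Image.open(os.path.join(self.mask_dir, name + '_trainIds.png'))
        image = torch.from_numpy(np.array(image, dtype=np.float32).transpose(2, 0, 1)) / 255.0
        mask = torch.from_numpy(np.array(mask, dtype=np.int64))
        return image, mask

# Usage sketch: pair the Dataset with a loss that ignores the bordering region and,
# optionally, weights classes with the values written to args_aug.txt:
#   dataset = VOCAugSegmentation('VOCdevkit')
#   criterion = torch.nn.CrossEntropyLoss(weight=torch.tensor(class_weight, dtype=torch.float32),
#                                         ignore_index=21)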
最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
  • 序言:七十年代末东羹,一起剝皮案震驚了整個濱河市,隨后出現(xiàn)的幾起案子忠烛,更是在濱河造成了極大的恐慌属提,老刑警劉巖,帶你破解...
    沈念sama閱讀 218,284評論 6 506
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件美尸,死亡現(xiàn)場離奇詭異冤议,居然都是意外死亡,警方通過查閱死者的電腦和手機师坎,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 93,115評論 3 395
  • 文/潘曉璐 我一進店門恕酸,熙熙樓的掌柜王于貴愁眉苦臉地迎上來,“玉大人胯陋,你說我怎么就攤上這事蕊温。” “怎么了遏乔?”我有些...
    開封第一講書人閱讀 164,614評論 0 354
  • 文/不壞的土叔 我叫張陵义矛,是天一觀的道長。 經(jīng)常有香客問我盟萨,道長凉翻,這世上最難降的妖魔是什么? 我笑而不...
    開封第一講書人閱讀 58,671評論 1 293
  • 正文 為了忘掉前任捻激,我火速辦了婚禮制轰,結(jié)果婚禮上,老公的妹妹穿的比我還像新娘胞谭。我一直安慰自己垃杖,他們只是感情好,可當(dāng)我...
    茶點故事閱讀 67,699評論 6 392
  • 文/花漫 我一把揭開白布丈屹。 她就那樣靜靜地躺著缩滨,像睡著了一般。 火紅的嫁衣襯著肌膚如雪。 梳的紋絲不亂的頭發(fā)上脉漏,一...
    開封第一講書人閱讀 51,562評論 1 305
  • 那天,我揣著相機與錄音袖牙,去河邊找鬼侧巨。 笑死,一個胖子當(dāng)著我的面吹牛鞭达,可吹牛的內(nèi)容都是我干的司忱。 我是一名探鬼主播,決...
    沈念sama閱讀 40,309評論 3 418
  • 文/蒼蘭香墨 我猛地睜開眼畴蹭,長吁一口氣:“原來是場噩夢啊……” “哼坦仍!你這毒婦竟也來了?” 一聲冷哼從身側(cè)響起叨襟,我...
    開封第一講書人閱讀 39,223評論 0 276
  • 序言:老撾萬榮一對情侶失蹤繁扎,失蹤者是張志新(化名)和其女友劉穎,沒想到半個月后糊闽,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體梳玫,經(jīng)...
    沈念sama閱讀 45,668評論 1 314
  • 正文 獨居荒郊野嶺守林人離奇死亡,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點故事閱讀 37,859評論 3 336
  • 正文 我和宋清朗相戀三年右犹,在試婚紗的時候發(fā)現(xiàn)自己被綠了提澎。 大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
    茶點故事閱讀 39,981評論 1 348
  • 序言:一個原本活蹦亂跳的男人離奇死亡念链,死狀恐怖盼忌,靈堂內(nèi)的尸體忽然破棺而出,到底是詐尸還是另有隱情掂墓,我是刑警寧澤谦纱,帶...
    沈念sama閱讀 35,705評論 5 347
  • 正文 年R本政府宣布,位于F島的核電站梆暮,受9級特大地震影響服协,放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜啦粹,卻給世界環(huán)境...
    茶點故事閱讀 41,310評論 3 330
  • 文/蒙蒙 一偿荷、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧唠椭,春花似錦跳纳、人聲如沸。這莊子的主人今日做“春日...
    開封第一講書人閱讀 31,904評論 0 22
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽。三九已至,卻和暖如春斗塘,著一層夾襖步出監(jiān)牢的瞬間赢织,已是汗流浹背。 一陣腳步聲響...
    開封第一講書人閱讀 33,023評論 1 270
  • 我被黑心中介騙來泰國打工馍盟, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留于置,地道東北人。 一個月前我還...
    沈念sama閱讀 48,146評論 3 370
  • 正文 我出身青樓贞岭,卻偏偏與公主長得像八毯,于是被迫代替她去往敵國和親。 傳聞我的和親對象是個殘疾皇子瞄桨,可洞房花燭夜當(dāng)晚...
    茶點故事閱讀 44,933評論 2 355

推薦閱讀更多精彩內(nèi)容