Introduction to Common Datasets and Format Conversion

Background

This post summarizes the datasets commonly used in deep learning.

1. COCO dataset

COCO (Common Objects in Context) is a large-scale dataset for image recognition, segmentation, and captioning. It is used in several competitions; the part most relevant to this field is the detection track, since part of it is dedicated to the segmentation problem.

COCO2014 category summary

The COCO object detection annotations are stored as JSON. The content is essentially a dictionary whose entries are nested dictionaries and lists.
Mapping between COCO ids and class names: there are 80 classes in total, but the ids run up to 90.

    person  # 1
    vehicle  #8
        {bicycle
         car
         motorcycle
         airplane
         bus
         train
         truck
         boat}
    outdoor  #5
        {traffic light
        fire hydrant
        stop sign
        parking meter
        bench}
    animal  #10
        {bird
        cat
        dog
        horse
        sheep
        cow
        elephant
        bear
        zebra
        giraffe}
    accessory  #5
        {backpack
        umbrella
        handbag
        tie
        suitcase
        }
    sports  #10
        {frisbee
        skis
        snowboard
        sports ball
        kite
        baseball bat
        baseball glove
        skateboard
        surfboard
        tennis racket
        }
    kitchen  #7
        {bottle
        wine glass
        cup
        fork
        knife
        spoon
        bowl
        }
    food  #10
        {banana
        apple
        sandwich
        orange
        broccoli
        carrot
        hot dog
        pizza
        donut
        cake
        }
    furniture  #6
        {chair
        couch
        potted plant
        bed
        dining table
        toilet
        }
    electronic  #6
        {tv
        laptop
        mouse
        remote
        keyboard
        cell phone
        }
    appliance  #5
        {microwave
        oven
        toaster
        sink
        refrigerator
        }
    indoor  #7
        {book
        clock
        vase
        scissors
        teddy bear
        hair drier
        toothbrush
        }

coco_id_name_map={1: 'person', 2: 'bicycle', 3: 'car', 4: 'motorcycle', 5: 'airplane',
                   6: 'bus', 7: 'train', 8: 'truck', 9: 'boat', 10: 'traffic light',
                   11: 'fire hydrant', 13: 'stop sign', 14: 'parking meter', 15: 'bench',
                   16: 'bird', 17: 'cat', 18: 'dog', 19: 'horse', 20: 'sheep', 21: 'cow',
                   22: 'elephant', 23: 'bear', 24: 'zebra', 25: 'giraffe', 27: 'backpack',
                   28: 'umbrella', 31: 'handbag', 32: 'tie', 33: 'suitcase', 34: 'frisbee',
                   35: 'skis', 36: 'snowboard', 37: 'sports ball', 38: 'kite', 39: 'baseball bat',
                   40: 'baseball glove', 41: 'skateboard', 42: 'surfboard', 43: 'tennis racket',
                   44: 'bottle', 46: 'wine glass', 47: 'cup', 48: 'fork', 49: 'knife', 50: 'spoon',
                   51: 'bowl', 52: 'banana', 53: 'apple', 54: 'sandwich', 55: 'orange',
                   56: 'broccoli', 57: 'carrot', 58: 'hot dog', 59: 'pizza', 60: 'donut',
                   61: 'cake', 62: 'chair', 63: 'couch', 64: 'potted plant', 65: 'bed', 67: 'dining table',
                   70: 'toilet', 72: 'tv', 73: 'laptop', 74: 'mouse', 75: 'remote', 76: 'keyboard',
                   77: 'cell phone', 78: 'microwave', 79: 'oven', 80: 'toaster', 81: 'sink',
                   82: 'refrigerator', 84: 'book', 85: 'clock', 86: 'vase', 87: 'scissors',
                   88: 'teddy bear', 89: 'hair drier', 90: 'toothbrush'}
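
Because the ids run up to 90 while only 80 classes exist, training code usually needs a contiguous class index in addition to the id-to-name map. The following is a minimal sketch of how both maps could be built from the JSON structure described above. The `sample` dictionary is a tiny hand-written stand-in for an annotation file, and the helper name `build_category_maps` is an assumption made for illustration; neither comes from the dataset or from pycocotools.

```python
import json

# A tiny hand-written stand-in for an instances_*.json file
# (hypothetical data that only mimics the COCO-style keys).
sample = json.loads("""
{
  "images": [{"id": 9, "file_name": "000000000009.jpg", "width": 640, "height": 480}],
  "categories": [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 11, "name": "fire hydrant", "supercategory": "outdoor"},
    {"id": 13, "name": "stop sign", "supercategory": "outdoor"}
  ],
  "annotations": [
    {"id": 100, "image_id": 9, "category_id": 13, "bbox": [10.0, 20.0, 30.0, 40.0]}
  ]
}
""")


def build_category_maps(coco_dict):
    """Return (id -> name, id -> contiguous index) built from the 'categories' list."""
    id_to_name = {c["id"]: c["name"] for c in coco_dict["categories"]}
    # Sort the sparse ids and number them 0, 1, 2, ... so that gaps such as 12 disappear.
    id_to_index = {cid: idx for idx, cid in enumerate(sorted(id_to_name))}
    return id_to_name, id_to_index


id_to_name, id_to_index = build_category_maps(sample)
for ann in sample["annotations"]:
    cid = ann["category_id"]
    print(cid, id_to_name[cid], id_to_index[cid])  # prints: 13 stop sign 2
```

Applied to a full annotation file, the same helper would map the 80 sparse ids onto indices 0 to 79.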

COCO2017 category summary

It contains 80 object categories (the list below also includes a background entry with id 0): ['background = 0','person=1', 'bicycle=2', 'car=3', 'motorcycle=4', 'airplane=5', 'bus=6', 'train=7', 'truck=8', 'boat=9', 'traffic light=10', 'fire hydrant=11', 'stop sign=13', 'parking meter=14', 'bench=15', 'bird=16', 'cat=17', 'dog=18', 'horse=19', 'sheep=20', 'cow=21', 'elephant=22', 'bear=23', 'zebra=24', 'giraffe=25', 'backpack=27', 'umbrella=28', 'handbag=31', 'tie=32', 'suitcase=33', 'frisbee=34', 'skis=35', 'snowboard=36', 'sports ball=37', 'kite=38', 'baseball bat=39', 'baseball glove=40', 'skateboard=41', 'surfboard=42', 'tennis racket=43', 'bottle=44', 'wine glass=46', 'cup=47', 'fork=48', 'knife=49', 'spoon=50', 'bowl=51', 'banana=52', 'apple=53', 'sandwich=54', 'orange=55', 'broccoli=56', 'carrot=57', 'hot dog=58', 'pizza=59', 'donut=60', 'cake=61', 'chair=62', 'couch=63', 'potted plant=64', 'bed=65', 'dining table=67', 'toilet=70', 'tv=72', 'laptop=73', 'mouse=74', 'remote=75', 'keyboard=76', 'cell phone=77', 'microwave=78', 'oven=79', 'toaster=80', 'sink=81', 'refrigerator=82', 'book=84', 'clock=85', 'vase=86', 'scissors=87', 'teddy bear=88', 'hair drier=89', 'toothbrush=90'].

It also has 91 stuff categories (ids 92 to 182), listed here together with the catch-all entry 'other=183': ['banner=92', 'blanket=93', 'branch=94', 'bridge=95', 'building-other=96', 'bush=97', 'cabinet=98', 'cage=99', 'cardboard=100', 'carpet=101', 'ceiling-other=102', 'ceiling-tile=103', 'cloth=104', 'clothes=105', 'clouds=106', 'counter=107', 'cupboard=108', 'curtain=109', 'desk-stuff=110', 'dirt=111', 'door-stuff=112', 'fence=113', 'floor-marble=114', 'floor-other=115', 'floor-stone=116', 'floor-tile=117', 'floor-wood=118', 'flower=119', 'fog=120', 'food-other=121', 'fruit=122', 'furniture-other=123', 'grass=124', 'gravel=125', 'ground-other=126', 'hill=127', 'house=128', 'leaves=129', 'light=130', 'mat=131', 'metal=132', 'mirror-stuff=133', 'moss=134', 'mountain=135', 'mud=136', 'napkin=137', 'net=138', 'paper=139', 'pavement=140', 'pillow=141', 'plant-other=142', 'plastic=143', 'platform=144', 'playingfield=145', 'railing=146', 'railroad=147', 'river=148', 'road=149', 'rock=150', 'roof=151', 'rug=152', 'salad=153', 'sand=154', 'sea=155', 'shelf=156', 'sky-other=157', 'skyscraper=158', 'snow=159', 'solid-other=160', 'stairs=161', 'stone=162', 'straw=163', 'structural-other=164', 'table=165', 'tent=166', 'textile-other=167', 'towel=168', 'tree=169', 'vegetable=170', 'wall-brick=171', 'wall-concrete=172', 'wall-other=173', 'wall-panel=174', 'wall-stone=175', 'wall-tile=176', 'wall-wood=177', 'water-other=178', 'waterdrops=179', 'window-blind=180', 'window-other=181', 'wood=182', 'other=183'].

The dataset provides 118,287 training images, 5,000 validation images, and more than 40,670 test images. Because of its scale it is now very widely used and matters a great deal to the field. In fact, the competition results have been announced at workshops at ECCV and ICCV, together with the ImageNet results. It has the following features:
1) Object segmentation
2) Recognition in context
3) Superpixel stuff segmentation
4) 330K images (>200K labeled)
5) 1.5 million object instances
6) 80 object categories
7) 91 stuff categories
8) 5 captions per image
9) 250,000 people with keypoints

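COCO annotations contain not only category and location information but also text descriptions of the image content. The open release of COCO has driven major progress in image segmentation and semantic understanding over the past two or three years, and it has become close to the "standard" dataset for evaluating image semantic understanding algorithms. Note that the COCO API for stuff (semantic) segmentation must be downloaded from: https://github.com/nightrome/cocostuffapi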

Code: extract the COCO captions (each image has 5 text descriptions)

from pycocotools.coco import COCO
import os

dataDir = './coco2017'
dataType = 'val2017'  # or 'train2017'

# Initialize the COCO API for the caption annotations
annFile = '{}/annotations/captions_{}.json'.format(dataDir, dataType)
coco_caps = COCO(annFile)

imgIdsall = coco_caps.getImgIds()
print(len(imgIdsall))

# One .txt file per image is written into this folder
outDir = './{}/'.format(dataType)
os.makedirs(outDir, exist_ok=True)

for i in imgIdsall:
    img = coco_caps.loadImgs([i])[0]
    print(img)

    # File name without its extension, e.g. '000000000139'
    name = os.path.splitext(img['file_name'])[0]
    path = os.path.join(outDir, name + '.txt')

    # Load the caption annotations of this image and write one caption per line
    annIds = coco_caps.getAnnIds(imgIds=img['id'])
    anns = coco_caps.loadAnns(annIds)
    with open(path, 'w') as f:
        for ann in anns:
            f.write(ann['caption'] + '\n')
    coco_caps.showAnns(anns)  # for caption annotations this prints the captions

Code: read file names from a given text file, then copy those files from a source directory into a target folder

# -*- coding: utf-8 -*-
import os
import shutil


def re_mycopyfile(srcfile, dstdir, num):
    # The new name is the 12 characters that precede the 4-character extension
    # (for COCO files, the 12-digit image id). num is only kept so that the
    # call signature stays the same.
    newname = srcfile[-16:-4]
    if not os.path.isfile(srcfile):
        print("%s not exist!" % srcfile)
    else:
        if not os.path.exists(dstdir):
            os.makedirs(dstdir)  # create the target directory
        dstfile = dstdir + newname + '.txt'
        print(dstfile)
        shutil.copyfile(srcfile, dstfile)  # copy the file
        print("copy %s -> %s" % (srcfile, dstfile))


if __name__ == '__main__':
    path1 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/trainls.txt"  # list of files to copy
    path2 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/train2017all/"  # source directory
    path3 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/train2017/"  # target directory
    path4 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/trainnew.txt"  # new file list

    begin = 0
    count = begin
    with open(path1, 'r') as f:
        for line in f:
            name = line.strip()  # drop the trailing newline
            srcfile = path2 + name
            count = count + 1
            print(count, srcfile)
            re_mycopyfile(srcfile, path3, count)

    # Write a second list in which the last path component is replaced
    # by a zero-padded running counter.
    count = begin
    name_long = 6
    with open(path1, 'r') as f, open(path4, 'a+') as fp:
        for line in f:
            count = count + 1
            out_words = line.strip().split('/')
            # The original computed the padding width once, outside the loop,
            # so names grew longer from count 10 onward; zfill pads per entry.
            out_words[-1] = str(count).zfill(name_long - 1) + '.txt'
            fp.write("/".join(out_words) + "\n")

2. VOC2007 dataset

Category summary

    aeroplane
    bicycle
    bird
    boat
    bottle
    bus
    car
    cat
    chair
    cow
    diningtable
    dog
    horse
    motorbike
    person
    pottedplant
    sheep
    sofa
    train
    tvmonitor
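
All 20 VOC classes also exist in COCO, but six of them are spelled differently there (for example `couch` versus `sofa`). If only the VOC classes are wanted after conversion, a name table along the following lines could be used. This is a sketch: the correspondence below is the commonly used one rather than something stated in this post, and the names `COCO_TO_VOC_NAME` and `to_voc_name` are illustrative.

```python
# COCO class name -> VOC class name for the 20 VOC classes.
# Assumption: this is the commonly used correspondence, not taken from the post.
COCO_TO_VOC_NAME = {
    "airplane": "aeroplane",
    "bicycle": "bicycle",
    "bird": "bird",
    "boat": "boat",
    "bottle": "bottle",
    "bus": "bus",
    "car": "car",
    "cat": "cat",
    "chair": "chair",
    "cow": "cow",
    "dining table": "diningtable",
    "dog": "dog",
    "horse": "horse",
    "motorcycle": "motorbike",
    "person": "person",
    "potted plant": "pottedplant",
    "sheep": "sheep",
    "couch": "sofa",
    "train": "train",
    "tv": "tvmonitor",
}


def to_voc_name(coco_name):
    """Return the VOC spelling of a COCO class name, or None if VOC has no such class."""
    return COCO_TO_VOC_NAME.get(coco_name)


print(to_voc_name("couch"))  # sofa
print(to_voc_name("zebra"))  # None
```

Annotations whose class maps to `None` would simply be skipped when only VOC classes are kept.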

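Converting the MSCOCO dataset format to the VOC dataset format
Reference link: "Converting the COCO dataset to the VOC dataset format"
First, generate the COCO_train.json file. The class list can be edited to match the categories that are actually needed.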

#-*- coding:utf-8-*-
import json
className = {  # COCO has 80 object classes; the ids run from 1 to 90 with gaps
    1:'person',
    2:'bicycle',
    3:'car',
    4:'motorcycle',
    5:'airplane',
    6:'bus',
    7:'train',
    8:'truck',
    9:'boat',
    10:'traffic light',
    11:'fire hydrant',
    13:'stop sign',
    14:'parking meter',
    15:'bench',
    16:'bird',
    17:'cat',
    18:'dog',
    19:'horse',
    20:'sheep',
    21:'cow',
    22:'elephant',
    23:'bear',
    24:'zebra',
    25:'giraffe',
    27:'backpack',
    28:'umbrella',
    31:'handbag',
    32:'tie',
    33:'suitcase',
    34:'frisbee',
    35:'skis',
    36:'snowboard',
    37:'sports ball',
    38:'kite',
    39:'baseball bat',
    40:'baseball glove',
    41:'skateboard',
    42:'surfboard',
    43:'tennis racket',
    44:'bottle',
    46:'wine glass',
    47:'cup',
    48:'fork',
    49:'knife',
    50:'spoon',
    51:'bowl',
    52:'banana',
    53:'apple',
    54:'sandwich',
    55:'orange',
    56:'broccoli',
    57:'carrot',
    58:'hot dog',
    59:'pizza',
    60:'donut',
    61:'cake',
    62:'chair',
    63:'couch',
    64:'potted plant',
    65:'bed',
    67:'dining table',
    70:'toilet',
    72:'tv',
    73:'laptop',
    74:'mouse',
    75:'remote',
    76:'keyboard',
    77:'cell phone',
    78:'microwave',
    79:'oven',
    80:'toaster',
    81:'sink',
    82:'refrigerator',
    84:'book',
    85:'clock',
    86:'vase',
    87:'scissors',
    88:'teddy bear',
    89:'hair drier',
    90:'toothbrush',
}
classNum = [1,2,3,4,5,6,7,8,9,10,
11,12,13,14,15,16,17,18,19,20,
21,22,23,24,25,26,27,28,29,30,
31,32,33,34,35,36,37,38,39,40,
41,42,43,44,45,46,47,48,49,50,
51,52,53,54,55,56,57,58,59,60,
61,62,63,64,65,66,67,68,69,70,
71,72,73,74,75,76,77,78,79,80,
81,82,83,84,85,86,87,88,89,90]

cocojson = "/home/ouc/data1/liuhongzhi/AttnGAN/dataset/coco2014/annotations/instances_train2014.json"


def writeNum(Num):
    # Overwrite instead of appending, so that rerunning the script
    # still leaves a single valid JSON document in the file.
    with open("COCO_train.json", "w") as f:
        f.write(str(Num))


inputfile = []
cnt = 0
with open(cocojson, "r") as f:
    allData = json.load(f)
    data = allData["annotations"]
    print(data[1])
    print("read ready")

# Keep one small record per annotation whose category is in classNum
for i in data:
    if i['category_id'] in classNum:
        inner = {
            "filename": str(i["image_id"]).zfill(12),  # 12-digit, zero-padded image id
            "name": className[i["category_id"]],
            "bndbox": i["bbox"]  # COCO layout: [x, y, width, height]
        }
        inputfile.append(inner)
        cnt = cnt + 1
        if cnt % 10000 == 0:
            print("id : " + str(cnt))
inputfile = json.dumps(inputfile)
writeNum(inputfile)
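
The `bndbox` field written by the script above keeps COCO's `[x, y, width, height]` layout, whereas VOC annotations store the corners `xmin, ymin, xmax, ymax`. The conversion can be sketched as follows. The function name `coco_bbox_to_voc` and the optional clamping to the image size are assumptions added for illustration; the truncation with `int()` mirrors what the XML script in this post does.

```python
def coco_bbox_to_voc(bbox, img_width=None, img_height=None):
    """Convert a COCO box [x, y, w, h] to VOC-style integer corners."""
    x, y, w, h = bbox
    xmin, ymin = int(x), int(y)
    xmax, ymax = int(x + w), int(y + h)
    # Optional step (an addition, not in the original script):
    # clamp the corners so the box never leaves the image.
    if img_width is not None:
        xmin = max(0, min(xmin, img_width))
        xmax = max(0, min(xmax, img_width))
    if img_height is not None:
        ymin = max(0, min(ymin, img_height))
        ymax = max(0, min(ymax, img_height))
    return xmin, ymin, xmax, ymax


print(coco_bbox_to_voc([10.5, 20.0, 30.2, 40.9]))           # (10, 20, 40, 60)
print(coco_bbox_to_voc([600, 400, 100, 100], 640, 480))     # (600, 400, 640, 480)
```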

Next, copy the images that contain the selected categories into a target directory to obtain the training images.

# -*- coding: utf-8 -*-
# @Time    : 2018/03/09 10:46
# @Author  : SyGoing
# @Site    :
# @File    : getimagesbyID.py
# @Software: PyCharm
import json
import os
import cv2

with open("COCO_train.json", "r") as f:
    data = json.load(f)
    print("read ready")

# Collect the unique image file names referenced by the selected annotations
nameStr = set()
for i in data:
    imgName = "COCO_train2014_" + str(i["filename"]) + ".jpg"
    nameStr.add(imgName)
print(len(nameStr))

path = "/home/ouc/data1/liuhongzhi/AttnGAN/dataset/coco2014/images/train2014/"
savePath = "/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/JPEGImages/"
count = 0
for file in nameStr:
    print(path + file)
    img = cv2.imread(path + file)
    if img is None:
        # cv2.imread returns None for missing or unreadable files
        print('cannot read: ' + path + file)
        continue
    cv2.imwrite(savePath + file, img)
    count = count + 1
    print('num: ' + str(count) + '     ' + file)

Then, generate the VOC XML files in the Annotations folder from the selected image IDs.

#-*- coding:utf-8-*-

import xml.dom
import xml.dom.minidom
import os
# from PIL import Image
import cv2
import json

# Conventions for the XML files


_IMAGE_PATH = '/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/JPEGImages/'

_INDENT = ' ' * 4  # four spaces per level (the original '' * 4 was an empty string)
_NEW_LINE = '\n'
_FOLDER_NODE = 'COCO2014'
_ROOT_NODE = 'annotation'
_DATABASE_NAME = 'LOGODection'
_ANNOTATION = 'COCO2014'
_AUTHOR = 'SyGoing_CSDN'
_SEGMENTED = '0'
_DIFFICULT = '0'
_TRUNCATED = '0'
_POSE = 'Unspecified'

# _IMAGE_COPY_PATH= 'JPEGImages'
_ANNOTATION_SAVE_PATH = '/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/Annotations/'


# _IMAGE_CHANNEL= 3
# Wrap the creation of an element node that holds a text value
def createElementNode(doc, tag, attr):
    # Create the element node
    element_node = doc.createElement(tag)

    # Create a text node
    text_node = doc.createTextNode(attr)

    # Attach the text node as a child of the element node
    element_node.appendChild(text_node)

    return element_node


# Wrap the addition of a child node to a parent
def createChildNode(doc, tag, attr, parent_node):
    child_node = createElementNode(doc, tag, attr)

    parent_node.appendChild(child_node)

# The object node is a special case: it contains a nested bndbox node
def createObjectNode(doc, attrs):
    object_node = doc.createElement('object')

    midname = attrs['name']

    # Uncomment to relabel every class other than 'person' as 'car';
    # leave commented out to keep all categories.
    # if midname != 'person':
    #     midname = 'car'

    createChildNode(doc, 'name', midname, object_node)
    createChildNode(doc, 'pose', _POSE, object_node)
    createChildNode(doc, 'truncated', _TRUNCATED, object_node)
    createChildNode(doc, 'difficult', _DIFFICULT, object_node)

    # COCO stores boxes as [x, y, width, height]; VOC expects corner coordinates
    x, y, w, h = attrs['bndbox']
    bndbox_node = doc.createElement('bndbox')
    createChildNode(doc, 'xmin', str(int(x)), bndbox_node)
    createChildNode(doc, 'ymin', str(int(y)), bndbox_node)
    createChildNode(doc, 'xmax', str(int(x + w)), bndbox_node)
    createChildNode(doc, 'ymax', str(int(y + h)), bndbox_node)

    object_node.appendChild(bndbox_node)

    return object_node


# Write the document to an XML file
def writeXMLFile(doc, filename):
    with open('tmp.xml', 'w') as tmpfile:
        # The original passed '' * 4 (an empty string) as the indent
        doc.writexml(tmpfile, addindent=' ' * 4, newl='\n', encoding='utf-8')

    # Drop the first line (the XML declaration added by default) and any blank lines
    with open('tmp.xml') as fin, open(filename, 'w') as fout:
        lines = fin.readlines()
        for line in lines[1:]:
            if line.split():
                fout.write(line)


if __name__ == "__main__":
    # Read the image list
    img_path = "/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/JPEGImages/"
    fileList = os.listdir(img_path)
    if len(fileList) == 0:  # the original compared the list itself with 0
        os._exit(-1)

    with open("COCO_train.json", "r") as f:
        ann_data = json.load(f)

    # Group the annotations by image name once, instead of rescanning
    # the whole annotation list for every image.
    anns_by_image = {}
    for ann in ann_data:
        imgName = "COCO_train2014_" + str(ann["filename"])
        anns_by_image.setdefault(imgName, []).append(ann)

    if not os.path.exists(_ANNOTATION_SAVE_PATH):
        os.makedirs(_ANNOTATION_SAVE_PATH)

    for imageName in fileList:
        # splitext removes only the extension; str.strip(".jpg") would strip
        # any of the characters '.', 'j', 'p', 'g' from both ends of the name.
        saveName = os.path.splitext(imageName)[0]
        print(saveName)

        xml_file_name = os.path.join(_ANNOTATION_SAVE_PATH, saveName + '.xml')

        img = cv2.imread(os.path.join(img_path, imageName))
        print(os.path.join(img_path, imageName))
        if img is None:  # skip files that cannot be read as images
            continue
        height, width, channel = img.shape
        print(height, width, channel)

        my_dom = xml.dom.getDOMImplementation()
        doc = my_dom.createDocument(None, _ROOT_NODE, None)

        # Get the root node
        root_node = doc.documentElement

        # folder node
        createChildNode(doc, 'folder', _FOLDER_NODE, root_node)

        # filename node
        createChildNode(doc, 'filename', saveName + '.jpg', root_node)

        # source node and its children
        source_node = doc.createElement('source')
        createChildNode(doc, 'database', _DATABASE_NAME, source_node)
        createChildNode(doc, 'annotation', _ANNOTATION, source_node)
        createChildNode(doc, 'image', 'flickr', source_node)
        createChildNode(doc, 'flickrid', 'NULL', source_node)
        root_node.appendChild(source_node)

        # owner node and its children
        owner_node = doc.createElement('owner')
        createChildNode(doc, 'flickrid', 'NULL', owner_node)
        createChildNode(doc, 'name', _AUTHOR, owner_node)
        root_node.appendChild(owner_node)

        # size node
        size_node = doc.createElement('size')
        createChildNode(doc, 'width', str(width), size_node)
        createChildNode(doc, 'height', str(height), size_node)
        createChildNode(doc, 'depth', str(channel), size_node)
        root_node.appendChild(size_node)

        # segmented node
        createChildNode(doc, 'segmented', _SEGMENTED, root_node)

        # One object node per annotation that belongs to this image
        for ann in anns_by_image.get(saveName, []):
            object_node = createObjectNode(doc, ann)
            root_node.appendChild(object_node)

        print(xml_file_name)

        # Write the XML file
        writeXMLFile(doc, xml_file_name)

Finally, generate train.txt, which lists the names of all training images. The paths and extensions still need to be removed so that only the image names remain.

find ./JPEGImages -name '*.jpg'  > train.txt
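
The `find` command writes full paths including the `.jpg` extension, so the stripping step mentioned above still has to be done. One way to produce the bare names directly is sketched below. The helper name `write_image_list` is an assumption for illustration, and unlike `find` it does not descend into subdirectories.

```python
import os


def write_image_list(image_dir, list_path, ext=".jpg"):
    """Write the extension-less names of all `ext` files in image_dir, one per line."""
    names = sorted(
        os.path.splitext(f)[0]
        for f in os.listdir(image_dir)
        if f.lower().endswith(ext)
    )
    with open(list_path, "w") as fp:
        for name in names:
            fp.write(name + "\n")
    return names


if __name__ == "__main__":
    # Paths taken from the command above; nothing happens if the folder is missing.
    if os.path.isdir("./JPEGImages"):
        write_image_list("./JPEGImages", "train.txt")
```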

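3. Cityscapes dataset

The Cityscapes dataset, led by Mercedes-Benz, provides image segmentation data for autonomous-driving environments and is used to evaluate how well vision algorithms understand urban scenes semantically. It is also commonly used by image translation algorithms such as Pix2pix and CycleGAN.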

Cityscapes contains street scenes from 50 European cities, recorded in different scenes, backgrounds, and seasons, labeled with 33 annotated classes (ids 1 to 33, with id 0 reserved for 'unlabeled'): {'unlabeled'=0 , 'ego vehicle'=1 , 'rectification border'=2 , 'out of roi'=3 , 'static'=4 , 'dynamic'=5 , 'ground'=6 , 'road'=7 , 'sidewalk'=8 , 'parking'=9 , 'rail track'=10 , 'building'=11 , 'wall'=12 , 'fence'=13 , 'guard rail'=14 , 'bridge'=15 , 'tunnel'=16 , 'pole'=17 , 'polegroup'=18 , 'traffic light'=19 , 'traffic sign'=20 , 'vegetation'=21 , 'terrain'=22 , 'sky'=23 , 'person'=24 , 'rider'=25 , 'car'=26 , 'truck'=27 , 'bus'=28 , 'caravan'=29 , 'trailer'=30 , 'train'=31 , 'motorcycle'=32 , 'bicycle'=33 }. Only 19 of these 33 classes are used in evaluation, so the 33 classes are mapped to 19 for training, and the 19 classes must be mapped back to the original 33 ids before predictions are uploaded to the evaluation server. An account must be registered to download this dataset.
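
The two mappings described above can be sketched with NumPy lookup tables. The particular choice of the 19 evaluated ids and their train ids 0 to 18 follows the label definitions commonly distributed with the Cityscapes tooling; it is an assumption here, since the post does not list it. The names `encode`, `decode`, and `IGNORE` are illustrative.

```python
import numpy as np

IGNORE = 255  # train id given to every class that is not evaluated

# Assumed mapping: original label id -> train id for the 19 evaluated classes
# (road, sidewalk, building, wall, fence, pole, traffic light, traffic sign,
#  vegetation, terrain, sky, person, rider, car, truck, bus, train,
#  motorcycle, bicycle).
ID_TO_TRAIN_ID = {7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7,
                  21: 8, 22: 9, 23: 10, 24: 11, 25: 12, 26: 13, 27: 14,
                  28: 15, 31: 16, 32: 17, 33: 18}

# Lookup tables indexed by pixel value
id2train = np.full(256, IGNORE, dtype=np.uint8)
train2id = np.zeros(256, dtype=np.uint8)  # unknown train ids fall back to 0 ('unlabeled')
for label_id, train_id in ID_TO_TRAIN_ID.items():
    id2train[label_id] = train_id
    train2id[train_id] = label_id


def encode(label_img):
    """Map a ground-truth image with ids 0..33 to train ids 0..18 (255 = ignore)."""
    return id2train[label_img]


def decode(pred_img):
    """Map predicted train ids 0..18 back to the original ids for submission."""
    return train2id[pred_img]


labels = np.array([[7, 26], [0, 33]], dtype=np.uint8)
print(encode(labels))          # [[  0  13] [255  18]]
print(decode(encode(labels)))  # [[ 7 26] [ 0 33]]
```

Indexing a lookup table with the whole image converts every pixel in one vectorized step, which avoids a Python loop over pixels.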

Cityscapes has two evaluation settings, fine and coarse. The former provides 5,000 finely annotated images; the latter provides the same 5,000 finely annotated images plus 20,000 coarsely annotated images. Algorithms are scored with the PASCAL VOC style intersection-over-union (IoU) metric. The 5,000 finely annotated images are split into 2,975 training images, 500 validation images, and 1,525 test images. The test set annotations are not released, so predictions must be uploaded to the evaluation server to obtain the mIoU.
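
For reference, the IoU of a class is TP / (TP + FP + FN), and mIoU is the mean over classes. A minimal sketch of computing it from a confusion matrix is given below, for example to check results on the validation set locally. The function names and the convention that label 255 is ignored are assumptions for illustration, not the evaluation server's code.

```python
import numpy as np


def confusion_matrix(gt, pred, num_classes, ignore_index=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    valid = gt != ignore_index  # ignored pixels do not count at all
    idx = gt[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Return (per-class IoU, mIoU); classes that never occur get NaN and are skipped."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp  # TP + FP + FN
    iou = np.full(conf.shape[0], np.nan)
    present = union > 0
    iou[present] = tp[present] / union[present]
    return iou, float(np.nanmean(iou))


gt = np.array([0, 0, 1, 1, 255])
pred = np.array([0, 1, 1, 1, 0])
conf = confusion_matrix(gt, pred, num_classes=2)
iou, miou = mean_iou(conf)
print(conf)  # [[1 1] [0 2]]
print(iou)   # class 0: 1/2, class 1: 2/3
print(miou)  # 7/12, about 0.5833
```

Accumulating one confusion matrix over all images and computing IoU at the end gives the dataset-level score, which generally differs from averaging per-image IoU values.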

Last edited: 2018.12.15 16:22:38

Copyright belongs to the author. For reprints or content cooperation, please contact the author.
