For my research, I am starting this post to summarize the datasets I have come across while reading papers, so that they are easy to find again later.
Classification + Detection Datasets
ImageNet
ImageNet needs no introduction. Here is the official description:
What is ImageNet?
ImageNet is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). In ImageNet, we aim to provide on average 1000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. In its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.
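In short: ImageNet follows the WordNet hierarchy. Every meaningful concept in WordNet, whether it is expressed by one word or by several, is called a "synset". WordNet has more than 100,000 synsets, over 80,000 of which are nouns. ImageNet aims to provide about 1,000 quality-controlled, human-annotated images per synset, and once complete it hopes to offer tens of millions of cleanly sorted images.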
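To make the synset idea concrete, here is a minimal sketch of a synset hierarchy in Python. The `Synset` class, the `toy-00x` ids, and the image counts are all made up for illustration; they are not ImageNet's actual data format. The last lines redo the arithmetic behind "tens of millions": roughly 80,000 noun synsets times a target of about 1,000 images each.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Synset:
    """One concept: several words or phrases that share a meaning."""
    synset_id: str                 # hypothetical id, not a real WordNet id
    words: List[str]               # the words/phrases describing the concept
    image_count: int = 0           # images attached directly to this concept
    children: List["Synset"] = field(default_factory=list)

    def total_images(self) -> int:
        # Images of this concept plus everything below it in the hierarchy.
        return self.image_count + sum(c.total_images() for c in self.children)


# Toy hierarchy with invented counts: animal -> dog -> puppy.
puppy = Synset("toy-003", ["puppy"], image_count=900)
dog = Synset("toy-002", ["dog", "domestic dog", "Canis familiaris"],
             image_count=1100, children=[puppy])
animal = Synset("toy-001", ["animal", "animate being", "beast"],
                image_count=1000, children=[dog])

print(animal.total_images())

# Back-of-the-envelope target size from the numbers in the description.
NOUN_SYNSETS = 80_000
TARGET_IMAGES_PER_SYNSET = 1_000
print(NOUN_SYNSETS * TARGET_IMAGES_PER_SYNSET)
```

Counting images recursively is just one convenient way to walk such a tree; whether a parent concept should include its children's images depends on the task.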
The associated competition is ILSVRC.
Classification Datasets
MNIST
MNIST is one of the well-known works of Yann LeCun and is used for handwritten digit recognition. Introduction:
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.
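In short: MNIST is a database of digits written by hand. It has a training set of 60,000 examples and a test set of 10,000 examples, and it is a subset of a larger NIST set. The digits have been size-normalized and centered in a fixed-size image.
It lets researchers try out learning techniques and pattern recognition methods on real data without spending much effort on preprocessing and formatting.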
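As an illustration of how little preprocessing is needed, here is a minimal sketch of a reader for the IDX files that MNIST is distributed in. It assumes the layout documented on the MNIST page: a big-endian 32-bit header with magic number 2051 for image files and 2049 for label files, followed by raw unsigned bytes. The function names are my own, and the demo uses a tiny synthetic buffer instead of the real files.

```python
import struct

IMAGE_MAGIC = 2051  # 0x00000803, marks an IDX image file
LABEL_MAGIC = 2049  # 0x00000801, marks an IDX label file


def parse_idx_images(data):
    """Parse IDX image bytes into a list of images (each a list of pixel rows)."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected image magic number: {magic}")
    pixels = data[16:]
    size = rows * cols
    if len(pixels) != count * size:
        raise ValueError("pixel data does not match the header")
    images = []
    for i in range(count):
        flat = pixels[i * size:(i + 1) * size]
        # Cut the flat byte run into rows of `cols` pixel values (0-255).
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(data):
    """Parse IDX label bytes into a list of integer labels."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected label magic number: {magic}")
    labels = list(data[8:])
    if len(labels) != count:
        raise ValueError("label data does not match the header")
    return labels


# Synthetic stand-in for the real files: two 2x2 "images" and two labels.
fake_images = struct.pack(">IIII", IMAGE_MAGIC, 2, 2, 2) + bytes(
    [0, 255, 255, 0, 9, 9, 9, 9]
)
fake_labels = struct.pack(">II", LABEL_MAGIC, 2) + bytes([7, 3])

images = parse_idx_images(fake_images)
labels = parse_idx_labels(fake_labels)
print(len(images), images[0], labels)
```

With the real dataset, the same functions would be fed the decompressed bytes of the image and label files; in practice most people load MNIST through a library instead.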
CIFAR
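CIFAR is a dataset maintained by the Department of Computer Science at the University of Toronto. The name comes from the Canadian Institute for Advanced Research. It consists of labeled images and is used to measure the classification error rate of algorithms. Since it comes from the University of Toronto, it is no surprise that Hinton is involved in maintaining it. CIFAR comes in two versions, CIFAR-10 and CIFAR-100, which differ in having 10 classes versus 100 classes.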
CIFAR-10 consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
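The dataset is divided into five training batches and one test batch, each with 10,000 images. The test batch contains exactly 1,000 randomly selected images from each class. The training batches contain the remaining images in random order, so a given training batch may contain more images from one class than from another. Taken together, the five training batches contain exactly 5,000 images from each class.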
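The batch layout described above can be checked with a small simulation. This is only a sketch that works on class labels (no images) and assumes a simple shuffle-and-cut procedure; the actual CIFAR-10 batches were prepared by the dataset authors, and the function name here is my own.

```python
import random
from collections import Counter

NUM_CLASSES = 10
IMAGES_PER_CLASS = 6000
TEST_PER_CLASS = 1000
NUM_TRAIN_BATCHES = 5


def simulate_cifar10_split(seed=0):
    """Return (train_batches, test_batch) as lists of class labels."""
    rng = random.Random(seed)
    test_batch = []
    remaining = []
    for label in range(NUM_CLASSES):
        # Exactly 1,000 images of each class go to the test batch.
        test_batch.extend([label] * TEST_PER_CLASS)
        # The other 5,000 images of this class go to the training pool.
        remaining.extend([label] * (IMAGES_PER_CLASS - TEST_PER_CLASS))
    # Shuffle the pool, then cut it into five equal batches of 10,000.
    rng.shuffle(remaining)
    batch_size = len(remaining) // NUM_TRAIN_BATCHES
    train_batches = [
        remaining[i * batch_size:(i + 1) * batch_size]
        for i in range(NUM_TRAIN_BATCHES)
    ]
    return train_batches, test_batch


train_batches, test_batch = simulate_cifar10_split()
for i, batch in enumerate(train_batches, start=1):
    # Per-class counts differ from batch to batch.
    print(f"training batch {i}:", sorted(Counter(batch).items()))

# Summed over all five batches, every class has exactly 5,000 images.
total = sum((Counter(batch) for batch in train_batches), Counter())
print(sorted(total.items()))
```

The per-batch counts hover around 1,000 per class but are not exactly equal, which is the imbalance the description mentions; only the sum over the five batches is exact.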
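The classes are mutually exclusive, with no overlap. For example, two of the classes are automobile and truck. "Automobile" includes sedans, SUVs, and the like. "Truck" includes only big trucks. What about pickup trucks? Neither class contains them.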
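CIFAR-100 is much the same, except that it has 10 times as many classes, so the number of images per class differs (600 instead of 6,000). I will look up the details when I need them.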
YFCC100M
YFCC100M (Yahoo Flickr Creative Commons 100 Million) is Yahoo's image/video classification dataset.
Detection Datasets
PASCAL VOC 2007/2012
Visual Object Classes Challenge 2012 (VOC 2012) is a dataset from the University of Oxford, used for object recognition. Introduction:
The main goal of this challenge is to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). It is fundamentally a supervised learning problem in that a training set of labelled images is provided. The twenty object classes that have been selected are:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
There are three main object recognition competitions: classification, detection, and segmentation, a competition on action classification, and a competition on large scale recognition run by ImageNet. In addition there is a "taster" competition on person layout.
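In short: the main goal of VOC2012 is to recognize objects in realistic scenes. It is at heart a supervised learning problem, because a labelled training set is provided. The 20 object classes fall into four groups: person (1 class), animal (6 classes, including cat), vehicle (7 classes, including boat), and indoor objects (6 classes).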
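The competitions are:
- Classification, detection, and segmentation (the three main object recognition competitions)
- Action classification
- Large scale recognition (run by ImageNet)
- An extra "taster" competition: person layout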
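For reference, the 20 classes listed above can be written down as a small lookup table. This is a sketch for convenience only: the alphabetical integer ids are an assumption made here and not an official VOC id scheme, and the variable names are my own.

```python
# The four groups and their classes, as listed in the VOC 2012 description.
VOC_CLASSES = {
    "Person": ["person"],
    "Animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "Vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "Indoor": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
}

# Flatten and sort alphabetically so every class gets a stable integer id.
# This ordering is an assumption for the sketch, not an official numbering.
ALL_CLASSES = sorted(name for names in VOC_CLASSES.values() for name in names)
CLASS_TO_ID = {name: idx for idx, name in enumerate(ALL_CLASSES)}

# Reverse lookup from a class name to its group.
CLASS_TO_GROUP = {
    name: group for group, names in VOC_CLASSES.items() for name in names
}

print(len(ALL_CLASSES))
print(CLASS_TO_ID["cat"], CLASS_TO_GROUP["cat"])
```

A detector usually reserves one more id for "background"; whether that is needed depends on the model, so it is left out here.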
COCO
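COCO is a newer dataset for image recognition, segmentation, and captioning. Its images already come with ground-truth segmentations, so the question is how low the segmentation error of your algorithm is. The associated competition is the COCO 2016 Detection and Keypoint Challenges.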
KITTI
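The KITTI Vision Benchmark Suite is a benchmark for autonomous driving. Its images are street scenes captured from a car driving around the city of Karlsruhe, and all of them are labeled. It is fairly small, with only 289 training images.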
Some of the road labels include: highway, minor road.
Segmentation Datasets
CityScapes Dataset
The CityScapes dataset targets semantic understanding of urban street scenes (to me this looks like object recognition within city street scenes). Features:
Type of annotations
- Semantic
- Instance-wise
- Dense pixel annotations
Complexity
- 30 classes
- See Class Definitions for a list of all classes and have a look at the applied labeling policy.
Diversity
- 50 cities
- Several months (spring, summer, fall)
- Daytime
- Good/medium weather conditions
- Manually selected frames
- Large number of dynamic objects
- Varying scene layout
- Varying background
Volume
- 5,000 annotated images with fine annotations
- 20,000 annotated images with coarse annotations