March, week 4, literature reading 4: Biological Databases for Hematology Research
Abstract
With advances in genome-wide sequencing technologies and bioinformatics approaches, a large number of datasets on normal and malignant erythropoiesis have been generated and made public to researchers around the world.
Collecting and integrating these datasets greatly facilitates basic research as well as the clinical diagnosis and treatment of blood disorders.
Here we provide a brief introduction to the most popular omics data resources for normal and malignant hematopoiesis, including some integrated web tools, to help users become better equipped to perform common analyses.
We hope this review will promote awareness and facilitate the use of public database resources in hematology research.
KEYWORDS
Hematology; Hematological diseases; Omics data resources; Database; Bioinformatics
Introduction
Disorders of the blood system cause various hematological diseases in millions of people worldwide every year.
Blood cells consist of three types of cells, namely erythrocytes (red blood cells, RBCs), leukocytes (white blood cells), and thrombocytes (platelets), all of which differentiate and develop from hematopoietic stem cells (HSCs).
Erythropoiesis normally produces functional RBCs [1], whereas erroneous erythropoiesis can lead to anemia, leukemia, and other blood diseases [2].
In particular, single-cell sequencing technology makes it feasible to trace HSC specification, cell fate decisions, and differentiation into various cell types at single-cell resolution [3,4].
In addition, high-throughput sequencing allows genome-wide analysis of transcription factor binding and histone modifications by chromatin immunoprecipitation sequencing (ChIP-seq) [5], identification of open chromatin regions by DNase-seq [5], and transcriptomic expression profiling by RNA-seq [5].
The development of these technologies has driven a deeper understanding of hematological processes in mammals [6].
Large organizations such as the National Center for Biotechnology Information (NCBI), collaborative projects run by international research groups such as the Encyclopedia of DNA Elements (ENCODE), and a variety of individual laboratories have produced many genome-wide datasets and released them to the public [7].
Thanks to the increasingly deep interpretation of the human genome and the development of bioinformatics databases, we now understand human erythropoiesis better.
Here we collect the most popular omics data resources for normal and malignant hematopoiesis (Table 1). This review introduces these data resources and some integrated web tools for common analyses.
Table 1 Main biological databases for hematology research.
Name | Weblink | Main features | Cell type | Data type | Refs. |
---|---|---|---|---|---|
European LeukemiaNet | http://www.leukemia-net.org/content/home/index_eng.html | Providing physicians and patients with research information about diagnosis, treatment, and ongoing clinical trials, as well as further information about leukemia | Clinical data of leukemia patients | Clinical data | [8] |
Red Cell Membrane Disorder Mutations Database | http://research.nhgri.nih.gov/RBCmembrane | Grouping all mutated genes occurring in one or more kinds of inherited disorders of the erythrocyte membrane associated with hemolytic anemia | RBCs of hereditary spherocytosis, hereditary elliptocytosis, and hereditary pyropoikilocytosis patients | Mutation information of related genes | [9] |
dbRBC | http://www.ncbi.nlm.nih.gov/projects/gv/rbc/main.fcgi?cmd=init | Providing DNA and clinical data related to human RBCs, integrated with the BGMUT database documenting variations in genes that encode antigens for human blood groups | Human RBCs | DNA and clinical data | [10] |
CODEX | http://codex.stemcells.cam.ac.uk | Containing a sub-database, HAEMCODE, specialized for grouping NGS data from human and mouse hematopoietic cell experiments | Human and mouse hematopoietic cells | NGS data | [11] |
ErythronDB | http://www.cbil.upenn.edu/ErythronDB | Providing expression profiles of murine primitive and definitive erythroid cells, and supporting gene searching with annotation, differential expression, transcriptional regulation, etc. | Murine primitive and definitive erythroid cells | Gene expression data | [12] |
Hembase | http://hembase.niddk.nih.gov/ | Integrating sequencing data of ESTs from human erythroid cells, differentiated erythrocytes, and mature RBCs | Human erythroid cells, differentiated erythrocytes, and mature RBCs | EST data | [1] |
BloodSpot | http://www.bloodspot.eu | Providing gene expression profiles of healthy and malignant hematopoiesis in humans or mice, encompassing more than 5000 samples in total | Human or mouse hematopoietic cells | Oligonucleotide microarray chip data and RNA-seq data | [15] |
BloodChIP | http://www.med.unsw.edu.au/CRCWeb.nsf/page/BloodChIP | Exploring and visualizing TF binding sites in human CD34+ and other normal and leukemic cells based on TF ChIP-seq data | Human CD34+ and leukemic cells | Gene expression data, histone ChIP-seq data, DNase-seq data, and digital genomic footprinting data | [17] |
Leukemia Gene Atlas | http://www.leukemia-gene-atlas.org/LGAtlas | Integrating datasets from more than 5800 leukemia and hematopoiesis samples profiled by microarray, DNA methylation, SNP, and high-throughput sequencing | Clinical leukemia samples | Microarray, DNA methylation, and SNP data | [19] |
DBA mutation database | http://www.dbagenes.unito.it | Integrating information on DBA mutation genes and changes in DNA, RNA, and protein, and the frequency of the mutations | Blood cells of DBA patients | General information on variants of all DBA-related genes | [23,24] |
Note: ErythronDB, the Erythron Database; ChIP, chromatin immunoprecipitation; EST, expressed sequence tag; DBA, Diamond-Blackfan anemia; BGMUT, Blood Group Antigen Gene Mutation Database; NGS, next-generation sequencing; RBC, red blood cell; SNP, single nucleotide polymorphism; TF, transcription factor.
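When surveying these resources, it can help to hold Table 1 as a small list of records that can be filtered. The sketch below is not part of the review: the `Resource` fields and the `find_by_data_type` helper are assumed names, and only a few rows are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    name: str
    cell_type: str
    data_type: str


# A few rows copied from Table 1; the remaining rows would follow the same shape.
CATALOG = [
    Resource("CODEX", "Human and mouse hematopoietic cells", "NGS data"),
    Resource("ErythronDB", "Murine primitive and definitive erythroid cells",
             "Gene expression data"),
    Resource("Hembase",
             "Human erythroid cells, differentiated erythrocytes, and mature RBCs",
             "EST data"),
    Resource("BloodSpot", "Human or mouse hematopoietic cells",
             "Oligonucleotide microarray chip data and RNA-seq data"),
]


def find_by_data_type(keyword):
    """Return names of resources whose data type mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [r.name for r in CATALOG if needle in r.data_type.lower()]
```

For example, `find_by_data_type("rna-seq")` selects only the resources whose "Data type" column mentions RNA-seq among the rows entered here.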
European LeukemiaNet
Leukemia is a cancer of the white blood cells with a high incidence across all ages.
To centralize the fragmented information on leukemia in Europe, the European LeukemiaNet (ELN) was founded in 2004 under the 6th Framework Program of the European Community [8].
The website, which has a user-friendly interface, delivers information about ongoing clinical trials to physicians and patients, as well as further information on leukemia research, for example by publishing study protocols.
Meanwhile, ELN shares knowledge about study design and monitoring, as well as data management and analysis, and pushes forward the discussion on leukemia within Europe (http://www.leukemia-net.org/content/home/index_eng.html).
As many as 17 work packages work separately on integrating information about the research, diagnosis, and treatment of leukemia.
Furthermore, ELN provides information that helps patients and physicians better understand leukemia, its diagnostic methods, and the different therapies available.
Red Cell Membrane Disorder Mutations Database
Inherited red cell membrane disorders involve either altered membrane structural organization or altered membrane transport function [9].
The Red Cell Membrane Disorder Mutations Database (http://research.nhgri.nih.gov/RBCmembrane/) contains the mutations associated with three major inherited blood disorders, namely hereditary spherocytosis, elliptocytosis, and pyropoikilocytosis, all of which are caused by disordered structural organization of the red cell membrane.
The welcome page introduces the gene mutations associated with the three diseases and links the related genes to their entries in the Online Mendelian Inheritance in Man (OMIM) database.
In its submenu, the database provides detailed information on gene mutations occurring in one or more of these diseases.
In other submenus, users can obtain additional detailed information about the clinical research program and genetic counseling from the National Human Genome Research Institute (NHGRI) in the United States.
In addition, the submenus link to the University of California Santa Cruz (UCSC) database for some of the mutated genes.
At the bottom of the menu, researchers can find contact information for submitting additions, updates, or descriptions of new mutations.
dbRBC
The dbRBC database is one of the NCBI database resources; it provides an integrated and freely accessible platform for DNA sequencing data and clinical data associated with human RBCs (http://www.ncbi.nlm.nih.gov/projects/gv/rbc/main.fcgi?cmd=init).
It integrates data from the Blood Group Antigen Gene Mutation Database (BGMUT) at NCBI, which records variations in genes encoding antigens for human blood groups [10].
Users can obtain the data from the download menu, which links directly to the File Transfer Protocol (FTP) page.
The dbRBC homepage also links to parallel resources, such as dbMHC for data related to the human major histocompatibility complex (MHC) and dbLRC for resources on the human leukocyte receptor complex (LRC).
These three public resources make up a database cluster for routine clinical applications, such as ABO genotyping [11].
Some additional practical tools are also provided, such as the Alignment Viewer and Primer Resource.
CODEX
CODEX (http://codex.stemcells.cam.ac.uk/) is a database of mouse and human NGS experiments.
The aim of CODEX is to provide an open resource of NGS experiments processed by uniform procedures.
In this database, metadata of human and mouse samples from hematological experiments are collected, and the sequencing data are uniformly processed and vetted [12].
CODEX also provides access to processed and curated NGS experiments, including ChIP-seq, RNA-seq, and DNase-seq.
The main data sources of CODEX are NGS repositories, for instance the Gene Expression Omnibus (GEO) and ArrayExpress.
In addition, CODEX provides a private site for hosting unpublished data.
Furthermore, processed datasets can be analyzed online or downloaded.
CODEX now covers data on 133 hematopoietic cells and embryonic stem cells, and 269 factors associated with these cells.
The Erythron Database
The Erythron Database (ErythronDB; http://www.cbil.upenn.edu/ErythronDB) was built to facilitate access to erythroid expression data and analysis results for murine primitive and definitive erythroid cells [13].
ErythronDB allows users to identify differentially expressed genes and to run customized downstream analyses in the strategy module.
Users can also save strategies and share them with other registered users.
The database integrates global gene expression profiles of primitive, fetal liver definitive, and adult bone marrow definitive erythroid cells, generated on Affymetrix arrays for each maturation stage.
ErythronDB supports complex investigations of expression parameters, as well as Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations.
To provide rich information on mouse genes, ErythronDB links to external databases, including Mouse Genome Informatics (MGI).
Hembase
Hembase (http://hembase.niddk.nih.gov) provides genome-based access to human genes transcribed during erythropoiesis.
Hembase integrates several thousand expressed sequence tags (ESTs) sequenced from human erythroid cells, including progenitor cells, precursor cells, and mature RBCs, and offers users a friendly browser and a genome portal.
To date, the database contains 15,752 EST entries and 380 genes associated with erythropoiesis [1].
Hembase provides the cytogenetic band position and a unique name as concise annotations for each search entry.
Users can search by gene name, keywords, or cytogenetic location.
All the sequencing information in Hembase can be used without registration, and all ESTs can be downloaded from the NCBI UniGene Library Browser [14].
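The three search modes described above (gene name, keyword, cytogenetic location) can be sketched as filters over annotated entries. This is only an illustration and not Hembase's actual implementation: the `Entry` fields and the `search` function are assumed names, and the three records are illustrative entries that were not taken from Hembase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    gene: str
    description: str
    band: str  # cytogenetic band position


# Illustrative records only; not taken from Hembase.
ENTRIES = [
    Entry("HBB", "hemoglobin subunit beta", "11p15.4"),
    Entry("GATA1", "GATA binding protein 1", "Xp11.23"),
    Entry("EPOR", "erythropoietin receptor", "19p13.2"),
]


def search(gene=None, keyword=None, band_prefix=None):
    """Return gene names matching every criterion that was supplied."""
    hits = ENTRIES
    if gene is not None:
        hits = [e for e in hits if e.gene.upper() == gene.upper()]
    if keyword is not None:
        hits = [e for e in hits if keyword.lower() in e.description.lower()]
    if band_prefix is not None:
        hits = [e for e in hits if e.band.startswith(band_prefix)]
    return [e.gene for e in hits]
```

Criteria combine with a logical AND, so `search(keyword="receptor", band_prefix="19p")` narrows the result rather than widening it.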
BloodSpot
BloodSpot (http://www.bloodspot.eu) is a database of gene expression profiles of healthy and malignant hematopoiesis in humans or mice, generated by oligonucleotide microarray chips and RNA sequencing [15].
This platform is an improvement and expansion of HemaExplorer and encompasses more than 5000 samples in total [16].
For each query gene or gene signature, BloodSpot provides three concomitant levels of visualization: a gene expression plot, a survival plot, and a hierarchical tree of samples.
In addition, BloodSpot contains other built-in tools, for example for exploring the top correlated genes and for calculating the Student's t-test significance between pairs of populations in the default expression plot.
Another feature of BloodSpot is BloodPool, an assembled and integrated database collecting the results of multiple studies with more than 2000 samples focusing on acute myeloid leukemia (AML).
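To make the pairwise comparison concrete, the following is a minimal sketch of the textbook two-sample Student's t statistic with pooled variance. The review does not describe how BloodSpot computes it internally, so this should not be read as the site's implementation; the function name `student_t` and the expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic, assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2  # statistic and degrees of freedom


# Hypothetical log2 expression values of one gene in two cell populations.
population_1 = [2.0, 4.0, 6.0]
population_2 = [5.0, 7.0, 9.0]
t_stat, dof = student_t(population_1, population_2)
```

The p-value follows from the t distribution with `dof` degrees of freedom; a statistics library would normally be used for that step.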
BloodChIP
The BloodChIP database (http://www.med.unsw.edu.au/CRCWeb.nsf/page/BloodChIP) provides user-friendly exploration and visualization of transcription factor (TF) binding sites in human CD34+ and leukemia cells, based on TF ChIP-seq data [17].
Users can enter keywords for specific gene(s) or genomic region(s) to retrieve TF binding profiles.
Users can also search for all the target genes of a combination of selected TFs, or of any of the selected TFs, in specific cell type(s).
Currently, BloodChIP covers data on four cell types, i.e., CD34+ hematopoietic stem and progenitor cells (HSPCs), megakaryocytes, SKNO-1, and K562.
To maximize the utility of these data, the database has been integrated with many public datasets that give insight into the transcriptional regulation of query genes, such as gene expression data, histone ChIP-seq data, and DNase-seq data from the Human Epigenome Atlas and the ENCODE database [7,18].
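The two search modes for TF targets, a combination of selected TFs versus any selected TF, correspond to a set intersection and a set union. The sketch below illustrates this with made-up data; the TF and gene names and both function names are hypothetical, and real TF-to-gene assignments would come from ChIP-seq peaks.

```python
# Hypothetical TF -> bound-gene sets, standing in for ChIP-seq peak assignments.
TARGETS = {
    "TF_A": {"GENE1", "GENE2", "GENE3"},
    "TF_B": {"GENE2", "GENE3", "GENE4"},
    "TF_C": {"GENE3", "GENE5"},
}


def genes_bound_by_all(tfs):
    """Genes bound by every selected TF (the 'combination' search)."""
    sets = [TARGETS[tf] for tf in tfs]
    return sorted(set.intersection(*sets)) if sets else []


def genes_bound_by_any(tfs):
    """Genes bound by at least one selected TF (the 'any' search)."""
    sets = [TARGETS[tf] for tf in tfs]
    return sorted(set.union(*sets)) if sets else []
```

Restricting the search to a cell type would amount to keeping one such mapping per cell type and choosing the mapping before intersecting.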
Leukemia Gene Atlas
The Leukemia Gene Atlas (LGA) database is a public platform integrating diverse genomic data published in the leukemia field [19].
The LGA supports comprehensive research, analysis, and browsing functions for more than 5800 leukemia and hematopoiesis samples profiled on multiple platforms, such as microarray, DNA methylation, SNP, and other high-throughput sequencing methods.
The database contains information on studies covering various aspects, such as prediction of molecular subtypes of leukemia, human hematopoiesis, and TF binding sites imported from GEO.
LGA has also established a quality control procedure to select qualified data imported from other datasets.
Results of each study include differentially expressed genes, GO annotations, copy number alterations, and an extract of the Catalogue of Somatic Mutations in Cancer (COSMIC) database.
The LGA database is freely accessible at http://www.leukemia-gene-atlas.org/LGAtlas/.
Diamond-Blackfan anemia mutation database
Diamond-Blackfan anemia (DBA) is a hereditary bone marrow failure syndrome characterized by marked heterogeneity of clinical symptoms, such as anemia, developmental abnormalities, and an increased risk of malignancy [20–22].
The DBA mutation database was built to help researchers and physicians better understand the mutations found in patients.
This database is based on the Leiden Open Variation Database (LOVD) system (http://www.dbagenes.unito.it).
The database comprises 27 published mutations in the RPS11 gene, the main contributor to DBA.
Each mutation is described in detail with both tables and graphs, including gene information, sequence information, and graphic displays from UCSC [23,24].
The database provides information on changes in DNA, RNA, and protein, as well as the frequency of the mutations, via a convenient search interface.
Users are welcome to submit mutations after they register as a submitter.
Concluding remarks
Owing to recent technological advances, a large amount of data on erythrocyte differentiation has been generated, producing valuable resources for understanding pathogenesis.
This review offers a brief introduction to multiple databases in the fields of hematopoiesis and blood diseases (Figure 1), all of which are freely available without any registration.
The majority of these databases, namely the Red Cell Membrane Disorder Mutations Database, dbRBC, CODEX, ErythronDB, Hembase, BloodSpot, and BloodChIP, focus on normal erythrocyte development in humans and model organisms and provide transcriptomic and genomic data.
On the other hand, ELN and LGA are databases in the field of leukemia with clinical resources, whereas the DBA mutation database is specifically designed for DBA.
Despite the efforts devoted to hematopoiesis studies, the sample sizes covered by the databases reviewed in this article are still limited, and there is also a lack of databases for other blood diseases.
Fortunately, benefiting from big data programs across the globe, people are becoming aware of the importance of biological data to public health, which makes it easier for researchers to obtain data generated from a large number of patients or donors.
With the accumulation of knowledge and research progress, we expect to see a number of databases combined with clinical data become available to biologists and clinicians in the near future.
Figure 1. The 10 databases mentioned in this review are classified into three categories. Four databases, marked with red petals on the left side of the flower, are disease databases, providing biological data on hematopoietic disorders. Another four, marked with blue petals on the right side of the flower, are hematopoiesis databases, providing information on normal hematopoietic development. The remaining two, marked in yellow in the center of the flower, are integrated databases. DBA, Diamond-Blackfan anemia; ErythronDB, the Erythron Database.
Competing interests
The authors declare that there are no competing interests.
Acknowledgments
This study was supported by the National Key Research and Development Program of China (Grant No. 2016YFC0901700), the National High-tech R&D Program of China (863 Program, Grant Nos. 2015AA020101 and 2015AA020108), the National "12th Five-Year Plan" for Science & Technology Support of China (Grant No. 2013BAI01B09), and the National Natural Science Foundation of China (Grant Nos. 31471115 and 81670109) awarded to XF.
References
Article details: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5200935/
Direction:
I am about to start developing an erythroid database, so I want to see how similar databases are designed:
Questions to think about:
Data categories: disease databases vs. normal hematopoietic development
Sequencing data: ChIP-seq, RNA-seq, and DNase-seq
Background knowledge:
Feature design:
All data currently available:
Similar to the Hembase database.
Normal HSCs and their successive