May, week 3 literature reading: BloodSpot: a database of gene expression profiles and transcriptional programs for healthy and malignant haematopoiesis
ABSTRACT
-
Research on human and murine haematopoiesis has resulted in a vast number of gene-expression data sets that can potentially answer questions regarding normal and aberrant blood formation.
-
To researchers and clinicians with limited bioinformatics experience, these data have remained available, yet largely inaccessible.
-
Current databases provide information about gene-expression but fail to answer key questions regarding co-regulation, genetic programs or effect on patient survival.
-
To address these shortcomings, we present BloodSpot (www.bloodspot.eu), which includes and greatly extends our previously released database HemaExplorer, a database of gene expression profiles from FACS-sorted healthy and malignant haematopoietic cells.
-
A revised interactive interface simultaneously provides a plot of gene expression along with a Kaplan–Meier analysis and a hierarchical tree depicting the relationship between different cell types in the database.
-
The database now includes 23 high-quality curated data sets relevant to normal and malignant blood formation and, in addition, we have assembled and built a unique integrated data set, BloodPool.
-
BloodPool contains more than 2000 samples assembled from six independent studies on acute myeloid leukemia.
-
Furthermore, we have devised a robust sample integration procedure that allows for sensitive comparison of user-supplied patient samples in a well-defined haematopoietic cellular space.
(Background, the shortcomings of current databases, and the functional and data features of this database)
INTRODUCTION
-
A decade of intense studies of the genetic programs underlying normal and malignant haematopoiesis has resulted in a number of gene-expression data sets, which can potentially help answer questions concerning the molecular mechanisms governing normal haematopoiesis and how these are de-regulated in cancer.
-
To researchers and clinicians with limited bioinformatics experience, these data have been available through online databases in the form of raw or semi-processed files but remained largely inaccessible for analysis, let alone comparison with user-supplied in-house data.
-
Recently, a number of web interfaces have been generated to facilitate single-gene queries of in-house data (ImmGen Gene Skyline (1), Gene-expression Atlas (2), Leukemia Gene Atlas (3) and Differentiation Map (2)) or of curated, compiled and processed data sets (HemaExplorer (3), Gene Expression Commons (4), HaemAtlas (5), BloodChIP (6), BloodExpress (7) and CODEX (8)).
-
These tools provide information on the expression of single genes, but fail to answer the main questions as to whether these genes influence patient survival or if genes or pathways are regulated in similar or inverse patterns.
-
We have previously published a comprehensive database of mRNA microarray samples from FACS-sorted healthy and leukemic bone marrow samples (3), which has proven a useful and popular resource for researchers working within the areas of cellular differentiation, haematopoiesis and leukaemia.
(Shortcomings of current databases that provide single-gene expression information)
-
Here, we present a completely overhauled and significantly expanded version of the original database, with a new and interactive interface, all freely available online.
-
The new database redefines current approaches to explorative data integration, presentation and visualisation of gene expression in the haematopoietic system.
-
Consequently, all these improvements called for a new name: BloodSpot.
(Introduces the new database, BloodSpot, and its features)
-
The core function of BloodSpot is to provide an expression plot of genes in healthy and cancerous haematopoietic cells at specific differentiation stages.
-
To present these haematopoietic gene profiles, we have developed a novel visualization chart that simply integrates the benefits of stripcharts and violin plots.
-
The server accepts either a unique gene name (gene alias) or a gene signature name from the MSigDB database.
-
Of note, an auto-complete mechanism helps users find the right names for genes and gene signatures.
-
To contextualise the haematopoietic gene expression profile, two additional levels of visualisation are available: an interactive hierarchical tree that shows the relationship between the samples displayed and a Kaplan–Meier plot based on a high-quality Acute Myeloid Leukemia (AML) data set (9). Additionally, we added a large body of curated data sets to the database, which users can query seamlessly.
-
Significantly, we provide a new integrated data set of samples from AML patients along with FACS sorted samples from healthy individuals.
-
This new integrated data set provides the most detailed picture of the gene expression landscape in healthy and malignant haematopoiesis to date.
-
Finally, the database provides the possibility of comparing user-supplied leukaemia samples to healthy cells.
(Features of BloodSpot)
-
The platform is freely available, and requires no login, at: www.bloodspot.eu
DATA CONTENT UPDATES
Available data sets
-
BloodSpot is a database of mRNA expression in healthy and malignant haematopoiesis and includes data from both humans and mice.
-
The database is sub-divided into several data sets that are each accessible for browsing through the new interface.
-
Data sets are organised by organism of origin and disease status.
-
The data sets are organised as follows: first, human healthy haematopoietic cells, then human leukaemia and finally healthy mouse haematopoietic cells.
-
BloodSpot contains the data sets from our previous HemaExplorer (3) as well as new published data sets, all manually processed as described in Rapin et al. (10).
-
For completeness, the database also includes the content of other online databases that we deem relevant for the study of haematopoiesis in the framework of BloodSpot.
-
These external databases include the Differentiation Map (DMAP) (2) and the Immunological Genome project (ImmGen) (1).
(Data sets included in BloodPool and how they were processed)
-
In total the platform encompasses more than 5000 samples (see Tables 1–3).
-
All data sets were controlled for quality, appropriately normalised and adjusted for batch effects when necessary (11,12).
(Total number of samples in the database, and the data processing methods)
table1
BloodPool
-
One new feature of BloodSpot is BloodPool, an aggregated and integrated data set grouping the results of multiple studies focusing on AML.
-
By means of our batch correction methods this data set can be used to study gene expression (programs) in AML in comparison with healthy corresponding cells (see Figure 1). Using the computational method developed in Rapin et al. (10), we have also computed gene expression fold changes relative to their nearest normal counterparts for all AML profiles in BloodPool.
-
BloodPool is available for browsing within BloodSpot and can be selected like any of the other available data sets.
MSigDB and CMAP gene signatures integration
-
We collected all gene signatures available from the Molecular Signatures Database (MSigDB) (13) (version 4.0) (http://www.broadinstitute.org/gsea/msigdb/) and computed, for each signature, the mean expression values for all samples in all data sets.
-
These mean values summarise the expression of a signature for each sample.
-
Connectivity Map (CMAP) (13) signatures were generated with the rank matrix provided by the database.
-
For each combination of compound and concentration, we reported the top and bottom 500 genes and produced gene signatures.
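The signature construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `signatures_from_ranks`, the dictionary layout, and the convention that rank 1 is the most up-regulated gene are all assumptions.

```python
def signatures_from_ranks(ranks, n=500):
    """Build an 'up' and a 'down' gene signature from one rank profile.

    ranks: dict mapping gene -> rank for a single (compound, concentration)
    combination. Assumption: rank 1 is the most up-regulated gene.
    """
    ordered = sorted(ranks, key=ranks.get)
    # Never let the two signatures overlap on very small inputs.
    n = min(n, len(ordered) // 2)
    if n == 0:
        return {"up": [], "down": []}
    return {"up": ordered[:n], "down": ordered[-n:]}
```

Calling this once per compound and concentration yields two gene lists per combination, which can then be scored like any other gene signature.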
-
The data displayed in BloodSpot represent the mean value of all genes in a given signature.
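As a rough sketch of what "the mean value of all genes in a given signature" amounts to, the snippet below scores each sample by averaging the expression of the signature genes it contains. The data layout and the name `signature_scores` are hypothetical, and skipping genes that are missing from a sample is an assumption rather than something the text specifies.

```python
from statistics import mean

def signature_scores(expression, signature):
    """Return sample -> mean expression over the signature genes.

    expression: dict mapping sample -> dict mapping gene -> expression value.
    signature: iterable of gene names.
    Assumption: genes absent from a sample are ignored.
    """
    scores = {}
    for sample, profile in expression.items():
        values = [profile[g] for g in signature if g in profile]
        if values:
            scores[sample] = mean(values)
    return scores
```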
Data normalisation
-
All data were normalised and batch corrected to eliminate potential lab batch effects.
-
For this we performed Robust Multiarray Average (RMA) (14) normalisation of all microarray .CEL data files partitioned by origin, and next applied ComBat (http://jlab.byu.edu/ComBat/) (12), an empirical Bayes method implemented in the R language.
-
The batches were defined to be the study name/number, while the relevant cell type was assigned as the covariate.
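To make the idea of a batch label concrete, here is a deliberately simplified stand-in: it only removes the mean shift of each batch for one gene. It is not ComBat. ComBat additionally adjusts variances, shrinks the batch estimates with an empirical Bayes step, and protects covariates such as cell type, none of which is done here. The function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean

def center_batches(values, batches):
    """Remove per-batch mean shifts from one gene's expression values.

    values: list of expression values for a single gene.
    batches: list of batch labels (e.g. study names), same order as values.
    NOTE: location-only adjustment; ignores covariates, unlike ComBat.
    """
    grand_mean = mean(values)
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_mean = {b: mean(vs) for b, vs in by_batch.items()}
    return [v - batch_mean[b] + grand_mean for v, b in zip(values, batches)]
```

Because this sketch ignores the cell-type covariate, it would also erase real biological differences whenever cell types are unevenly spread across studies, which is exactly why a covariate-aware method is used in practice.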
-
The resulting integrated gene expression databases can be visualised directly or compared to external samples provided by the user.
-
See Tables 1–3 for an overview of the data presented in BloodSpot and the normalisation procedure used.
-
All AML data sets available in BloodSpot are normalised according to Rapin et al. (10) and further batch corrected using ComBat when necessary.
-
This processing schema ensures that the samples are normalised in the context of normal haematopoiesis and according to state-of-the-art batch correction methods, regardless of the origin of the data.
-
For RNA-seq data, we used the Blue Collar Bioinformatics RNA-seq pipeline (https://bcbio-nextgen.readthedocs.org/), mapping to the mm10 mouse genome with TopHat version 2 (15), to obtain normalised count data from the raw fastq files of Lara-Astiaso et al. (16).
-
We report count data processed using the variance stabilising transformation method from the DESeq2 package (17).
Abbreviations and sample annotations
-
Abbreviations for all cell types can be found below the plot by clicking the ‘Abbreviations’ link.
-
Typically, the user can find more detailed information about each cell type, such as a longer, more informative name and, for healthy-cell data sets, the immunophenotype when available.
-
Links to the raw unprocessed data can also be found here.
Available genes
-
The server is restricted to genes found in our database of Affymetrix Human 133U plus 2, Affymetrix Human 133UA and Affymetrix Human 133UB chips for human, and GeneChip Mouse Genome 430 2.0 and Affymetrix Mouse Gene 1.0 ST Arrays for mouse.
-
For the RNA-seq data set, UCSC annotation for the mm10 genome was used.
-
In order to handle gene aliases, a dictionary of gene aliases was constructed from NCBI (ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/) and the HUGO Gene Nomenclature Committee (HGNC) (www.genenames.org).
-
Ambiguous gene aliases were not included when constructing the dictionary.
-
The alias conversion is only used when the query is not an official gene symbol or probe name.
-
The end result allows for greater flexibility in gene name input and faster browsing.
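The alias handling described in this section can be sketched as below, under stated assumptions: the input is a flat list of (alias, official symbol) pairs, matching is exact and case-sensitive, and the names `build_alias_map` and `resolve` are hypothetical. The real dictionary is built from the NCBI and HGNC files, whose formats are not reproduced here.

```python
from collections import defaultdict

def build_alias_map(alias_table):
    """Map each alias to its official symbol, dropping ambiguous aliases.

    alias_table: iterable of (alias, official_symbol) pairs.
    An alias pointing to more than one symbol is excluded entirely.
    """
    targets = defaultdict(set)
    for alias, symbol in alias_table:
        targets[alias].add(symbol)
    return {a: next(iter(s)) for a, s in targets.items() if len(s) == 1}

def resolve(query, official_symbols, alias_map):
    """Return the symbol to look up, or None if the query is unknown.

    The alias map is consulted only when the query is not already
    an official symbol.
    """
    if query in official_symbols:
        return query
    return alias_map.get(query)
```

Checking the official symbols first means an alias can never shadow a real gene symbol, and excluding ambiguous aliases means a query never silently resolves to the wrong gene.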
FUNCTIONALITY UPDATES
-
Both the back-end and the front-end have been completely redesigned for interactive usage and speed of execution.
-
The interface is built with a range of new functionalities, with a focus on simplicity of use (see Figure 2).
fig2
Unified input
-
BloodSpot takes a single gene name (or unambiguous gene alias) or gene signature name as query.
-
Users can search for keywords such as ‘carcinomas’ or ‘cell cycle’ and will be provided with a list of matching gene signature names.
-
When relevant, it is possible to select which probe-set to display from the list in the upper right corner of the main plot.
-
By default, the probe with the overall highest intensity is at the top of the list.
-
The option ‘Max probe’ will use the one probe with the highest intensity within each population.
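The difference between the default probe and the 'Max probe' option can be illustrated with a small sketch. It assumes that "intensity" means the mean expression value, which the text does not state, and both function names and the nested-dictionary layout are hypothetical.

```python
from statistics import mean

def overall_top_probe(probes):
    """Pick the single probe with the highest mean intensity over all samples.

    probes: dict mapping probe_id -> dict mapping population -> list of values.
    """
    def overall(probe_id):
        return mean([v for vals in probes[probe_id].values() for v in vals])
    return max(probes, key=overall)

def max_probe_per_population(probes):
    """For each population, pick the probe with the highest mean intensity there."""
    populations = {pop for by_pop in probes.values() for pop in by_pop}
    best = {}
    for pop in populations:
        candidates = [p for p in probes if probes[p].get(pop)]
        best[pop] = max(candidates, key=lambda p: mean(probes[p][pop]))
    return best
```

With the default, one probe is shown for every population; with 'Max probe', neighbouring populations may be represented by different probes, so values are not strictly comparable across columns.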
Default plot
-
When visiting the interface, the plot at the centre of the screen is the default view.
-
This representation is an improved jitter strip chart of gene expression, a novel visualisation that draws on bar plots and violin plots: the jitter is controlled by the density of samples and normalised over all the columns in the chart.
-
Thus the width of the data cloud shows how many samples have similar values (see Figure 3A and a comparison to existing data plot types in Supplementary Figure S1).
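A minimal sketch of density-controlled jitter is given below. It uses a plain neighbour count within a fixed window as the density estimate, which is an assumption and may differ from the estimator used by SinaPlot; the function name, `bandwidth` and `max_width` parameters are hypothetical.

```python
import random

def sina_offsets(groups, bandwidth=0.5, max_width=0.4, seed=0):
    """Compute horizontal jitter offsets whose spread follows local density.

    groups: dict mapping column name -> list of y values.
    Returns dict mapping column name -> list of x offsets (same order).
    The density is normalised by the densest point over ALL columns,
    so column widths are comparable across the chart.
    """
    rng = random.Random(seed)
    density = {
        name: [sum(1 for other in ys if abs(other - y) <= bandwidth) for y in ys]
        for name, ys in groups.items()
    }
    peak = max((d for ds in density.values() for d in ds), default=1)
    return {
        name: [rng.uniform(-1, 1) * max_width * d / peak for d in ds]
        for name, ds in density.items()
    }
```

Points in sparse regions stay close to the column axis, while points in dense regions spread out, so the outline of the cloud resembles a violin plot while every observation remains visible.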
fig3A
-
For more details on this visualisation method, please see Sidiropoulos, N., Sohi, S.H., Rapin, N. and Bagger, F.O. (2015) SinaPlot: an enhanced chart for simple and truthful representation of single observations over multiple classes. bioRxiv, http://dx.doi.org/10.1101/028191.
-
Both an R package and a web server have been implemented for those interested in making use of this plot type, which we have named SinaPlot.
Survival plot
-
The chart shown to the left of the BloodSpot interface is a survival plot based on a high-quality AML data set from The Cancer Genome Atlas (TCGA).
-
It displays a full Kaplan–Meier analysis of survival.
-
The survival plots are only available for human data sets that share probes with the microarray platform used by the TCGA (Affymetrix U133 Plus 2) (see Figure 3B).
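For readers unfamiliar with the Kaplan–Meier estimator, the sketch below computes the survival curve for one group of patients. How BloodSpot splits patients into groups (for example high versus low expressers) is not stated in the text, so the function only covers the estimator itself; its name and input layout are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group.

    times: follow-up time per patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Censored patients lower the number at risk without causing a step in the curve, which is what distinguishes this estimate from a simple fraction of survivors.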
fig3B
Tree plot
-
The chart shown to the right of the BloodSpot interface is an interactive hierarchical tree that shows the relationship between the samples displayed and allows changing the focus of the display.
-
It is possible to mouse over the nodes to get the full name for long names.
-
Nodes can be clicked to collapse a branch of the tree; this also updates the default plot in the middle and removes the same populations there (see Figure 3C).
fig3C
Correlation of genes and gene signatures
-
For each gene and signature in every data set, we report the top correlating genes or signatures.
-
Taking the haematopoietic fingerprint (e.g. the expression value of one gene over all haematopoietic cells) of all probe-sets and gene signatures in a given data set, we calculated the correlation matrix (Pearson) and present the highest positive and negative correlating genes/signatures.
-
This feature allows for investigation of new associations between putative co-regulated genes or signatures that exhibit similar or inverse expression patterns over the course of haematopoiesis (see Figure 3D).
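The correlation lookup can be sketched as follows, assuming each gene or signature is represented by a list of expression values over the same ordered set of cell populations (its "fingerprint"). The function names are hypothetical, and the sketch assumes no fingerprint is constant, since the Pearson coefficient is undefined in that case.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def top_correlates(query, fingerprints, k=5):
    """Return the k most positively and k most negatively correlated entries.

    fingerprints: dict mapping name -> list of expression values, all over
    the same ordered cell populations.
    """
    scores = sorted(
        ((pearson(fingerprints[query], values), name)
         for name, values in fingerprints.items() if name != query),
        reverse=True,
    )
    return scores[:k], scores[::-1][:k]
```

In practice the full correlation matrix would be precomputed once per data set, so that a query only needs a lookup rather than a pass over every probe-set.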
fig3D
Other built-in tools
-
Cell populations may be removed from the graphs using the ‘Select population’ button.
-
The current plot displayed can be exported as PDF in publication-ready quality using the ‘Print as PDF’ button.
-
The ‘T-Test’ button can be used to add to the plot the results of a Student's t-test for significance between pairs of populations.
-
The legend is as follows: NS: non-significant; * P < 0.05; ** P < 0.01; *** P < 0.001.
-
The significance marks rely on t statistics for unequal sample sizes but assuming equal variance, and the critical values are compared with a two-tailed probability.
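The test described here corresponds to the classical pooled-variance two-sample t statistic. The sketch below computes the statistic and its degrees of freedom, and maps a two-tailed P-value to the usual star notation; the conversion from t to P (via the t distribution) is left out to keep the example dependency-free, and the function names are hypothetical.

```python
from math import sqrt

def pooled_t(a, b):
    """Two-sample t statistic assuming equal variance, unequal sizes allowed.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    t = (ma - mb) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

def significance_mark(p):
    """Map a two-tailed P-value to the conventional star notation."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "NS"
```

For instance, `significance_mark(0.055)` returns "NS", matching a P-value that falls just short of the 0.05 threshold.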
-
Finally, raw data can be exported as CSV using the ‘Export Data as Text’ button.
Upload sample
-
The analysis is anonymous and requires no login.
-
The resulting data set, including the uploaded sample, can then be queried along with the default data sets in a private session.
-
All names and array information are stripped from the uploaded file before creating the database for the user session.
-
Hence, the uploaded sample in the private session will appear simply as S 1 in all charts.
-
The private sessions and uploaded data are deleted every day at 1:30 pm GMT.
EXAMPLES OF USE OF BLOODSPOT
-
To demonstrate the use of BloodSpot, we provide in the following section an example relying on data and analysis provided by the database.
-
MEIS1 is part of a transcriptional program required for the maintenance of MLL-rearranged AML (18).
-
The expression of this gene is therefore often up-regulated in MLL leukaemias.
-
Using BloodSpot, we investigated the expression pattern of MEIS1 and found it to be expressed at high levels in stem cells, with decreasing expression as the cells differentiate (Figure 3A and C). Using the correlation function, we find that MEIS1 expression also correlates with the expression patterns of a number of Homeobox genes, including HOXA3, HOXA9 and HOXA10, which are also typically expressed early during haematopoiesis (19) (Figure 3D).
-
Switching to the BloodPool data set, MEIS1 is found to be up-regulated in MLL leukaemias (Figure 4). Although the P-value in the survival plot does not reach statistical significance (0.055; see Figure 3B), the influence of MEIS1 expression in leukemic patients may be of potential relevance.
DISCUSSION
-
Here we have presented a web-based database that allows for browsing of haematopoietic gene-expression fingerprints in human, murine and malignant haematopoiesis in a large number of high-quality data sets containing several haematopoietic cell types.
-
The tool facilitates easy assessment of gene-expression data and how this links to patient survival, investigation of gene-expression signatures, as well as analysis of user-generated data and export of data and figures.
-
Focusing on simplicity, BloodSpot has features that allow clinicians or biologists to quickly retrieve relevant information on the expression of specific genes/pathways, and further explore co-regulated patterns of gene expression as well as impact on patient survival.
-
Our statistical framework supports the upload of user-generated patient data for integration and comparison with our database of healthy cells.
-
This will allow assessment of the origin of the blast population in AML patients as well as assessment of well-known and novel genetic markers in the context of normal haematopoiesis, both of which could be important for stratification of difficult patient cases.
-
We have also integrated the largest pool of AML patient microarray samples to date and have computed gene expression fold changes for these profiles, thanks to our cancer versus normal method previously described in (10) and to curation and labelling of external data followed by ComBat (12).
-
In conclusion, we have curated and populated a database and developed an analysis platform, which will allow researchers as well as clinicians to access and analyse gene expression data related to both normal and malignant haematopoiesis.
-
We believe that the database should be of interest to all researchers and clinicians interested in haematopoiesis, leukaemia, basic immunology and gene expression in developmental systems.
-
In addition to information on gene expression, BloodSpot addresses two key questions, namely, how the expression patterns of single genes impact patient survival, and which other genes display similar expression patterns in the haematopoietic system.
-
Thus the platform will help broaden the basis on which to generate hypotheses about potential therapeutic targets and expand the understanding of co-regulated genes and pathways, to support experimental findings from animal model systems.
AVAILABILITY
BloodSpot is accessible at www.bloodspot.eu.
SUPPLEMENTARY
Supplementary Data are available at NAR Online.
ACKNOWLEDGEMENTS
This work was supported by a grant from the Danish Research Council for Strategic Research, as well as through a centre grant from the Novo Nordisk Foundation (The Novo Nordisk Foundation section for Stem Cell biology in Human Disease). Furthermore, F.O.B. was supported by the Lundbeck foundation. We thank Nicolas Hillau for the animated logo of BloodSpot.