Reference article: The cis-regulatory dynamics of embryonic development at single-cell resolution
Overview
Regulation of developmental lineages at single-cell resolution:
Here we investigate the dynamics of chromatin regulatory landscapes during embryogenesis at single-cell resolution
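Using sci-ATAC-seq, a technique they developed earlier, the authors profiled chromatin accessibility in more than 20,000 single nuclei from Drosophila embryos across three major developmental stages:
2–4 h after egg laying (predominantly stage 5 blastoderm nuclei), when the embryo consists of about 6,000 pluripotent cells
6–8 h after egg laying (predominantly stage 10–11), when the basic lineages of the mesoderm and endoderm have been specified
10–12 h after egg laying (predominantly stage 13), when the more than 20,000 cells of each embryo are undergoing terminal differentiation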
Results
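For each developmental stage, nuclei were sampled from several hundred embryos (both male and female). Data quality was assessed by the distribution of reads per barcode, the fragment size distribution, the mean number of reads per cell, the overlap with DHSs (DNase I hypersensitive sites) defined in earlier studies, and similar metrics.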
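Two of the quality metrics listed above, reads per barcode and overlap with known DHSs, can be sketched in a few lines. This is only an illustration under stated assumptions: the input layout (a list of `(barcode, position)` pairs on a single toy chromosome), the half-open DHS intervals, and the function name `qc_summary` are all hypothetical and do not come from the paper.

```python
from collections import Counter

def qc_summary(reads, dhs_intervals):
    """Return {barcode: (read_count, fraction_of_reads_in_DHS)}.

    reads: iterable of (barcode, position) pairs (hypothetical toy layout).
    dhs_intervals: list of (start, end) pairs, end exclusive.
    """
    per_cell = Counter()
    in_dhs = Counter()
    for barcode, pos in reads:
        per_cell[barcode] += 1
        # Linear scan over intervals; fine for a toy example.
        if any(start <= pos < end for start, end in dhs_intervals):
            in_dhs[barcode] += 1
    return {bc: (n, in_dhs[bc] / n) for bc, n in per_cell.items()}

toy_reads = [("A", 150), ("A", 950), ("A", 5000), ("B", 120)]
toy_dhs = [(100, 200), (900, 1000)]
summary = qc_summary(toy_reads, toy_dhs)
```

On real data an interval tree or a sorted-interval lookup would replace the linear scan, and barcodes with too few reads would typically be dropped before clustering.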
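The genome was divided into 2-kb windows, and each window was scored in each cell. The 20,000 most frequently accessible windows were then used for an initial clustering with LSI (latent semantic indexing). Because single-cell data are sparse, reads have to be aggregated into bins.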
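The windowing and LSI step can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes the common formulation of LSI as TF-IDF weighting followed by a truncated SVD, uses a single toy chromosome, and the function names (`binarize_windows`, `lsi`) and the exact TF-IDF variant are assumptions.

```python
import numpy as np

WINDOW = 2000  # 2-kb windows, as described in the text

def binarize_windows(cell_reads, n_windows, window=WINDOW):
    """cell_reads: one list of read positions per cell (toy single chromosome)."""
    mat = np.zeros((len(cell_reads), n_windows), dtype=float)
    for i, positions in enumerate(cell_reads):
        for pos in positions:
            mat[i, pos // window] = 1.0  # window is accessible if any read falls in it
    return mat

def lsi(mat, n_top, n_components):
    # Keep the n_top most frequently accessible windows.
    freq = mat.sum(axis=0)
    top = np.argsort(-freq, kind="stable")[:n_top]
    sub = mat[:, top]
    # TF-IDF: per-cell term frequency times log inverse document frequency.
    tf = sub / np.maximum(sub.sum(axis=1, keepdims=True), 1.0)
    idf = np.log(1.0 + sub.shape[0] / np.maximum(sub.sum(axis=0), 1.0))
    tfidf = tf * idf
    # Truncated SVD gives the low-dimensional embedding used for clustering.
    u, s, _ = np.linalg.svd(tfidf, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Four toy cells: the first two open windows 0-1, the last two open windows 3-4.
toy_cells = [[100, 2500], [300, 2100, 3999], [6100, 8200], [7000, 9500]]
matrix = binarize_windows(toy_cells, n_windows=5)
embedding = lsi(matrix, n_top=4, n_components=2)
```

In the toy data the first two cells land on the same point of the embedding and the last two on another, which is the behaviour a downstream clustering step relies on. With 20,000 windows and thousands of cells, a sparse matrix and a sparse SVD solver would be the more practical choice.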
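As shown in the figure above, cells are clustered by their similarity across all windows, which yields distinct cell clades. The reads from the cells in each clade can then be merged and used as input for peak calling.
Clade processing workflow: merge the reads of each clade, call peaks, then find clade-specific peaks with an approach similar to differential analysis. These peaks are then used to find peak links (for example, an enhancer and its associated promoter), and the links are validated against databases of embryonic enhancer activity and gene expression.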
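One simple way to make "clade-specific peak" concrete is an enrichment test on cell counts. The sketch below uses a one-sided Fisher exact test (hypergeometric upper tail); this is a stand-in chosen for illustration, since the notes only say the method resembles differential analysis, and the function name and toy counts are hypothetical.

```python
from math import comb

def fisher_enrichment_p(in_hits, in_total, out_hits, out_total):
    """One-sided P value that a peak is open in more in-clade cells than expected.

    in_hits / in_total: cells with the peak open / all cells inside the clade.
    out_hits / out_total: the same counts for all cells outside the clade.
    """
    hits = in_hits + out_hits
    total = in_total + out_total
    denom = comb(total, in_total)
    upper = min(in_total, hits)
    # Hypergeometric upper tail: P(X >= in_hits).
    tail = sum(comb(hits, k) * comb(total - hits, in_total - k)
               for k in range(in_hits, upper + 1))
    return tail / denom

# Peak open in all 3 clade cells and in none of the 3 other cells.
p_specific = fisher_enrichment_p(3, 3, 0, 3)
# Peak open only outside the clade: no enrichment at all.
p_absent = fisher_enrichment_p(0, 3, 3, 3)
```

Applied to every peak in every clade, the resulting P values would need multiple-testing correction before a specificity cutoff is chosen.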
The specific peaks and links enriched in each clade can be used to identify and annotate the clades. For example:
mesoderm is split into myogenic mesoderm (clade 3) and non-myogenic mesoderm (such as fat body and haemocytes) combined with endoderm (clade 4). The latter indicates that non-myogenic mesoderm and endoderm exhibit similar chromatin accessibility, suggesting a shared developmental program.
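After annotation, the authors found that non-myogenic mesoderm and endoderm fall into the same clade, which suggests that the two may share a developmental program. According to prior knowledge, however, Drosophila mesoderm and endoderm do not have a common origin. The explanation the authors offer is that this may be an evolutionarily retained feature: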
Although, to our knowledge, Drosophila mesoderm and endoderm have not been shown to share a common origin, this is highly reminiscent of the mesendoderm lineage in Caenorhabditis elegans, sea urchins and vertebrates.
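To test the reliability of the clade assignments, the authors validated them with FACS combined with motif enrichment:
In panel e of the figure above, specific cell populations were isolated by FACS and profiled by DNase-seq, and the results were compared with those from sci-ATAC-seq.
t-SNE clustering and the developmental stage of each cluster: the authors ran t-SNE on the 2–4 h cells and found that the clusters agree closely with cell types at different developmental stages. Panel b of the figure below shows the stage-specific enhancer activity found in each cluster, indicating that developmental time is the main factor separating these cells into clusters.
A pseudotime trajectory analysis follows, as expected: in panel c of the figure above, the cells end up in three branches corresponding to ectoderm, endoderm and mesoderm, indicating that cells gradually diverge into three branches late in the 2–4 h window.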
Notably, the trajectory split cells into three major branches that were consistent with our annotations of the major germ layers (neuronal cells are rare at this time point, as expected)
Pseudotime analysis can also identify enhancers and gene loci that change dynamically along the trajectory, such as the gradual closing of the slam locus:
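As a rough illustration of how "opening" and "closing" sites could be flagged once each cell has a pseudotime value, the sketch below correlates a site's binary accessibility with pseudotime. The correlation score is a deliberately simple stand-in and is not the statistical test that produced the P value quoted from the paper; the function name and toy values are hypothetical.

```python
def pseudotime_trend(pseudotime, accessible):
    """Pearson correlation between pseudotime and 0/1 accessibility.

    Negative values suggest a closing site, positive values an opening site.
    Returns 0.0 when either input has no variance.
    """
    n = len(pseudotime)
    mean_t = sum(pseudotime) / n
    mean_a = sum(accessible) / n
    cov = sum((t - mean_t) * (a - mean_a) for t, a in zip(pseudotime, accessible))
    var_t = sum((t - mean_t) ** 2 for t in pseudotime)
    var_a = sum((a - mean_a) ** 2 for a in accessible)
    if var_t == 0 or var_a == 0:
        return 0.0
    return cov / (var_t * var_a) ** 0.5

toy_time = [0.0, 0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 1.0]
closing_site = [1, 1, 1, 1, 0, 0, 0, 0]  # open early, closed late
opening_site = [0, 0, 0, 0, 1, 1, 1, 1]  # closed early, open late
r_closing = pseudotime_trend(toy_time, closing_site)
r_opening = pseudotime_trend(toy_time, opening_site)
```

A real analysis would fit a regression model per site along each branch and report a P value, but the sign convention is the same: a site like the one in slam would score strongly negative.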
For example, the most significant closing site (P value = 5 × 10⁻²²⁴) is within the slam locus, a gene that is essential for blastoderm cellularization during a very brief temporal window
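Panel d of the figure above shows branch-specific sites that change along pseudotime. Panels e and f show an anterior enhancer and a posterior enhancer that drive the spatially restricted expression of the gap genes knirps and giant, respectively. This shows that: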
sci-ATAC-seq can identify regulatory regions that are specifically accessible in spatially refined subsets of cells without the need for FACS sorting.
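Classic lineage-tracing and transplantation experiments in embryology showed that cell fate is largely determined at the blastoderm stage, which gave rise to the concept of the blastoderm fate map. sci-ATAC-seq goes a step further and gives a detailed picture of the spatial heterogeneity in chromatin accessibility during embryonic development.
The authors next explored two later periods of embryogenesis, lineage commitment (6–8 h) and differentiation (10–12 h). Applying t-SNE to the data from these later stages reveals a finer map of cell types and tissues, which is consistent with cells entering differentiation.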
Clusters were annotated based on overlaps between cluster-enriched peaks and enhancers or genes with known tissue-specific activity.
The figure above shows how the clades correspond to, and contain, the finer cell annotations (for example, in panel a the mesendoderm splits into three branches: 8, 16 and 14).
A major advantage of profiling chromatin accessibility is its potential to identify distal regulatory elements that shape gene expression.
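To confirm that these tissue-specific peaks are indeed tissue-specific enhancers, the authors performed transgenic embryo experiments (in vivo enhancer activity assays with a lacZ reporter gene). In outline, each candidate genomic region was PCR-amplified and cloned upstream of an hsp70 promoter driving the lacZ reporter, then injected into embryos and integrated. The reporter is later expressed in the embryonic regions where the candidate is active.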
We obtained 31 transgenic lines, representing six candidate regions with specific accessibility in neurogenic ectoderm, ten in non-neurogenic ectoderm, eight in myogenic mesoderm and seven in non-myogenic mesoderm plus endoderm.
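Candidates were chosen from the open peaks of the different clades. The authors found that some candidates from the mesendoderm clade (clade 4) were also active in yolk nuclei. The yolk is extra-embryonic, so reporter expression there was not expected. The authors' explanation is: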
As the yolk is extra-embryonic, this was unexpected, and suggests a potential regulatory link between the yolk and mesendodermal tissues, which is supported by the role of the GATA transcription factor serpent in both yolk and nonmyogenic mesoderm
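Summary
sci-ATAC-seq not only resolves the dynamics of chromatin accessibility during embryonic development but can also predict enhancers with in vivo activity at scale. The authors also provide a web tool: http://shiny.furlonglab.embl.de/scATACseqBrowser/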
Our ability to understand how changes in the regulatory landscape underlie lineage commitment would be greatly aided by the concurrent measurement of chromatin accessibility and transcription.
In the long term, the integration of chromatin state, transcriptional output, lineage history, and spatial information at single-cell resolution has the potential to unlock how an organism’s genome encodes its development.
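Integrating chromatin state, transcriptional profiles, developmental trajectories, spatial information and other data at single-cell resolution will further help answer questions in developmental biology.
References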
The cis-regulatory dynamics of embryonic development at single-cell resolution: https://doi.org/10.1038/nature25981