10X Single-Cell (10X Spatial Transcriptomics) Trajectory Analysis (Pseudotime Analysis): A Literature Review of VECTOR

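Hello. Yesterday we shared the example code for VECTOR in the post "10X Single-Cell (10X Spatial Transcriptomics) Trajectory Analysis (Pseudotime Analysis): VECTOR". The paper was published in Cell Reports in August 2020. Its underlying principle still deserves a careful summary, so this short post walks through the paper, picks out the key points, and looks at the tool's characteristics and applications, so that we know what we are working with.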

SUMMARY

A key step in trajectory inference is the determination of starting cells (everyone will have experienced this first-hand, which is why cell annotation comes before any customized analysis), which is typically done by using manually selected marker genes (most cell annotation methods still rely on manually chosen markers; similarity-mapping methods currently have too many problems). In this study, we find that the quantile polarization (what exactly is "quantile polarization"?) of a cell's principal-component values is strongly associated with their respective states in development hierarchy (PC values are related to a cell's developmental state), and therefore provides an unsupervised solution for determining the starting cells (this point deserves a closer look). Based on this finding, we developed a tool named VECTOR that infers vectors of developmental directions for cells in Uniform Manifold Approximation and Projection (UMAP). In seven datasets of different developmental scenarios, VECTOR correctly identifies the starting cells and successfully infers the vectors of developmental directions. VECTOR is freely available for academic use at https://github.com/jumphone/Vector. (The examples work well, but then every paper says so.)

INTRODUCTION

Let us distill this section.

The algorithmic design of TI methods (Monocle, PAGA, Slingshot, and so on; tools you should all be familiar with) has two common components:

the use of dimensional reduction, clustering, or graph-building techniques to convert scRNA-seq data into a simplified representation of trajectory, and the ordering of cells along the trajectory. (Dimensionality reduction and clustering: all routine.)
Because there may be many alternative trajectories to choose from, most TI methods require the use of prior information, such as a set of known marker genes, to determine the starting cells (SCs) of the correct trajectory. (Put plainly, cell annotation is needed to decide where development starts; trajectory analysis without cell annotation is not to be trusted.)
Subjective manual selection of markers does introduce substantial error. Recently, a new study found that RNA velocity (RNA velocyto does well in this respect, with less manual intervention), the time derivative of gene expression states, could be estimated by modeling the relationship between unspliced and spliced mRNAs, making it possible to deduce the future transcriptional states of cells and consequently the developmental trajectories without the need of prior information for determining SCs (inferring developmental trajectories from splicing dynamics; this method appears often in high-impact papers). Without using any prior information, RNA velocity was used to identify a novel developmental model of neural crest lineage cells, which demonstrates its usefulness in developmental lineage analysis.
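
For readers who want the "time derivative" spelled out, the kinetic model commonly associated with RNA velocity can be written as below. This is a recap from general knowledge of the RNA velocity literature, not something stated in the VECTOR paper; the symbols u, s, α, β, and γ are notation introduced here.

```latex
\frac{du}{dt} = \alpha - \beta\,u, \qquad
\frac{ds}{dt} = \beta\,u - \gamma\,s
```

Here u and s are the unspliced and spliced abundances of a gene, α is the transcription rate, β the splicing rate, and γ the degradation rate. The velocity of the gene is ds/dt. In units where β = 1 it reduces to v = u − γs, so a cell with more unspliced mRNA than the steady-state ratio predicts (u > γs) is assumed to be upregulating that gene.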

A look at the drawbacks of RNA velocity (velocyto)

Estimating RNA velocity requires researchers to reanalyze raw sequencing data to determine intron reads for quantifying unspliced mRNAs, which is time-consuming and sometimes may not be possible because of the limitation of the sequencing platforms. (This hardly counts as a drawback.)

PCA is indeed an essential step in single-cell analysis today. Cells at different developmental states have been shown to have distinct patterns of PC values. However, the patterns of a cell's PC values have not yet been fully explored in the current TI methods. (I have reservations about this claim.) In this study, we observed that the averaged polarization of a cell's PC values across a large number of PC subspaces is strongly correlated with their developmental states, with SCs having the most polarized PC values. (This is worth noting. Has anyone noticed it before? Are the PC values of starting cells really that special? We will check the Methods shortly.) We thus provided an unsupervised solution for determining the SCs based on the averaged polarization of a cell's PC values. (Determining the developmental starting point from PC values: I would not call this unsupervised; it has to be semi-supervised.) Of course, the authors' examples look very good, but we should take some care when applying the tool to our own data.

RESULTS

The first step is to validate the tool's reliability on two well-annotated single-cell datasets

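When we run PCA, we usually take the first ten or so PCs for downstream analysis, and Seurat computes 50 PCs by default. The authors, however, use 150 PCs here. What is the rationale? We will need to look for it in the Methods.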

The analysis of these datasets showed the following. For both oligodendrocyte and enterocyte lineages, we found that cells at earlier developmental stages tend to have more extreme PC values (either very small or very large, i.e., highly polarized [so that is what "polarization" means]), while those at later developmental stages tend to have more intermediate PC values (I had never noticed this pattern; it is worth trying on our own data). Such patterns were more obvious if we inspected the density of the PC value quantiles at all 150 PC subspaces for cells at different developmental stages. (The pattern is indeed clear in the paper's figure.)

To quantify the polarization of the PC value quantiles, we next defined a quantile polarization (QP) score that averages the polarization of the PC value quantile of a given cell across all 150 PC subspaces. (This is the definition of QP; honestly, it is the first time I have seen this approach.) The QP score correlates strongly with the developmental hierarchy, with cells at the earliest developmental stages having the greatest QP scores.
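
The QP idea can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes that the "polarization" of a quantile is its absolute distance from the median quantile 0.5, and the function name `quantile_polarization` is made up for illustration.

```python
import numpy as np

def quantile_polarization(pcs: np.ndarray) -> np.ndarray:
    """pcs: (n_cells, n_pcs) matrix of PC values. Returns one QP score per cell."""
    n_cells = pcs.shape[0]
    # Rank the cells within each PC (0 .. n_cells-1); argsort applied twice yields ranks.
    ranks = np.argsort(np.argsort(pcs, axis=0), axis=0)
    # Convert ranks to quantiles in [0, 1].
    quantiles = ranks / (n_cells - 1)
    # Assumed polarization: distance of the quantile from the median quantile 0.5.
    polarization = np.abs(quantiles - 0.5)
    # QP score: average polarization across all PC subspaces.
    return polarization.mean(axis=1)

# Toy input: 200 cells with 150 random "PC values" each.
rng = np.random.default_rng(0)
qp = quantile_polarization(rng.normal(size=(200, 150)))
```

Under this assumption, a cell that sits at the extreme of every PC gets a score of 0.5, and a cell that sits at the median of every PC gets 0.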

We further experimented with using a different number of PCs, and found that such correlations were robust if the number of PCs used could explain ~20%–80% of the total variance.
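
To check where a given number of PCs falls in that 20%–80% window for your own data, the cumulative explained variance can be computed directly. The sketch below uses plain NumPy on a cells-by-genes matrix; the helper name `n_pcs_for_variance` and the random input are illustrative assumptions, and the paper does not describe its own procedure in this form.

```python
import numpy as np

def n_pcs_for_variance(x: np.ndarray, target: float) -> int:
    """Smallest number of PCs whose cumulative explained variance reaches `target`."""
    centered = x - x.mean(axis=0)
    # Singular values of the centered matrix; their squares are proportional
    # to the variance captured by each PC.
    s = np.linalg.svd(centered, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # First index where the cumulative fraction is >= target, as a count of PCs.
    return min(int(np.searchsorted(cumulative, target)) + 1, len(s))

# Toy input: 300 cells x 50 genes of random values.
rng = np.random.default_rng(1)
x = rng.normal(size=(300, 50))
low, high = n_pcs_for_variance(x, 0.2), n_pcs_for_variance(x, 0.8)
```

Any PC count between `low` and `high` would lie inside the range the authors report as robust.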

Inferring trajectories directly on UMAP, an approach also used in Monocle 3

In essence, VECTOR treats a two-dimensional UMAP representation of cells as an image and splits it into a number of pixels. After removing those pixels that do not include any cells, VECTOR focuses on the largest connected pixel (LCP) network in UMAP to infer developmental directions. (So the tool infers trajectories on the UMAP plot itself.) By averaging the QP scores of cells inside each pixel, VECTOR identifies the high-scoring pixels that have the greatest QP scores (top 10% by default). (The polarization of PC values is used to infer the starting cells.) The authors also note that this approach may produce false positives. Here, VECTOR considers not only QP scores but also the connectivity of cells in UMAP; from the high-scoring pixels, it selects the largest connected high-scoring pixels as the starting point of development. (QP scores are combined with the UMAP layout to obtain the starting cells.) Those isolated high-scoring pixels that are likely false positives are then filtered out. (There is actually a flaw here.)

For each pixel in the LCP network, VECTOR computes a pseudotime score defined as its network distance to the starting point of development. (Most tools compute it this way.) Finally, for a given target pixel VECTOR computes a vector (with arrow and length) by taking into consideration the information of all pixels in the LCP network, including the direction of the unit vector pointing from a selected pixel to the target pixel, the relative pseudotime score between the target pixel and the selected pixel, and the closeness of the selected pixel to the target pixel in the LCP network, and so on. (The result is a plot similar to the ones RNA velocyto produces.) The direction of an arrow is the direction of development; arrows are shorter near the starting point and the middle of development, and longer near the end point.
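
The start-point and pseudotime steps described above can be sketched as follows. This is an illustration under stated assumptions rather than VECTOR's actual code: pixels are grid coordinates, two pixels are connected when they touch (8-neighbourhood), high-scoring pixels are picked within the LCP network, and the names `neighbors`, `largest_component`, and `infer_pseudotime` are invented here. The final arrow computation is left out because the paper's description ends in "and so on".

```python
from collections import deque

def neighbors(pixel):
    """8-connected grid neighbours (the connectivity rule is an assumption)."""
    r, c = pixel
    return [(r + dr, c + dc)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def largest_component(pixels):
    """Largest connected group within a collection of pixels."""
    pixels, seen, best = set(pixels), set(), set()
    for start in sorted(pixels):
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            for nb in neighbors(queue.popleft()):
                if nb in pixels and nb not in comp:
                    comp.add(nb)
                    queue.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def infer_pseudotime(pixel_qp, top_fraction=0.1):
    """Return (start pixels, {pixel: network distance to the start})."""
    # Occupied pixels only; the LCP network is their largest connected group.
    lcp = largest_component(pixel_qp)
    # High-scoring pixels: the top fraction of the network by mean QP score.
    k = max(1, round(top_fraction * len(lcp)))
    high = sorted(lcp, key=lambda p: pixel_qp[p], reverse=True)[:k]
    # Keep the largest connected group; isolated high scorers are dropped.
    starts = largest_component(high)
    # Multi-source breadth-first search gives the distance to the start.
    dist = {p: 0 for p in starts}
    queue = deque(sorted(starts))
    while queue:
        cur = queue.popleft()
        for nb in neighbors(cur):
            if nb in lcp and nb not in dist:
                dist[nb] = dist[cur] + 1
                queue.append(nb)
    return starts, dist

# Toy example: a row of ten pixels plus one detached pixel far away.
pixel_qp = {(0, c): 0.1 for c in range(10)}
pixel_qp.update({(0, 0): 0.90, (0, 1): 0.85, (0, 9): 0.88, (5, 5): 0.95})
starts, dist = infer_pseudotime(pixel_qp, top_fraction=0.3)
```

In the toy example the detached pixel (5, 5) never enters the network, and the lone high-scoring pixel (0, 9) is filtered out as isolated, so development starts at the two adjacent pixels on the left and (0, 9) ends up with the largest pseudotime.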

Application examples

The two well-annotated datasets introduced above performed well: the starting point of development and the trajectory were both identified successfully.

Applied to the other example datasets, the results are also good.

Comparison of VECTOR and RNA velocyto

VECTOR performs better. The RNA velocyto result is truncated in places, which may be caused by the lack of intron reads in these cells. Velocyto also has difficulty identifying the starting point of development.

Next, application to data with multiple developmental branches

The results are good. The tool also provides a function for selecting the developmental starting point manually.

METHODS

The workflow of VECTOR

Given a two-dimensional UMAP representation of cells, VECTOR treats it as an image and then splits it into a number of pixels. We provide a parameter called "N" for defining the number of pixels in UMAP.
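
A minimal sketch of this pixelation step, assuming that "N" means the number of bins along each UMAP axis (the excerpt does not say whether N counts bins per axis or pixels in total). The function name `pixelate` and the random inputs are illustrative only.

```python
import numpy as np

def pixelate(umap_xy, scores, n):
    """Split the UMAP bounding box into an n x n grid.

    Returns {(i, j): mean score of the cells in that pixel} for occupied pixels only.
    """
    umap_xy = np.asarray(umap_xy, dtype=float)
    mins = umap_xy.min(axis=0)
    spans = umap_xy.max(axis=0) - mins  # assumed non-zero on both axes
    # Map each coordinate to a bin index 0..n-1; the maximum goes into the last bin.
    idx = np.minimum((n * (umap_xy - mins) / spans).astype(int), n - 1)
    sums, counts = {}, {}
    for (i, j), s in zip(idx.tolist(), scores):
        sums[(i, j)] = sums.get((i, j), 0.0) + float(s)
        counts[(i, j)] = counts.get((i, j), 0) + 1
    # Empty pixels never appear in the dictionary, so they are dropped implicitly.
    return {p: sums[p] / counts[p] for p in sums}

# Toy input: 500 cells with random 2D coordinates and random per-cell scores.
rng = np.random.default_rng(2)
pixel_scores = pixelate(rng.uniform(size=(500, 2)), rng.uniform(size=500), n=30)
```

A larger `n` gives a finer grid with fewer cells per pixel, which makes the pixel network more likely to break into disconnected pieces; a smaller `n` smooths the per-pixel scores but blurs the layout.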

The method involves not only data processing but also ideas borrowed from image processing.

Give it a try.

Life is good, and better with you.

Last edited: 2021.04.28 10:50:39
