MOVICS系列教程(三) RUN Module

前言

通過學習前面幾個模塊癌淮，我們已經(jīng)發(fā)現(xiàn)了基于多組學數(shù)據(jù)找出的乳腺癌各亞型間具有非常顯著的分子差異髓需，而我們?nèi)绻肷钊胪诰蚱浔澈髾C制，就需要找出差異表達的基因是哪些,以及這些基因具有什么樣的功能亏推。而MOVICS第三個模塊颜懊，也就是最后一個模塊，將會幫助我們實現(xiàn)這個目的避诽。

這一模塊的內(nèi)容需要有一定生物學背景知識才能較好的理解龟虎，大家可以通過生信往期的一些推文來進行學習。

主要函數(shù)

這一部分的函數(shù)主要是行駛差異分析以及富集分析的功能沙庐，MOVICS內(nèi)置了多種常用分析的函數(shù)鲤妥，大家使用時可以根據(jù)需要進行選擇佳吞。

runDEA(): run differential expression analysis with three popular methods for choosing, including edgeR, DESeq2, and limma runMarker(): run biomarker identification to determine uniquely and significantly differential expressed genes for each subtype

runGSEA(): run gene set enrichment analysis (GSEA), calculate activity of functional pathways and generate a pathway-specific heatmap

runGSVA(): run gene set variation analysis to calculate enrichment score of each sample based on given gene set list of interest

runNTP(): run nearest template prediction based on identified biomarkers to evaluate subtypes in external cohorts

runPAM(): run partition around medoids classifier based on discovery cohort to predict subtypes in external cohorts

runKappa(): run consistency evaluation using Kappa statistics between two appraisements that identify or predict current subtypes

代碼演示

下面Immugent將進行實操演示：

# run DEA with edgeR
runDEA(dea.method = "edger",
       expr       = count, # raw count data
       moic.res   = cmoic.brca,
       prefix     = "TCGA-BRCA") # prefix of figure name
#> --all samples matched.
#> --you choose edger and please make sure an RNA-Seq count data was provided.
#> edger of CS1_vs_Others done...
#> edger of CS2_vs_Others done...
#> edger of CS3_vs_Others done...
#> edger of CS4_vs_Others done...
#> edger of CS5_vs_Others done...

# run DEA with DESeq2
runDEA(dea.method = "deseq2",
       expr       = count,
       moic.res   = cmoic.brca,
       prefix     = "TCGA-BRCA")
#> --all samples matched.
#> --you choose deseq2 and please make sure an RNA-Seq count data was provided.
#> deseq2 of CS1_vs_Others done...
#> deseq2 of CS2_vs_Others done...
#> deseq2 of CS3_vs_Others done...
#> deseq2 of CS4_vs_Others done...
#> deseq2 of CS5_vs_Others done...

# run DEA with limma
runDEA(dea.method = "limma",
       expr       = fpkm, # normalized expression data
       moic.res   = cmoic.brca,
       prefix     = "TCGA-BRCA")

算出差異基因后，做個熱圖來看一下棉安。

# choose edgeR result to identify subtype-specific up-regulated biomarkers
marker.up <- runMarker(moic.res      = cmoic.brca,
                       dea.method    = "edger", # name of DEA method
                       prefix        = "TCGA-BRCA", # MUST be the same of argument in runDEA()
                       dat.path      = getwd(), # path of DEA files
                       res.path      = getwd(), # path to save marker files
                       p.cutoff      = 0.05, # p cutoff to identify significant DEGs
                       p.adj.cutoff  = 0.05, # padj cutoff to identify significant DEGs
                       dirct         = "up", # direction of dysregulation in expression
                       n.marker      = 100, # number of biomarkers for each subtype
                       doplot        = TRUE, # generate diagonal heatmap
                       norm.expr     = fpkm, # use normalized expression as heatmap input
                       annCol        = annCol, # sample annotation in heatmap
                       annColors     = annColors, # colors for sample annotation
                       show_rownames = FALSE, # show no rownames (biomarker name)
                       fig.name      = "UPREGULATED BIOMARKER HEATMAP")

圖片

# MUST locate ABSOLUTE path of msigdb file
MSIGDB.FILE <- system.file("extdata", "c5.bp.v7.1.symbols.xls", package = "MOVICS", mustWork = TRUE)

# run GSEA to identify up-regulated GO pathways using results from edgeR
gsea.up <- runGSEA(moic.res     = cmoic.brca,
                   dea.method   = "edger", # name of DEA method
                   prefix       = "TCGA-BRCA", # MUST be the same of argument in runDEA()
                   dat.path     = getwd(), # path of DEA files
                   res.path     = getwd(), # path to save GSEA files
                   msigdb.path  = MSIGDB.FILE, # MUST be the ABSOLUTE path of msigdb file
                   norm.expr    = fpkm, # use normalized expression to calculate enrichment score
                   dirct        = "up", # direction of dysregulation in pathway
                   p.cutoff     = 0.05, # p cutoff to identify significant pathways
                   p.adj.cutoff = 0.25, # padj cutoff to identify significant pathways
                   gsva.method  = "gsva", # method to calculate single sample enrichment score
                   norm.method  = "mean", # normalization method to calculate subtype-specific enrichment score
                   fig.name     = "UPREGULATED PATHWAY HEATMAP")

圖片

除了GSEA分析還能進行GSVA分析底扳。

# MUST locate ABSOLUTE path of gene set file
GSET.FILE <- 
  system.file("extdata", "gene sets of interest.gmt", package = "MOVICS", mustWork = TRUE)

圖片

# run NTP in Yau cohort by using up-regulated biomarkers
yau.ntp.pred <- runNTP(expr       = brca.yau$mRNA.expr,
                       templates  = marker.up$templates, # the template has been already prepared in runMarker()
                       scaleFlag  = TRUE, # scale input data (by default)
                       centerFlag = TRUE, # center input data (by default)
                       doPlot     = TRUE, # to generate heatmap
                       fig.name   = "NTP HEATMAP FOR YAU")

圖片

# compare survival outcome in Yau cohort
surv.yau <- compSurv(moic.res         = yau.ntp.pred,
                     surv.info        = brca.yau$clin.info,
                     convt.time       = "m", # switch to month
                     surv.median.line = "hv", # switch to both
                     fig.name         = "KAPLAN-MEIER CURVE OF NTP FOR YAU")

圖片

# compare agreement in Yau cohort
agree.yau <- compAgree(moic.res  = yau.ntp.pred,
                       subt2comp = brca.yau$clin.info[, "PAM50", drop = FALSE],
                       doPlot    = TRUE,
                       fig.name  = "YAU PREDICTEDMOIC WITH PAM50")

圖片

除了NTP之外，MOVICS還提供了另一種無模型方法來預測亞型贡耽。具體來說花盐，runPAM()首先在發(fā)現(xiàn)訓練集(即TCGA-BRCA)中對medoids (PAM)分類器進行分區(qū)訓練，以預測外部驗證測試集(即BRCA-Yau)中患者的亞型菇爪，驗證隊列中的每個樣本被分配到一個子類型標簽，其質(zhì)心與樣本的皮爾遜相關性最高柒昏。最后凳宙，將執(zhí)行組內(nèi)比例(IGP)統(tǒng)計來評估發(fā)現(xiàn)隊列和驗證隊列之間獲得的亞型的相似性和重現(xiàn)性。

yau.pam.pred <- runPAM(train.expr  = fpkm,
                       moic.res    = cmoic.brca,
                       test.expr   = brca.yau$mRNA.expr)
                       
# predict subtype in discovery cohort using NTP
tcga.ntp.pred <- runNTP(expr      = fpkm,
                        templates = marker.up$templates,
                        doPlot    = FALSE) 
#> --original template has 500 biomarkers and 500 are matched in external expression profile.
#> cosine correlation distance
#> 643 samples; 5 classes; 100-100 features/class
#> serial processing; 1000 permutation(s)...
#> predicted samples/class (FDR<0.05)
#> 
#>  CS1  CS2  CS3  CS4  CS5 <NA> 
#>   99  105  138  155  107   39

# predict subtype in discovery cohort using PAM
tcga.pam.pred <- runPAM(train.expr  = fpkm,
                        moic.res    = cmoic.brca,
                        test.expr   = fpkm)
#> --all samples matched.
#> --a total of 13771 genes shared and used.
#> --log2 transformation done for training expression data.
#> --log2 transformation done for testing expression data.

# check consistency between current and NTP-predicted subtype in discovery TCGA-BRCA
runKappa(subt1     = cmoic.brca$clust.res$clust,
         subt2     = tcga.ntp.pred$clust.res$clust,
         subt1.lab = "CMOIC",
         subt2.lab = "NTP",
         fig.name  = "CONSISTENCY HEATMAP FOR TCGA between CMOIC and NTP")

# check consistency between current and PAM-predicted subtype in discovery TCGA-BRCA
runKappa(subt1     = cmoic.brca$clust.res$clust,
         subt2     = tcga.pam.pred$clust.res$clust,
         subt1.lab = "CMOIC",
         subt2.lab = "PAM",
         fig.name  = "CONSISTENCY HEATMAP FOR TCGA between CMOIC and PAM")

# check consistency between NTP and PAM-predicted subtype in validation Yau-BRCA
runKappa(subt1     = yau.ntp.pred$clust.res$clust,
         subt2     = yau.pam.pred$clust.res$clust,
         subt1.lab = "NTP",
         subt2.lab = "PAM",
         fig.name  = "CONSISTENCY HEATMAP FOR YAU")

圖片

總結(jié)

截止到這篇推文职祷，有關MOVICS包的所有實操部分內(nèi)容都已結(jié)束氏涩，本來對這個R包的所有介紹內(nèi)容都應該完結(jié)了。

但是Immugent本著對粉絲極度負責任的態(tài)度有梆，后續(xù)專門會解讀一篇完全依靠這個R包來分析的SCI文章是尖，直到教會大家如何使用MOVICS包分析自己的數(shù)據(jù)為止，敬請期待泥耀！

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

人面猴
序言：七十年代末饺汹，一起剝皮案震驚了整個濱河市，隨后出現(xiàn)的幾起案子痰催，更是在濱河造成了極大的恐慌兜辞，老刑警劉巖，帶你破解...
沈念sama閱讀 221,635評論 6贊 515
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件夸溶，死亡現(xiàn)場離奇詭異逸吵，居然都是意外死亡，警方通過查閱死者的電腦和手機缝裁，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 94,543評論 3贊 399
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進店門扫皱，熙熙樓的掌柜王于貴愁眉苦臉地迎上來，“玉大人捷绑，你說我怎么就攤上這事韩脑。” “怎么了胎食？”我有些...
開封第一講書人閱讀 168,083評論 0贊 360
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵扰才，是天一觀的道長。經(jīng)常有香客問我厕怜，道長衩匣，這世上最難降的妖魔是什么蕾总？我笑而不...
開封第一講書人閱讀 59,640評論 1贊 296
?港島之戀（遺憾婚禮）
正文為了忘掉前任，我火速辦了婚禮琅捏，結(jié)果婚禮上生百，老公的妹妹穿的比我還像新娘。我一直安慰自己柄延，他們只是感情好蚀浆，可當我...
茶點故事閱讀 68,640評論 6贊 397
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布。她就那樣靜靜地躺著搜吧，像睡著了一般市俊。火紅的嫁衣襯著肌膚如雪。梳的紋絲不亂的頭發(fā)上滤奈，一...
開封第一講書人閱讀 52,262評論 1贊 308
城市分裂傳說
那天摆昧，我揣著相機與錄音，去河邊找鬼蜒程。笑死绅你，一個胖子當著我的面吹牛，可吹牛的內(nèi)容都是我干的昭躺。我是一名探鬼主播忌锯，決...
沈念sama閱讀 40,833評論 3贊 421
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼，長吁一口氣：“原來是場噩夢啊……” “哼领炫！你這毒婦竟也來了偶垮？” 一聲冷哼從身側(cè)響起，我...
開封第一講書人閱讀 39,736評論 0贊 276
萬榮殺人案實錄
序言：老撾萬榮一對情侶失蹤帝洪，失蹤者是張志新（化名）和其女友劉穎针史，沒想到半個月后，有當?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體碟狞，經(jīng)...
沈念sama閱讀 46,280評論 1贊 319
?護林員之死
正文獨居荒郊野嶺守林人離奇死亡啄枕，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點故事閱讀 38,369評論 3贊 340
?白月光啟示錄
正文我和宋清朗相戀三年，在試婚紗的時候發(fā)現(xiàn)自己被綠了族沃。大學時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片频祝。...
茶點故事閱讀 40,503評論 1贊 352
活死人
序言：一個原本活蹦亂跳的男人離奇死亡，死狀恐怖脆淹，靈堂內(nèi)的尸體忽然破棺而出常空，到底是詐尸還是另有隱情，我是刑警寧澤盖溺，帶...
沈念sama閱讀 36,185評論 5贊 350
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布漓糙，位于F島的核電站，受9級特大地震影響烘嘱，放射性物質(zhì)發(fā)生泄漏昆禽。R本人自食惡果不足惜蝗蛙，卻給世界環(huán)境...
茶點故事閱讀 41,870評論 3贊 333
男人毒藥：我在死后第九天來索命
文/蒙蒙一、第九天我趴在偏房一處隱蔽的房頂上張望醉鳖。院中可真熱鬧捡硅，春花似錦、人聲如沸盗棵。這莊子的主人今日做“春日...
開封第一講書人閱讀 32,340評論 0贊 24
一樁弒父案，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽纹因。三九已至喷屋，卻和暖如春，著一層夾襖步出監(jiān)牢的瞬間瞭恰，已是汗流浹背逼蒙。一陣腳步聲響...
開封第一講書人閱讀 33,460評論 1贊 272
情欲美人皮
我被黑心中介騙來泰國打工，沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留寄疏，地道東北人。一個月前我還...
沈念sama閱讀 48,909評論 3贊 376
代替公主和親
正文我出身青樓僵井，卻偏偏與公主長得像陕截，于是被迫代替她去往敵國和親。傳聞我的和親對象是個殘疾皇子批什，可洞房花燭夜當晚...
茶點故事閱讀 45,512評論 2贊 359

MOVICS系列教程(三) RUN Module

前言

主要函數(shù)

代碼演示

總結(jié)

推薦閱讀更多精彩內(nèi)容