MOVICS系列教程(三) RUN Module

前言

通過學習前面幾個模塊癌淮,我們已經(jīng)發(fā)現(xiàn)了基于多組學數(shù)據(jù)找出的乳腺癌各亞型間具有非常顯著的分子差異髓需,而我們?nèi)绻肷钊胪诰蚱浔澈髾C制,就需要找出差異表達的基因是哪些,以及這些基因具有什么樣的功能亏推。而MOVICS第三個模塊颜懊,也就是最后一個模塊,將會幫助我們實現(xiàn)這個目的避诽。

這一模塊的內(nèi)容需要有一定生物學背景知識才能較好的理解龟虎,大家可以通過生信往期的一些推文來進行學習。


主要函數(shù)

這一部分的函數(shù)主要是行駛差異分析以及富集分析的功能沙庐,MOVICS內(nèi)置了多種常用分析的函數(shù)鲤妥,大家使用時可以根據(jù)需要進行選擇佳吞。

runDEA(): run differential expression analysis with three popular methods for choosing, including edgeR, DESeq2, and limma runMarker(): run biomarker identification to determine uniquely and significantly differential expressed genes for each subtype

runGSEA(): run gene set enrichment analysis (GSEA), calculate activity of functional pathways and generate a pathway-specific heatmap

runGSVA(): run gene set variation analysis to calculate enrichment score of each sample based on given gene set list of interest

runNTP(): run nearest template prediction based on identified biomarkers to evaluate subtypes in external cohorts

runPAM(): run partition around medoids classifier based on discovery cohort to predict subtypes in external cohorts

runKappa(): run consistency evaluation using Kappa statistics between two appraisements that identify or predict current subtypes


代碼演示

下面Immugent將進行實操演示:

# run DEA with edgeR
runDEA(dea.method = "edger",
       expr       = count, # raw count data
       moic.res   = cmoic.brca,
       prefix     = "TCGA-BRCA") # prefix of figure name
#> --all samples matched.
#> --you choose edger and please make sure an RNA-Seq count data was provided.
#> edger of CS1_vs_Others done...
#> edger of CS2_vs_Others done...
#> edger of CS3_vs_Others done...
#> edger of CS4_vs_Others done...
#> edger of CS5_vs_Others done...

# run DEA with DESeq2
runDEA(dea.method = "deseq2",
       expr       = count,
       moic.res   = cmoic.brca,
       prefix     = "TCGA-BRCA")
#> --all samples matched.
#> --you choose deseq2 and please make sure an RNA-Seq count data was provided.
#> deseq2 of CS1_vs_Others done...
#> deseq2 of CS2_vs_Others done...
#> deseq2 of CS3_vs_Others done...
#> deseq2 of CS4_vs_Others done...
#> deseq2 of CS5_vs_Others done...

# run DEA with limma
runDEA(dea.method = "limma",
       expr       = fpkm, # normalized expression data
       moic.res   = cmoic.brca,
       prefix     = "TCGA-BRCA")

算出差異基因后,做個熱圖來看一下棉安。

# choose edgeR result to identify subtype-specific up-regulated biomarkers
marker.up <- runMarker(moic.res      = cmoic.brca,
                       dea.method    = "edger", # name of DEA method
                       prefix        = "TCGA-BRCA", # MUST be the same of argument in runDEA()
                       dat.path      = getwd(), # path of DEA files
                       res.path      = getwd(), # path to save marker files
                       p.cutoff      = 0.05, # p cutoff to identify significant DEGs
                       p.adj.cutoff  = 0.05, # padj cutoff to identify significant DEGs
                       dirct         = "up", # direction of dysregulation in expression
                       n.marker      = 100, # number of biomarkers for each subtype
                       doplot        = TRUE, # generate diagonal heatmap
                       norm.expr     = fpkm, # use normalized expression as heatmap input
                       annCol        = annCol, # sample annotation in heatmap
                       annColors     = annColors, # colors for sample annotation
                       show_rownames = FALSE, # show no rownames (biomarker name)
                       fig.name      = "UPREGULATED BIOMARKER HEATMAP")
圖片
# MUST locate ABSOLUTE path of msigdb file
MSIGDB.FILE <- system.file("extdata", "c5.bp.v7.1.symbols.xls", package = "MOVICS", mustWork = TRUE)

# run GSEA to identify up-regulated GO pathways using results from edgeR
gsea.up <- runGSEA(moic.res     = cmoic.brca,
                   dea.method   = "edger", # name of DEA method
                   prefix       = "TCGA-BRCA", # MUST be the same of argument in runDEA()
                   dat.path     = getwd(), # path of DEA files
                   res.path     = getwd(), # path to save GSEA files
                   msigdb.path  = MSIGDB.FILE, # MUST be the ABSOLUTE path of msigdb file
                   norm.expr    = fpkm, # use normalized expression to calculate enrichment score
                   dirct        = "up", # direction of dysregulation in pathway
                   p.cutoff     = 0.05, # p cutoff to identify significant pathways
                   p.adj.cutoff = 0.25, # padj cutoff to identify significant pathways
                   gsva.method  = "gsva", # method to calculate single sample enrichment score
                   norm.method  = "mean", # normalization method to calculate subtype-specific enrichment score
                   fig.name     = "UPREGULATED PATHWAY HEATMAP")
圖片

除了GSEA分析還能進行GSVA分析底扳。

# MUST locate ABSOLUTE path of gene set file
GSET.FILE <- 
  system.file("extdata", "gene sets of interest.gmt", package = "MOVICS", mustWork = TRUE)
圖片
# run NTP in Yau cohort by using up-regulated biomarkers
yau.ntp.pred <- runNTP(expr       = brca.yau$mRNA.expr,
                       templates  = marker.up$templates, # the template has been already prepared in runMarker()
                       scaleFlag  = TRUE, # scale input data (by default)
                       centerFlag = TRUE, # center input data (by default)
                       doPlot     = TRUE, # to generate heatmap
                       fig.name   = "NTP HEATMAP FOR YAU") 
圖片
# compare survival outcome in Yau cohort
surv.yau <- compSurv(moic.res         = yau.ntp.pred,
                     surv.info        = brca.yau$clin.info,
                     convt.time       = "m", # switch to month
                     surv.median.line = "hv", # switch to both
                     fig.name         = "KAPLAN-MEIER CURVE OF NTP FOR YAU") 
圖片
# compare agreement in Yau cohort
agree.yau <- compAgree(moic.res  = yau.ntp.pred,
                       subt2comp = brca.yau$clin.info[, "PAM50", drop = FALSE],
                       doPlot    = TRUE,
                       fig.name  = "YAU PREDICTEDMOIC WITH PAM50")
圖片

除了NTP之外,MOVICS還提供了另一種無模型方法來預測亞型贡耽。具體來說花盐,runPAM()首先在發(fā)現(xiàn)訓練集(即TCGA-BRCA)中對medoids (PAM)分類器進行分區(qū)訓練,以預測外部驗證測試集(即BRCA-Yau)中患者的亞型菇爪,驗證隊列中的每個樣本被分配到一個子類型標簽,其質(zhì)心與樣本的皮爾遜相關性最高柒昏。最后凳宙,將執(zhí)行組內(nèi)比例(IGP)統(tǒng)計來評估發(fā)現(xiàn)隊列和驗證隊列之間獲得的亞型的相似性和重現(xiàn)性。

yau.pam.pred <- runPAM(train.expr  = fpkm,
                       moic.res    = cmoic.brca,
                       test.expr   = brca.yau$mRNA.expr)
                       
# predict subtype in discovery cohort using NTP
tcga.ntp.pred <- runNTP(expr      = fpkm,
                        templates = marker.up$templates,
                        doPlot    = FALSE) 
#> --original template has 500 biomarkers and 500 are matched in external expression profile.
#> cosine correlation distance
#> 643 samples; 5 classes; 100-100 features/class
#> serial processing; 1000 permutation(s)...
#> predicted samples/class (FDR<0.05)
#> 
#>  CS1  CS2  CS3  CS4  CS5 <NA> 
#>   99  105  138  155  107   39

# predict subtype in discovery cohort using PAM
tcga.pam.pred <- runPAM(train.expr  = fpkm,
                        moic.res    = cmoic.brca,
                        test.expr   = fpkm)
#> --all samples matched.
#> --a total of 13771 genes shared and used.
#> --log2 transformation done for training expression data.
#> --log2 transformation done for testing expression data.

# check consistency between current and NTP-predicted subtype in discovery TCGA-BRCA
runKappa(subt1     = cmoic.brca$clust.res$clust,
         subt2     = tcga.ntp.pred$clust.res$clust,
         subt1.lab = "CMOIC",
         subt2.lab = "NTP",
         fig.name  = "CONSISTENCY HEATMAP FOR TCGA between CMOIC and NTP")

# check consistency between current and PAM-predicted subtype in discovery TCGA-BRCA
runKappa(subt1     = cmoic.brca$clust.res$clust,
         subt2     = tcga.pam.pred$clust.res$clust,
         subt1.lab = "CMOIC",
         subt2.lab = "PAM",
         fig.name  = "CONSISTENCY HEATMAP FOR TCGA between CMOIC and PAM")

# check consistency between NTP and PAM-predicted subtype in validation Yau-BRCA
runKappa(subt1     = yau.ntp.pred$clust.res$clust,
         subt2     = yau.pam.pred$clust.res$clust,
         subt1.lab = "NTP",
         subt2.lab = "PAM",
         fig.name  = "CONSISTENCY HEATMAP FOR YAU")                  
圖片

總結(jié)

截止到這篇推文职祷,有關MOVICS包的所有實操部分內(nèi)容都已結(jié)束氏涩,本來對這個R包的所有介紹內(nèi)容都應該完結(jié)了。

但是Immugent本著對粉絲極度負責任的態(tài)度有梆,后續(xù)專門會解讀一篇完全依靠這個R包來分析的SCI文章是尖,直到教會大家如何使用MOVICS包分析自己的數(shù)據(jù)為止,敬請期待泥耀!

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
  • 序言:七十年代末饺汹,一起剝皮案震驚了整個濱河市,隨后出現(xiàn)的幾起案子痰催,更是在濱河造成了極大的恐慌兜辞,老刑警劉巖,帶你破解...
    沈念sama閱讀 221,635評論 6 515
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件夸溶,死亡現(xiàn)場離奇詭異逸吵,居然都是意外死亡,警方通過查閱死者的電腦和手機缝裁,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 94,543評論 3 399
  • 文/潘曉璐 我一進店門扫皱,熙熙樓的掌柜王于貴愁眉苦臉地迎上來,“玉大人捷绑,你說我怎么就攤上這事韩脑。” “怎么了胎食?”我有些...
    開封第一講書人閱讀 168,083評論 0 360
  • 文/不壞的土叔 我叫張陵扰才,是天一觀的道長。 經(jīng)常有香客問我厕怜,道長衩匣,這世上最難降的妖魔是什么蕾总? 我笑而不...
    開封第一講書人閱讀 59,640評論 1 296
  • 正文 為了忘掉前任,我火速辦了婚禮琅捏,結(jié)果婚禮上生百,老公的妹妹穿的比我還像新娘。我一直安慰自己柄延,他們只是感情好蚀浆,可當我...
    茶點故事閱讀 68,640評論 6 397
  • 文/花漫 我一把揭開白布。 她就那樣靜靜地躺著搜吧,像睡著了一般市俊。 火紅的嫁衣襯著肌膚如雪。 梳的紋絲不亂的頭發(fā)上滤奈,一...
    開封第一講書人閱讀 52,262評論 1 308
  • 那天摆昧,我揣著相機與錄音,去河邊找鬼蜒程。 笑死绅你,一個胖子當著我的面吹牛,可吹牛的內(nèi)容都是我干的昭躺。 我是一名探鬼主播忌锯,決...
    沈念sama閱讀 40,833評論 3 421
  • 文/蒼蘭香墨 我猛地睜開眼,長吁一口氣:“原來是場噩夢啊……” “哼领炫!你這毒婦竟也來了偶垮?” 一聲冷哼從身側(cè)響起,我...
    開封第一講書人閱讀 39,736評論 0 276
  • 序言:老撾萬榮一對情侶失蹤帝洪,失蹤者是張志新(化名)和其女友劉穎针史,沒想到半個月后,有當?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體碟狞,經(jīng)...
    沈念sama閱讀 46,280評論 1 319
  • 正文 獨居荒郊野嶺守林人離奇死亡啄枕,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點故事閱讀 38,369評論 3 340
  • 正文 我和宋清朗相戀三年,在試婚紗的時候發(fā)現(xiàn)自己被綠了族沃。 大學時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片频祝。...
    茶點故事閱讀 40,503評論 1 352
  • 序言:一個原本活蹦亂跳的男人離奇死亡,死狀恐怖脆淹,靈堂內(nèi)的尸體忽然破棺而出常空,到底是詐尸還是另有隱情,我是刑警寧澤盖溺,帶...
    沈念sama閱讀 36,185評論 5 350
  • 正文 年R本政府宣布漓糙,位于F島的核電站,受9級特大地震影響烘嘱,放射性物質(zhì)發(fā)生泄漏昆禽。R本人自食惡果不足惜蝗蛙,卻給世界環(huán)境...
    茶點故事閱讀 41,870評論 3 333
  • 文/蒙蒙 一、第九天 我趴在偏房一處隱蔽的房頂上張望醉鳖。 院中可真熱鬧捡硅,春花似錦、人聲如沸盗棵。這莊子的主人今日做“春日...
    開封第一講書人閱讀 32,340評論 0 24
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽纹因。三九已至喷屋,卻和暖如春,著一層夾襖步出監(jiān)牢的瞬間瞭恰,已是汗流浹背逼蒙。 一陣腳步聲響...
    開封第一講書人閱讀 33,460評論 1 272
  • 我被黑心中介騙來泰國打工, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留寄疏,地道東北人。 一個月前我還...
    沈念sama閱讀 48,909評論 3 376
  • 正文 我出身青樓僵井,卻偏偏與公主長得像陕截,于是被迫代替她去往敵國和親。 傳聞我的和親對象是個殘疾皇子批什,可洞房花燭夜當晚...
    茶點故事閱讀 45,512評論 2 359

推薦閱讀更多精彩內(nèi)容