Analyzing untargeted metabolomics data with the MetaboDiff package

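I recently got hold of an untargeted metabolomics dataset, so I am working through the MetaboDiff package to get familiar with the reasoning and workflow of a metabolomics analysis. The workflow below follows the official MetaboDiff help documentation.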

1. Installing the MetaboDiff package

library("devtools")
install_github("andreasmock/MetaboDiff")
library(MetaboDiff)

2. Data processing

2.1 Importing the data

MetaboDiff needs three inputs:

assay - a data matrix containing the relative abundances of the metabolites;
rowData - a data frame containing the metabolite annotations;
colData - a data frame containing the sample metadata.

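The example data shipped with MetaboDiff come from the paper "AKT1 and MYC Induce Distinctive Metabolic Fingerprints in Human Prostate Cancer". The metabolomics data were measured on prostate tissue from 61 prostate cancer patients and 25 normal individuals.
Let's first take a look at these three datasets.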

> assay[1:5,1:5]
         pat1      pat2      pat3     pat4      pat5
met1 33964.73 117318.43 118856.90  78670.7 102565.94
met2 18505.56 167585.32  59621.97  66220.4  74892.27
met3       NA  42373.93  27141.21       NA  38390.78
met4 61638.77  74595.78        NA       NA        NA
met5       NA 148363.61  43861.79 105835.2  25589.08

> head(colData)
       id tumor_normal random_gender   group
pat1  cp2            N        female Control
pat2  cp7            N        female Control
pat3 cp19            N          male Control
pat4 cp26            N          male Control
pat5 cp29            N        female Control
pat6 cp32            N          male Control

> head(rowData)
                                    BIOCHEMICAL    SUPER_PATHWAY      SUB_PATHWAY METABOLON_ID
met1  1-arachidonoylglycerophosphoethanolamine*            Lipid        Lysolipid        35186
met2      1-arachidonoylglycerophosphoinositol*            Lipid        Lysolipid        34214
met3                      1-arachidonylglycerol            Lipid Monoacylglycerol        34397
met4      1-eicosadienoylglycerophosphocholine*            Lipid        Lysolipid        33871
met5 1-heptadecanoylglycerophosphoethanolamine* No Super Pathway       No Pathway        37419
met6       1-linoleoylglycerol (1-monolinolein)            Lipid Monoacylglycerol        27447
      PLATFORM KEGG_ID   HMDB_ID
met1 LC/MS neg    <NA> HMDB11517
met2 LC/MS neg    <NA>      <NA>
met3 LC/MS neg  C13857 HMDB11572
met4 LC/MS pos    <NA>      <NA>
met5 LC/MS neg    <NA>      <NA>
met6 LC/MS neg    <NA>      <NA>
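
The three tables only fit together if the row names of assay match the row names of rowData and the column names of assay match the row names of colData, as in the printouts above. The following is a minimal base-R sketch of that check on toy data; the `toy_*` objects and all values in them are made up for illustration and are not part of MetaboDiff.

```r
# Toy inputs mirroring the layout of assay, rowData and colData shown above.
# All names prefixed with toy_ and all values are hypothetical.
set.seed(1)
toy_assay <- matrix(runif(12, min = 1e4, max = 1e5), nrow = 3,
                    dimnames = list(paste0("met", 1:3), paste0("pat", 1:4)))
toy_assay[2, 1] <- NA  # missing measurements are allowed in the raw matrix

toy_rowData <- data.frame(
  BIOCHEMICAL = c("compound A", "compound B", "compound C"),
  KEGG_ID     = c(NA, "C13857", NA),
  row.names   = rownames(toy_assay),
  stringsAsFactors = FALSE
)

toy_colData <- data.frame(
  tumor_normal = c("N", "N", "T", "T"),
  row.names    = colnames(toy_assay),
  stringsAsFactors = FALSE
)

# Metabolites must line up between assay and rowData,
# samples must line up between assay and colData.
inputs_aligned <- identical(rownames(toy_assay), rownames(toy_rowData)) &&
  identical(colnames(toy_assay), rownames(toy_colData))
print(inputs_aligned)
```

Running a check like this on your own tables before calling create_mae makes a mismatch in sample or metabolite names easy to spot.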

#Merge the three datasets into a single object for downstream analysis.
> (met <- create_mae(assay,rowData,colData))
A MultiAssayExperiment object of 1 listed
 experiment with a user-defined name and respective class. 
 Containing an ExperimentList class object of length 1: 
 [1] raw: SummarizedExperiment with 307 rows and 86 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices

2.2 Metabolite annotation

If HMDB, KEGG, or ChEBI IDs are part of the rowData dataset, metabolite annotations can be retrieved from the Small Molecule Pathway Database (SMPDB).

> met <- get_SMPDBanno(met,
+                           column_kegg_id=6,
+                           column_hmdb_id=7,
+                           column_chebi_id=NA)
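
The values 6 and 7 in the call above are column positions in rowData: in the printout from section 2.1, KEGG_ID is the sixth column and HMDB_ID the seventh. As a small sketch, the positions can be looked up by name instead of being counted by hand; `toy_rowData` below is a hypothetical one-row stand-in that only reproduces the column names.

```r
# Column names copied from the rowData printout in section 2.1.
cols <- c("BIOCHEMICAL", "SUPER_PATHWAY", "SUB_PATHWAY", "METABOLON_ID",
          "PLATFORM", "KEGG_ID", "HMDB_ID")

# Hypothetical one-row stand-in for rowData; only the column names matter here.
toy_rowData <- as.data.frame(
  matrix(NA_character_, nrow = 1, ncol = length(cols),
         dimnames = list("met1", cols)),
  stringsAsFactors = FALSE
)

# Look up the positions by name rather than hard-coding them.
kegg_col <- match("KEGG_ID", colnames(toy_rowData))
hmdb_col <- match("HMDB_ID", colnames(toy_rowData))
print(c(kegg = kegg_col, hmdb = hmdb_col))
```

With your own annotation table, the two looked-up positions can be passed as column_kegg_id and column_hmdb_id.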

2.3 Handling missing values

> na_heatmap(met,
+            group_factor="tumor_normal",
+            label_colors=c("darkseagreen","dodgerblue"))

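#Handle missing values: knn_impute fills them in by k-nearest-neighbour imputation; metabolites with too many missing values (cutoff=0.4, i.e. a 40% threshold) are left out of the imputed assay, which is why it has 238 rows instead of 307.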
> (met = knn_impute(met,cutoff=0.4))
A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 2: 
 [1] raw: SummarizedExperiment with 307 rows and 86 columns 
 [2] imputed: SummarizedExperiment with 238 rows and 86 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices
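
The drop from 307 to 238 rows comes from the missing-value cutoff. The sketch below reproduces only the filtering idea in base R, using the five-by-five excerpt of assay printed in section 2.1. It does not perform the k-nearest-neighbour imputation itself, and treating the cutoff as a strict "less than" is an assumption rather than something confirmed from the package source.

```r
# The 5 x 5 excerpt of the assay matrix shown in section 2.1.
excerpt <- matrix(
  c(33964.73, 117318.43, 118856.90,  78670.7, 102565.94,
    18505.56, 167585.32,  59621.97,  66220.4,  74892.27,
          NA,  42373.93,  27141.21,       NA,  38390.78,
    61638.77,  74595.78,        NA,       NA,        NA,
          NA, 148363.61,  43861.79, 105835.2,  25589.08),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("met", 1:5), paste0("pat", 1:5))
)

# Fraction of samples in which each metabolite is missing.
missing_fraction <- rowMeans(is.na(excerpt))

# Keep metabolites below the cutoff (strict comparison is an assumption).
cutoff <- 0.4
kept <- names(missing_fraction)[missing_fraction < cutoff]
print(missing_fraction)
print(kept)
```

On this excerpt met3 is missing in 2 of 5 samples and met4 in 3 of 5, so under the strict comparison both are filtered out before imputation.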

2.4 Outlier heatmap

Before normalizing the data, we need to remove outliers from it.

> outlier_heatmap(met,
+                 group_factor="tumor_normal",
+                 label_colors=c("darkseagreen","dodgerblue"),
+                 k=2)

With k=2, the heatmap above splits the samples into cluster1 and cluster2. Cluster1 is the outlier group relative to cluster2, so we remove cluster1.

> (met <- remove_cluster(met,cluster=1))
harmonizing input:
  removing 5 sampleMap rows with 'colname' not in colnames of experiments
harmonizing input:
  removing 5 sampleMap rows with 'colname' not in colnames of experiments
  removing 5 colData rownames not in sampleMap 'primary'
A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 2: 
 [1] raw: SummarizedExperiment with 307 rows and 81 columns 
 [2] imputed: SummarizedExperiment with 238 rows and 81 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices

2.5 Data normalization

> (met <- normalize_met(met))
vsn2: 307 x 81 matrix (1 stratum). 
Please use 'meanSdPlot' to verify the fit.
vsn2: 238 x 81 matrix (1 stratum). 
Please use 'meanSdPlot' to verify the fit.
A MultiAssayExperiment object of 4 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 4: 
 [1] raw: SummarizedExperiment with 307 rows and 81 columns 
 [2] imputed: SummarizedExperiment with 238 rows and 81 columns 
 [3] norm: SummarizedExperiment with 307 rows and 81 columns 
 [4] norm_imputed: SummarizedExperiment with 238 rows and 81 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices

2.6 Quality control of the normalization

> quality_plot(met,
+              group_factor="tumor_normal",
+              label_colors=c("darkseagreen","dodgerblue"))
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
harmonizing input:
  removing 243 sampleMap rows not in names(experiments)
Warning messages:
1: Removed 5356 rows containing non-finite values (stat_boxplot). 
2: Removed 5356 rows containing non-finite values (stat_boxplot).

3. Data analysis

3.1 Unsupervised analysis

MetaboDiff provides the linear dimensionality reduction method PCA and the non-linear dimensionality reduction method tSNE.

> source("http://peterhaschke.com/Code/multiplot.R")
> multiplot(
+   pca_plot(met,
+            group_factor="tumor_normal",
+            label_colors=c("darkseagreen","dodgerblue")),
+   tsne_plot(met,
+             group_factor="tumor_normal",
+             label_colors=c("darkseagreen","dodgerblue")),
+   cols=2)
sigma summary: Min. : 0.486945518988849 |1st Qu. : 0.714292832194587 |Median : 0.752934663223126 |Mean : 0.75914557339073 |3rd Qu. : 0.808081774279559 |Max. : 0.939549187337462 |
Epoch: Iteration #100 error is: 18.6145995899728
Epoch: Iteration #200 error is: 1.54407709770312
Epoch: Iteration #300 error is: 1.22290267643501
Epoch: Iteration #400 error is: 1.11106327484334
Epoch: Iteration #500 error is: 1.03658104678225
Epoch: Iteration #600 error is: 0.976566767973725
Epoch: Iteration #700 error is: 0.951849496540308
Epoch: Iteration #800 error is: 0.93612964053674
Epoch: Iteration #900 error is: 0.914421902208305
Epoch: Iteration #1000 error is: 0.88283039690459
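
To see what the PCA panel is built from, here is a rough base-R sketch using prcomp on a toy matrix. The data are simulated and the group shift is made up; this is not how pca_plot is implemented internally, only the same kind of projection.

```r
# Simulated data: 20 metabolites (rows) x 6 samples (columns), values made up.
set.seed(42)
toy <- matrix(rnorm(20 * 6), nrow = 20,
              dimnames = list(paste0("met", 1:20), paste0("pat", 1:6)))
group <- rep(c("N", "T"), each = 3)

# Shift the first ten metabolites in the T samples so the groups differ.
toy[1:10, group == "T"] <- toy[1:10, group == "T"] + 3

# prcomp expects samples in rows, so the matrix is transposed.
pca <- prcomp(t(toy), center = TRUE, scale. = TRUE)

# Sample coordinates on the first two components, and variance explained.
scores <- pca$x[, 1:2]
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(scores)
print(round(var_explained, 3))
```

Plotting `scores` coloured by `group` gives the same kind of picture as the PCA panel above.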

3.2 Hypothesis testing

Differential analysis is carried out for each metabolite individually, mainly with the t-test and ANOVA.

> met = diff_test(met,
+                 group_factors = c("tumor_normal","random_gender"))
> str(metadata(met), max.level=2)
List of 2
 $ ttest_tumor_normal_T_vs_N         :'data.frame': 238 obs. of  3 variables:
  ..$ pval       : num [1:238] 0.0206 0.7808 0.0832 0.0432 0.5859 ...
  ..$ adj_pval   : num [1:238] 0.102 0.904 0.221 0.158 0.758 ...
  ..$ fold_change: num [1:238] 0.2872 0.0366 -0.3936 -0.5391 -0.1646 ...
 $ ttest_random_gender_male_vs_female:'data.frame': 238 obs. of  3 variables:
  ..$ pval       : num [1:238] 0.2318 0.8626 0.4048 0.0121 0.2111 ...
  ..$ adj_pval   : num [1:238] 0.83 0.959 0.862 0.386 0.83 ...
  ..$ fold_change: num [1:238] -0.1372 -0.0208 0.1742 0.607 0.3438 ...
#Differential analysis between the tumor and normal groups
> volcano_plot(met, 
+              group_factor="tumor_normal",
+              label_colors=c("darkseagreen","dodgerblue"),
+              p_adjust = FALSE)
> volcano_plot(met, 
+              group_factor="tumor_normal",
+              label_colors=c("darkseagreen","dodgerblue"),
+              p_adjust = TRUE)

#Differential analysis between the female and male groups
> par(mfrow=c(1,2))
> volcano_plot(met, 
+              group_factor="random_gender",
+              label_colors=c("brown","orange"),
+              p_adjust = FALSE)
> volcano_plot(met, 
+              group_factor="random_gender",
+              label_colors=c("brown","orange"),
+              p_adjust = TRUE)
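
The three columns reported by diff_test (pval, adj_pval, fold_change) can be illustrated with a small base-R sketch on simulated data. The choices below (Welch t-test via t.test, Benjamini-Hochberg adjustment, fold change as a difference of group means on the normalized scale) are assumptions made for the illustration and may differ from what MetaboDiff does internally.

```r
# Simulated normalized abundances: 4 metabolites x 10 samples, values made up.
set.seed(7)
group <- factor(rep(c("N", "T"), each = 5))
x <- matrix(rnorm(4 * 10), nrow = 4,
            dimnames = list(paste0("met", 1:4), paste0("pat", 1:10)))
x[1, group == "T"] <- x[1, group == "T"] + 5  # met1 is shifted in tumor samples

# One two-sample t-test per metabolite (Welch test, the t.test default).
pval <- apply(x, 1, function(v) t.test(v[group == "T"], v[group == "N"])$p.value)

res <- data.frame(
  pval        = pval,
  adj_pval    = p.adjust(pval, method = "BH"),  # multiple-testing correction
  fold_change = rowMeans(x[, group == "T"]) - rowMeans(x[, group == "N"])
)
print(res)
```

The two volcano plots per comparison differ only in whether pval or adj_pval is drawn against fold_change, which is what the p_adjust argument switches.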

3.3 Metabolic correlation network analysis

Correlation analysis has been applied successfully in comparative transcriptomics to reveal changes in biologically meaningful modules. The same idea can be applied to metabolomics data.

> met_example <- met_example %>%
+   diss_matrix %>%    #build the dissimilarity matrix
+   identify_modules(min_module_size=5) %>%  #identify metabolic correlation modules
+   name_modules(pathway_annotation="SUB_PATHWAY") %>%  #name the modules
+   calculate_MS(group_factors=c("tumor_normal","random_gender")) #compute the significance of each module's association with the sample traits

alpha: 1.000000
 ..cutHeight not given, setting it to 0.991  ===>  99% of the (truncated) height range in dendro.
 ..done.
#Visualize the metabolic correlation modules: hierarchical clustering
> WGCNA::plotDendroAndColors(metadata(met_example)$tree, 
+                            metadata(met_example)$module_color_vector, 
+                            'Module colors', 
+                            dendroLabels = FALSE, 
+                            hang = 0.03,
+                            addGuide = TRUE, 
+                            guideHang = 0.05, main='')

#Visualize the metabolic correlation modules: relationships between the modules
> par(mar=c(2,2,2,2))
> ape::plot.phylo(ape::as.phylo(metadata(met_example)$METree),
+                 type = 'fan',
+                 show.tip.label = FALSE, 
+                 main='')
> ape::tiplabels(frame = 'circle',
+                col='black', 
+                text=rep('',length(unique(metadata(met_example)$modules))), 
+                bg = WGCNA::labels2colors(0:21))

#Visualize the named metabolic correlation modules
> ape::plot.phylo(ape::as.phylo(metadata(met_example)$METree), cex=0.9)

#Visualize the significance of each module's association with tumor versus normal samples
> MS_plot(met_example,
+         group_factor="tumor_normal",
+         p_value_cutoff=0.05,
+         p_adjust=FALSE)

#Visualize the significance of each module's association with sample gender
> MS_plot(met_example,
+         group_factor="random_gender",
+         p_value_cutoff=0.05,
+         p_adjust=FALSE)

#Test the individual metabolites within a module of interest for differences between sample groups
> MOI_plot(met_example,
+          group_factor="tumor_normal",
+          MOI = 2,
+          label_colors=c("darkseagreen","dodgerblue"),
+          p_adjust = FALSE) + xlim(c(-1,8))
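
To make the module idea behind this section concrete, the sketch below groups correlated metabolites using base R only: a correlation matrix is turned into a dissimilarity, clustered hierarchically, and the tree is cut into modules. It is a simplified stand-in on simulated data. The pipeline above goes through WGCNA (as the plotDendroAndColors call suggests), and the fixed k=2 cut here is an assumption made for the toy example.

```r
# Simulated data: two hidden "drivers", each followed by five metabolites.
set.seed(3)
n_samples <- 30
driver_a <- rnorm(n_samples)
driver_b <- rnorm(n_samples)
toy <- rbind(
  t(sapply(1:5, function(i) driver_a + rnorm(n_samples, sd = 0.3))),
  t(sapply(1:5, function(i) driver_b + rnorm(n_samples, sd = 0.3)))
)
rownames(toy) <- paste0("met", 1:10)

# Metabolite-by-metabolite correlation, converted into a dissimilarity.
cor_mat <- cor(t(toy))
diss <- as.dist(1 - abs(cor_mat))

# Hierarchical clustering, then cut the tree into two modules.
tree <- hclust(diss, method = "average")
modules <- cutree(tree, k = 2)
print(modules)
```

Metabolites that follow the same driver are expected to end up in the same module, which is the pattern the module colours in the dendrogram above summarize for the real data.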
