Original tutorial: Learning to Plot with NC | Comparing scores between the TCGA and MET500 datasets
Article
Genomic and microenvironmental heterogeneity shaping epithelial-to-mesenchymal trajectories in cancer
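Data and code: https://github.com/secrierlab/EMT
Abstract
The epithelial-to-mesenchymal transition (EMT) is a key cellular process in cancer progression, with multiple intermediate states whose molecular hallmarks remain poorly characterised. To fill this gap, we present a method to robustly evaluate EMT transformation in individual tumours based on transcriptomic signals. We apply this approach to explore EMT trajectories in 7180 tumours of epithelial origin and identify three macro-states with prognostic and therapeutic value, attributable to epithelial, hybrid E/M and mesenchymal phenotypes. We show that the hybrid state is relatively stable and linked with increased aneuploidy. We further use spatial transcriptomics and single-cell datasets to explore the spatial heterogeneity of EMT transformation and the distinct interaction patterns with cytotoxic cells, NK cells and fibroblasts in the tumour microenvironment. In addition, we provide a catalogue of genomic events underlying distinct evolutionary constraints on EMT transformation. This study sheds light on the aetiology of distinct stages along the EMT trajectory and highlights broader genomic and environmental hallmarks shaping the mesenchymal transformation of primary tumours.
Drawing the figures
Importing the data
The data preparation steps are described in the paper and are not covered here. The main purpose of this tutorial is to provide code for drawing similar figures.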
# Read the table of pseudospace values and EMT scores (the first row is the header)
input_cp_final2 <- read.table("HighConf_pseudotime_withEMT.txt", header = TRUE)
head(input_cp_final2)
patients tumors mock_pseudospace tgfb_pseudospace EMT_scores
ACC.TCGA.OR.A5J1.01A.11R.A29S.07 ACC 52.3901898499394 42.4277905190899 1.23229228944451
ACC.TCGA.OR.A5J2.01A.11R.A29S.07 ACC 58.2219383281126 52.1312432022196 1.49097707952493
ACC.TCGA.OR.A5J3.01A.11R.A29S.07 ACC 51.5891775929671 41.0837306018038 1.36483064051018
ACC.TCGA.OR.A5J5.01A.11R.A29S.07 ACC 51.963211277127 46.4675216826843 1.79346466983594
ACC.TCGA.OR.A5J6.01A.31R.A29S.07 ACC 48.4272123781731 47.6193563695876 0.883014008020216
ACC.TCGA.OR.A5J7.01A.11R.A29S.07 ACC 62.4042554219295 88.0026656237261 1.10932266858008
ACC.TCGA.OR.A5J8.01A.11R.A29S.07 ACC 63.01585481594 58.0195945444141 2.73551973326235
ACC.TCGA.OR.A5J9.01A.11R.A29S.07 ACC 57.4298354766427 56.3298507121662 1.79610641797267
ACC.TCGA.OR.A5JA.01A.11R.A29S.07 ACC 60.0455600490082 70.6751155732968 1.48756584540459
ACC.TCGA.OR.A5JB.01A.11R.A29S.07 ACC 65.6103767426594 84.6863428316355 1.84836941118404
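Before going further, it can help to confirm that the table loaded with the expected columns and types. The sketch below is a minimal check, assuming the file has the five columns shown in the preview above. It builds a two-row stand-in data frame (values copied from the preview) so that it runs without the original file; `demo` and `check_input` are hypothetical names introduced here, not part of the paper's code.

```r
# Stand-in for the real table: two rows copied from the preview above.
# With the real file, pass the object returned by read.table() instead.
demo <- data.frame(
  patients = c("ACC.TCGA.OR.A5J1.01A.11R.A29S.07",
               "ACC.TCGA.OR.A5J2.01A.11R.A29S.07"),
  tumors = c("ACC", "ACC"),
  mock_pseudospace = c(52.3901898499394, 58.2219383281126),
  tgfb_pseudospace = c(42.4277905190899, 52.1312432022196),
  EMT_scores = c(1.23229228944451, 1.49097707952493),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stop early if a column is missing or not numeric.
check_input <- function(df) {
  needed <- c("patients", "tumors", "mock_pseudospace",
              "tgfb_pseudospace", "EMT_scores")
  missing_cols <- setdiff(needed, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  numeric_cols <- c("mock_pseudospace", "tgfb_pseudospace", "EMT_scores")
  stopifnot(all(vapply(df[numeric_cols], is.numeric, logical(1))))
  invisible(TRUE)
}

check_input(demo)
```

With the real data, the call would be `check_input(input_cp_final2)`; it returns silently when everything is in place.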
Data processing
input_cp_final2$new_ann <- rep("no", nrow(input_cp_final2))
## Classify samples by cohort: IDs containing "TCGA" are TCGA, all others are MET500
input_cp_final2$new_ann[grep(input_cp_final2$patients, pattern = "TCGA")] <- "TCGA"
input_cp_final2$new_ann[grep(input_cp_final2$patients, pattern = "TCGA", invert = TRUE)] <- "MET500"
input_cp_final2$new_ann <- as.factor(input_cp_final2$new_ann)
head(input_cp_final2)
patients tumors mock_pseudospace tgfb_pseudospace EMT_scores new_ann
1 ACC.TCGA.OR.A5J1.01A.11R.A29S.07 ACC 52.39019 42.42779 1.232292 TCGA
2 ACC.TCGA.OR.A5J2.01A.11R.A29S.07 ACC 58.22194 52.13124 1.490977 TCGA
3 ACC.TCGA.OR.A5J3.01A.11R.A29S.07 ACC 51.58918 41.08373 1.364831 TCGA
4 ACC.TCGA.OR.A5J5.01A.11R.A29S.07 ACC 51.96321 46.46752 1.793465 TCGA
5 ACC.TCGA.OR.A5J6.01A.31R.A29S.07 ACC 48.42721 47.61936 0.883014 TCGA
6 ACC.TCGA.OR.A5J7.01A.11R.A29S.07 ACC 62.40426 88.00267 1.109323 TCGA
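The cohort labelling above can also be written as a single vectorised step, which avoids the placeholder value "no". The sketch below is one possible alternative, not the authors' code. The first sample ID is copied from the preview and the second is a made-up non-TCGA ID used only to exercise the other branch.

```r
# Toy sample IDs: the first is copied from the preview above; the second is
# a made-up non-TCGA ID used only to exercise the other branch.
ids <- c("ACC.TCGA.OR.A5J1.01A.11R.A29S.07", "hypothetical_MET500_sample_1")

# grepl() returns one TRUE/FALSE per ID, so ifelse() labels every row at once.
cohort <- factor(ifelse(grepl("TCGA", ids, fixed = TRUE), "TCGA", "MET500"),
                 levels = c("MET500", "TCGA"))

# Count samples per cohort and confirm that no row was left unlabelled.
table(cohort, useNA = "ifany")
```

The level order matters later: `scale_color_manual()` assigns colors in level order, so the first color goes to MET500 and the second to TCGA.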
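High vs. Low classification
# EMT score above 0 is labelled High_EMT, otherwise Low_EMT
input_cp_final2$emt_status <- ifelse(input_cp_final2$EMT_scores > 0, "High_EMT", "Low_EMT")
summary(input_cp_final2$mock_pseudospace)
# Pseudospace values of 50 or below are labelled late pseudotime, larger values early
input_cp_final2$pseudospace_status <- ifelse(input_cp_final2$mock_pseudospace <= 50, "Late_Pseudotime", "Early_Pseudotime")
head(input_cp_final2)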
patients tumors mock_pseudospace tgfb_pseudospace EMT_scores new_ann emt_status
1 ACC.TCGA.OR.A5J1.01A.11R.A29S.07 ACC 52.39019 42.42779 1.232292 TCGA High_EMT
2 ACC.TCGA.OR.A5J2.01A.11R.A29S.07 ACC 58.22194 52.13124 1.490977 TCGA High_EMT
3 ACC.TCGA.OR.A5J3.01A.11R.A29S.07 ACC 51.58918 41.08373 1.364831 TCGA High_EMT
4 ACC.TCGA.OR.A5J5.01A.11R.A29S.07 ACC 51.96321 46.46752 1.793465 TCGA High_EMT
5 ACC.TCGA.OR.A5J6.01A.31R.A29S.07 ACC 48.42721 47.61936 0.883014 TCGA High_EMT
6 ACC.TCGA.OR.A5J7.01A.11R.A29S.07 ACC 62.40426 88.00267 1.109323 TCGA High_EMT
pseudospace_status
1 Early_Pseudotime
2 Early_Pseudotime
3 Early_Pseudotime
4 Early_Pseudotime
5 Late_Pseudotime
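The two binary labels are later cross-tabulated for the chi-square test, so it is worth looking at that table directly. The sketch below applies the same two thresholds (0 for the EMT score, 50 for pseudospace) to a toy data frame. The first five value pairs are copied from the preview above, and the sixth is invented so that a Low_EMT sample appears.

```r
# Toy values: the first five (mock_pseudospace, EMT_scores) pairs are copied
# from the preview above; the sixth is invented to show a Low_EMT sample.
toy <- data.frame(
  mock_pseudospace = c(52.39019, 58.22194, 51.58918, 51.96321, 48.42721, 45.0),
  EMT_scores       = c(1.232292, 1.490977, 1.364831, 1.793465, 0.883014, -0.5)
)

# Same thresholds as in the tutorial code: 0 for EMT score, 50 for pseudospace.
toy$emt_status <- ifelse(toy$EMT_scores > 0, "High_EMT", "Low_EMT")
toy$pseudospace_status <- ifelse(toy$mock_pseudospace <= 50,
                                 "Late_Pseudotime", "Early_Pseudotime")

# Contingency table: rows are pseudotime groups, columns are EMT groups.
tab <- table(toy$pseudospace_status, toy$emt_status)
tab
```

On the real data, `table(input_cp_final2$pseudospace_status, input_cp_final2$emt_status)` gives the corresponding counts for all samples.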
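Plotting
library(ggplot2)
library(ggridges)
library(corrplot)
# EMT score against pseudospace, colored by cohort; the x axis is reversed
ggplot(input_cp_final2,
       aes(x = mock_pseudospace, y = EMT_scores, color = new_ann)) +
  geom_point(shape = 18, size = 2, alpha = 0.5) +
  scale_x_reverse() +
  # Factor levels are alphabetical: MET500 takes the first color, TCGA the second
  scale_color_manual(values = c('#ee0c0c', '#B5B7FF')) +
  # One degree-10 polynomial fit per cohort, drawn with a confidence band
  geom_smooth(aes(group = new_ann), method = "lm", formula = y ~ poly(x, 10),
              se = TRUE, linetype = "dashed", color = "blue") +
  # Reference line at EMT score 0, the High/Low cutoff used above
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  theme_classic()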
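The smoothing layer in the plot asks for a linear model with a degree-10 polynomial in x, fitted separately for each `new_ann` group. The same kind of fit can be reproduced outside ggplot2 with `lm()`, which makes the curve and its confidence band easier to inspect. The sketch below uses base R only and simulated data; the `sim` object and its values are invented for illustration and are not from the paper.

```r
set.seed(1)

# Simulated stand-in for one cohort: 200 samples with pseudospace in [0, 100]
# and a noisy, decreasing score. These numbers are not from the paper.
sim <- data.frame(x = runif(200, min = 0, max = 100))
sim$y <- 1.5 - 0.03 * sim$x + rnorm(200, sd = 0.3)

# Degree-10 orthogonal polynomial fit, the model the smoothing layer requests.
fit <- lm(y ~ poly(x, 10), data = sim)

# Predict on an even grid, with a 95% confidence interval for the mean
# (the kind of band that se = TRUE draws).
grid <- data.frame(x = seq(min(sim$x), max(sim$x), length.out = 50))
pred <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

head(cbind(grid, pred))
```

A degree-10 polynomial is flexible and can oscillate near the ends of the x range, so comparing it against a lower degree is a reasonable check when adapting the figure to other data.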
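# Restrict to the MET500 samples
input_cp_final3 <- input_cp_final2[input_cp_final2$new_ann %in% "MET500", ]
# Chi-square test of pseudotime group against EMT group; [2:1, ] reverses the row order
chisq <- chisq.test(table(input_cp_final3$pseudospace_status, input_cp_final3$emt_status)[2:1, ])
# Percentage contribution of each cell to the chi-square statistic
contrib <- 100 * chisq$residuals^2 / chisq$statistic
# Plot the Pearson residuals and the contributions
corrplot(chisq$residuals, is.cor = FALSE)
corrplot(contrib, is.cor = FALSE)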
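The residual and contribution calculations above can be checked by hand on a small table. The sketch below uses an invented 2 x 2 table of counts and confirms that the Pearson residuals equal (observed - expected) / sqrt(expected). One point to be aware of: for 2 x 2 tables `chisq.test()` applies Yates' continuity correction by default, so the reported statistic is not the plain sum of squared residuals and the contributions need not add up to exactly 100. The sketch sets `correct = FALSE` to make them sum to 100; that setting is an assumption of this example, not something the tutorial code does.

```r
# Toy 2 x 2 table of counts; the numbers are invented for illustration.
tab <- matrix(c(30, 10,
                15, 45),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Early_Pseudotime", "Late_Pseudotime"),
                              c("High_EMT", "Low_EMT")))

# correct = FALSE turns off Yates' continuity correction, so the statistic
# equals the sum of squared Pearson residuals.
res <- chisq.test(tab, correct = FALSE)

# Pearson residuals computed by hand: (observed - expected) / sqrt(expected).
manual_resid <- (res$observed - res$expected) / sqrt(res$expected)
all.equal(as.vector(res$residuals), as.vector(manual_resid))

# Percentage contribution of each cell to the chi-square statistic.
contrib <- 100 * res$residuals^2 / res$statistic
sum(contrib)
```

Positive residuals mark cells with more samples than expected under independence, negative residuals mark cells with fewer, and the contribution table shows which cells drive the test result.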
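小杜的生信筆記 (Xiao Du's Bioinformatics Notes) mainly publishes and collects bioinformatics tutorials, along with R-based analysis and visualization (data analysis, figure drawing and so on), and shares literature and study materials of interest.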