有些時(shí)候般哼,很多分析明知道不能做把敢,還不得不去做。
比如蜈亩,只有TPM值情況下的基因差異表達(dá)分析:
ExM # log2TPM 表達(dá)矩陣
s1 # 屬于類型1(如 tumor)的所有樣本ID
s2 # 屬于類型2(如 normal)的所有樣本ID
cat("wilcox.test\n")
pvalue = padj = log2FoldChange = matrix(0, nrow(ExM), 1)
for(i in 1:nrow(ExM)){
pvalue[i, 1] = p.value = wilcox.test(ExM[i, s1], ExM[i, s2])$p.value
log2FoldChange[i, 1] = log2(mean(ExM[i, s1]) /mean(ExM[i, s2]))
}
padj = p.adjust(as.vector(pvalue), "fdr", n = length(pvalue))
rTable = data.frame(log2FoldChange, pvalue, padj, row.names = rownames(ExM))
treatment_Log2TPM <- signif(apply(ExM[rownames(rTable), s1], 1, mean), 4)
control_Log2TPM <- signif(apply(ExM[rownames(rTable), s2], 1, mean), 4)
cat("mark DGE\n")
DGE <- rep("NC", nrow(ExM))
DGE[((rTable$padj) < 0.05) & (rTable$log2FoldChange > 0)] = "UP"
DGE[((rTable$padj) < 0.05) & (rTable$log2FoldChange < 0)] = "DN"
gene = rownames(ExM)
rTable = data.frame(treatment_Log2TPM, control_Log2TPM, rTable[, c("log2FoldChange", "pvalue", "padj")], DGE)
head(rTable)
treatment_Log2TPM control_Log2TPM log2FoldChange
ENSG00000166535.20 A2ML1 1.4870 1.8410 -0.3536611345
ENSG00000175899.15 A2M 14.3500 14.1000 0.2527242657
ENSG00000197953.6 AADACL2 0.1622 0.1491 0.0131439534
ENSG00000204518.2 AADACL4 0.1487 0.1492 -0.0005166819
ENSG00000115977.19 AAK1 9.6250 9.7070 -0.0817361189
ENSG00000127837.9 AAMP 11.2000 11.2000 0.0019474297
pvalue padj DGE
ENSG00000166535.20 A2ML1 0.19797430 0.3930997 NC
ENSG00000175899.15 A2M 0.13120671 0.2997906 NC
ENSG00000197953.6 AADACL2 0.09516201 0.2405597 NC
ENSG00000204518.2 AADACL4 0.52208746 0.7067366 NC
ENSG00000115977.19 AAK1 0.44455824 0.6436331 NC
ENSG00000127837.9 AAMP 0.91096435 0.9563070 NC
代碼來自一篇被刪掉的帖子懦窘,估計(jì)他也覺得不能做
wilcox.test 進(jìn)行 RNAseq 差異表達(dá)基因分析
事實(shí)上,不同處理之間堅(jiān)決不能這樣做稚配。
而有些時(shí)候畅涂,這樣還是能接受的,比如同一個(gè)樣本/處理下的同源基因?qū)χg的差異表達(dá)分析道川。