Original link: Simulated-07-Membership.pdf (ucla.edu)
7. Module membership, intramodular connectivity, and screening for intramodular hub genes
7a. Intramodular connectivity
(In network literature, connectivity is often referred to as "degree".)
The function intramodularConnectivity computes the whole-network connectivity kTotal and the within-module connectivity kWithin, together with
kOut=kTotal-kWithin
kDiff=kWithin-kOut=2*kWithin-kTotal
ADJ1=abs(cor(datExpr,use="p"))^6
Alldegrees1=intramodularConnectivity(ADJ1, colorh1)
head(Alldegrees1)
The reference example gives the following result:
        kTotal  kWithin     kOut    kDiff
Gene1 31.80186 28.37595 3.425906 24.95005
Gene2 28.88249 26.47896 2.403522 24.07544
Gene3 25.38600 23.11852 2.267486 20.85103
Gene4 24.01574 22.12962 1.886122 20.24350
Gene5 24.93663 21.69175 3.244881 18.44687
Gene6 25.91260 23.92613 1.986469 21.93966
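To make the four columns concrete, here is a minimal base-R sketch on a made-up 4-gene adjacency matrix. The matrix values and module labels are hypothetical, and the sketch assumes that self-connections (the diagonal) are not counted; it illustrates the definitions and is not the WGCNA implementation itself.

```r
# Toy symmetric adjacency matrix for 4 genes (values are made up for illustration).
adj <- matrix(c(1.0, 0.8, 0.1, 0.2,
                0.8, 1.0, 0.3, 0.1,
                0.1, 0.3, 1.0, 0.6,
                0.2, 0.1, 0.6, 1.0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("Gene", 1:4), paste0("Gene", 1:4)))
colors <- c("blue", "blue", "brown", "brown")  # hypothetical module labels

diag(adj) <- 0                 # assumption: a gene's connection to itself is ignored
kTotal <- rowSums(adj)         # connectivity to all other genes

# kWithin: sum of connections to genes that carry the same module label
kWithin <- sapply(seq_len(nrow(adj)),
                  function(i) sum(adj[i, colors == colors[i]]))
names(kWithin) <- rownames(adj)

kOut  <- kTotal - kWithin      # connectivity to genes outside the module
kDiff <- kWithin - kOut        # equals 2*kWithin - kTotal

toyDegrees <- data.frame(kTotal, kWithin, kOut, kDiff)
print(toyDegrees)
```

A gene with a large positive kDiff is connected mostly to genes of its own module, which is what one expects of an intramodular hub.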
7b. Relationship between gene significance and intramodular connectivity
Plot gene significance against intramodular connectivity, one panel per module:
colorlevels=unique(colorh1)
sizeGrWindow(9,6)
par(mfrow=c(2,as.integer(0.5+length(colorlevels)/2)))
par(mar = c(4,5,3,1))
for (i in c(1:length(colorlevels)))
{
whichmodule=colorlevels[[i]];
restrict1 = (colorh1==whichmodule);
verboseScatterplot(Alldegrees1$kWithin[restrict1],
GeneSignificance[restrict1], col=colorh1[restrict1],
main=whichmodule,
xlab = "Connectivity", ylab = "Gene Significance", abline = TRUE)
}
The plots show that gene significance and connectivity are more strongly correlated in the green and brown modules.
7c. Generalizing intramodular connectivity to all genes
Intramodular connectivity is defined only for the genes inside a given module. In practice, however, it is very important to know how a gene is connected to other biologically interesting modules. To this end, the authors define a module eigengene-based connectivity measure for each gene as the correlation between the gene's expression and the module eigengene.
Note: gene i does not have to be a member of the brown module for its brown module membership to be defined.
A module membership (MM) value is computed for every module; in earlier work the authors called this quantity kME.
datKME=signedKME(datExpr, datME, outputColumnName="MM.")
# Display the first few rows of the data frame
head(datKME)
In the output, every gene has one value for each module.
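The idea behind module membership can be sketched in base R on simulated data. Everything below is an illustrative assumption: the toy expression matrix, the gene names g1 to g4, and the choice to take the eigengene as the first left singular vector of the scaled module expression, with its sign aligned to average expression. It is not the signedKME code, only the correlation it is described as computing.

```r
set.seed(1)
nSamples <- 20
signal <- rnorm(nSamples)

# Toy expression data: samples in rows, genes in columns.
# g1-g3 follow a shared signal (a hypothetical module); g4 is pure noise.
expr <- cbind(g1 = signal + rnorm(nSamples, sd = 0.3),
              g2 = signal + rnorm(nSamples, sd = 0.3),
              g3 = signal + rnorm(nSamples, sd = 0.3),
              g4 = rnorm(nSamples))
moduleGenes <- c("g1", "g2", "g3")

# Eigengene (assumption): first left singular vector of the scaled module expression
scaledModule <- scale(expr[, moduleGenes])
ME <- svd(scaledModule)$u[, 1]

# Align the sign so the eigengene correlates positively with average expression
if (cor(ME, rowMeans(scaledModule)) < 0) ME <- -ME

# Module membership: correlation of every gene, module member or not, with the eigengene
MM <- cor(expr, ME)[, 1]
print(round(MM, 2))
```

Note that g4 also receives a membership value even though it was not used to build the eigengene, which mirrors the remark that a gene need not belong to a module to have an MM value for it.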
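7d. Finding genes with high gene significance and high intramodular connectivity in modules of interest
Having identified the brown module as highly related to the trait of interest, the next step is to find the genes in the brown module that have both high gene significance and high intramodular connectivity.
The thresholds used here are 0.2 for absolute gene significance and 0.8 for absolute brown module membership.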
FilterGenes= abs(GS1)> .2 & abs(datKME$MM.brown)>.8
table(FilterGenes)
The table counts how many genes pass the filter. To list the names of the selected genes:
dimnames(data.frame(datExpr))[[2]][FilterGenes]
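The filtering logic can be tried on a few made-up values. In this sketch the vectors GS and MMbrown and the gene names a to d are hypothetical stand-ins for GS1 and datKME$MM.brown; only the two thresholds come from the text above.

```r
# Hypothetical gene significance and brown module membership values
GS      <- c(a = 0.35, b = -0.10, c = -0.45, d = 0.25)
MMbrown <- c(a = 0.90, b =  0.95, c = -0.85, d = 0.40)

# Keep genes whose absolute values exceed both thresholds
keep <- abs(GS) > 0.2 & abs(MMbrown) > 0.8

print(table(keep))        # how many genes pass or fail
selected <- names(GS)[keep]
print(selected)           # names of the genes that pass
```

Taking absolute values means that genes negatively related to the trait or to the eigengene (gene c here) are retained as well.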
7e. Relationship between the module membership measures (e.g. MM.turquoise) and intramodular connectivity
sizeGrWindow(8,6)
par(mfrow=c(2,2))
# We choose 4 modules to plot: turquoise, blue, brown, green.
# For simplicity we write the code out explicitly for each module.
which.color="turquoise";
restrictGenes=colorh1==which.color
verboseScatterplot(Alldegrees1$kWithin[ restrictGenes],
(datKME[restrictGenes, paste("MM.", which.color, sep="")])^6,
col=which.color,
xlab="Intramodular Connectivity",
ylab="(Module Membership)^6")
which.color="blue";
restrictGenes=colorh1==which.color
verboseScatterplot(Alldegrees1$kWithin[ restrictGenes],
(datKME[restrictGenes, paste("MM.", which.color, sep="")])^6,
col=which.color,
xlab="Intramodular Connectivity",
ylab="(Module Membership)^6")
which.color="brown";
restrictGenes=colorh1==which.color
verboseScatterplot(Alldegrees1$kWithin[ restrictGenes],
(datKME[restrictGenes, paste("MM.", which.color, sep="")])^6,
col=which.color,
xlab="Intramodular Connectivity",
ylab="(Module Membership)^6")
which.color="green";
restrictGenes=colorh1==which.color
verboseScatterplot(Alldegrees1$kWithin[ restrictGenes],
(datKME[restrictGenes, paste("MM.", which.color, sep="")])^6,
col=which.color,
xlab="Intramodular Connectivity",
ylab="(Module Membership)^6")
The resulting plots show that module membership and intramodular connectivity are highly correlated, and this holds for each of the modules plotted.
7f. Gene screening based on a detailed definition of module membership
The networkScreening function screens genes based on both gene significance and module membership.
NS1=networkScreening(y=y, datME=datME, datExpr=datExpr,
oddPower=3, blockSize=1000, minimumSampleSize=4,
addMEy=TRUE, removeDiag=FALSE, weightESy=0.5)
Next we compare the detailed network screening analysis with the standard screening approach by checking what proportion of the top 100 genes from each method are noise genes.
# network screening analysis
mean(NoiseGeneIndicator[rank(NS1$p.Weighted,ties.method="first")<=100])
# standard analysis based on the correlation p-values (or Student T test)
mean(NoiseGeneIndicator[rank(NS1$p.Standard,ties.method="first")<=100])
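Here the network screening analysis produces a top-100 list containing fewer noise genes. We now compare WGCNA-based screening (based on p.Weighted) with standard screening more systematically, by evaluating the proportion of noise genes in gene lists of several sizes obtained by ranking the p-values.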
topNumbers=c(10,20,50,100)
for (i in c(1:length(topNumbers)) )
{
print(paste("Proportion of noise genes in the top", topNumbers[i], "list"))
WGCNApropNoise=mean(NoiseGeneIndicator[rank(NS1$p.Weighted,ties.method="first")<=topNumbers[i]])
StandardpropNoise=mean(NoiseGeneIndicator[rank(NS1$p.Standard,ties.method="first")<=topNumbers[i]])
print(paste("WGCNA, proportion of noise=", WGCNApropNoise,
", Standard, prop. noise=", StandardpropNoise))
if (WGCNApropNoise< StandardpropNoise) print("WGCNA wins")
if (WGCNApropNoise==StandardpropNoise) print("both methods tie")
if (WGCNApropNoise>StandardpropNoise) print("standard screening wins")
}
We can see that the lists produced by WGCNA screening contain fewer noise genes.
Before moving on to the next step, remove some large objects to free memory:
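The expression rank(p, ties.method="first")<=N used in the comparison selects the N genes with the smallest p-values, and taking the mean of a logical indicator over that selection gives a proportion. A small sketch with invented p-values and an invented noise indicator (both hypothetical, not taken from the simulation):

```r
# Hypothetical p-values and noise labels for six genes
pvals   <- c(0.001, 0.20, 0.03, 0.0005, 0.5, 0.04)
isNoise <- c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE)

topN <- 4
# TRUE for the topN smallest p-values; ties.method = "first" breaks ties
# by position, so exactly topN genes are always selected
inTop <- rank(pvals, ties.method = "first") <= topN

# Proportion of noise genes among the selected genes
propNoise <- mean(isNoise[inTop])
print(propNoise)
```

A lower proportion means the ranking pushes more true signal genes to the top of the list, which is the criterion used to compare the two screening methods.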
rm(dissTOM); collectGarbage()
7g. Comparing the weighted correlation with the standard Pearson correlation
#Form a data frame containing standard and network screening results
CorPrediction1=data.frame(GS1,NS1$cor.Weighted)
cor.Weighted=NS1$cor.Weighted
# Plot the comparison
sizeGrWindow(8, 6)
verboseScatterplot(cor.Weighted, GS1,
main="Network-based weighted correlation versus Pearson correlation\n",
col=truemodule, cex.main = 1.2)
abline(0,1)