一痊焊、簡(jiǎn)介
limma應(yīng)用于RNA-seq數(shù)據(jù)時(shí),read counts被轉(zhuǎn)換為log2-counts-per-million(logCPM)甘萧÷氛伲可以有兩種方式對(duì)均值-方差的關(guān)系(mean-variance relationship)進(jìn)行建模:
(1)voom:精確權(quán)重(precision weights);
(2)limma-trend:經(jīng)驗(yàn)貝葉斯先驗(yàn)趨勢(shì)(empirical Bayes prior trend)栋烤。
在這兩種情況下,RNA-seq數(shù)據(jù)可以被作為微陣列數(shù)據(jù)(microarray data)進(jìn)行分析挺狰。這意味著limma包中的任何線(xiàn)性建拿鞴或基因集測(cè)試方法都可以應(yīng)用于RNA-seq數(shù)據(jù)买窟。
limma使用線(xiàn)性模型(linear models)的方法來(lái)分析設(shè)計(jì)的微陣列實(shí)驗(yàn)。該方法需要指定一個(gè)或兩個(gè)矩陣:
(1)第一種是設(shè)計(jì)矩陣(design matrix)薯定,它表明了每個(gè)陣列使用了哪些RNA樣本始绍。
(2)第二種是對(duì)比矩陣(contrast matrix),它指定了希望在RNA樣本之間進(jìn)行哪些比較话侄。
對(duì)于非常簡(jiǎn)單的實(shí)驗(yàn)亏推,可能不需要指定對(duì)比矩陣。
對(duì)于統(tǒng)計(jì)分析和評(píng)估差異表達(dá)年堆,limma使用了一種經(jīng)驗(yàn)貝葉斯方法(empirical Bayes method)來(lái)調(diào)整估計(jì)的倍數(shù)變化的標(biāo)準(zhǔn)誤差(moderate the standard errors of the estimated log-fold changes)吞杭。這導(dǎo)致了更穩(wěn)定的推理和更高的功效,特別是對(duì)于樣本量少的實(shí)驗(yàn)变丧。
基本過(guò)程:
- RNA-seq數(shù)據(jù)制作計(jì)數(shù)矩陣芽狗、樣本分組信息、基因注釋信息痒蓬。
- 構(gòu)建DGEList對(duì)象童擎,過(guò)濾filterByExpr(),TMM標(biāo)準(zhǔn)化calcNormFactors()攻晒。
- 準(zhǔn)備設(shè)計(jì)矩陣model.matrix()和對(duì)比矩陣makeContrasts()顾复。
- cpm()過(guò)濾和標(biāo)準(zhǔn)化后的數(shù)據(jù)轉(zhuǎn)換為logCPM。
- voom()轉(zhuǎn)換過(guò)濾和標(biāo)準(zhǔn)化后的數(shù)據(jù)鲁捏。eBayes()和treat()不需要trend=T芯砸。
- lmFit()通過(guò)為每一個(gè)基因擬合線(xiàn)性模型來(lái)估計(jì)fold changes和standard errors。
- contrasts.fit()將原始模型重新定向到對(duì)比模型碴萧,得到新的系數(shù)和標(biāo)準(zhǔn)誤差乙嘀。
- eBayes()使用trend=TRUE對(duì)標(biāo)準(zhǔn)誤差進(jìn)行經(jīng)驗(yàn)貝葉斯平滑,計(jì)算每個(gè)對(duì)比中每個(gè)基因的moderated t-statistic和log-odds破喻。
- treat()使用trend=TRUE對(duì)log-fold-changes大于某一非零閾值進(jìn)行檢驗(yàn)虎谢。
- topTable()給出一個(gè)最有可能在給定對(duì)比下差異表達(dá)的基因列表。
- topTableF()給出一個(gè)最有可能在給定的一組對(duì)比中差異表達(dá)的基因列表曹质。
- decideTests()展示所有對(duì)比結(jié)果婴噩。
二、RNA-seq數(shù)據(jù)的前處理
(1)制作計(jì)數(shù)矩陣
> head(countdata,3)
WTF1 WTF2 WTF3 WTM1 WTM2 WTM3 MF1 MF2 MF3 MF4 MF5 MMF1 MMF2
gene17427 0 46 210 0 26 0 40 0 0 0 0 0 0
gene17428 50 322 334 0 345 384 284 199 218 11 1 2 1
gene17429 0 0 0 6 0 0 0 0 0 0 5 0 0
(2)樣本分組信息
> Group
[1] WTF WTF WTF WTM WTM WTM MF MF MF MF MF MMF MMF
Levels: WTF WTM MF MMF
(3)基因注釋信息
> head(anno)
chromosome ref_gene_name
gene17427 4 CR32011
gene17428 4 CR32010
gene17429 4 CR32009
gene17490 4 CR44028
gene17501 4 CR33797
gene17507 4 CR43361
(4)使用edgeR羽德,將計(jì)數(shù)矩陣轉(zhuǎn)換為DGEList對(duì)象
> y <- DGEList(counts=countdata)
> y
An object of class "DGEList"
$counts
WTF1 WTF2 WTF3 WTM1 WTM2 WTM3 MF1 MF2 MF3 MF4 MF5 MMF1 MMF2
gene17427 0 46 210 0 26 0 40 0 0 0 0 0 0
gene17428 50 322 334 0 345 384 284 199 218 11 1 2 1
gene17429 0 0 0 6 0 0 0 0 0 0 5 0 0
gene17490 4 0 3 0 0 0 0 3 0 0 0 0 0
gene17501 2 2 3 1 0 3 4 0 3 0 0 0 0
34367 more rows ...
$samples
group lib.size norm.factors
WTF1 1 26485372 1
WTF2 1 21708533 1
WTF3 1 26038158 1
WTM1 1 14282144 1
WTM2 1 27198208 1
8 more rows ...
> y$samples$group <- Group #加入分組信息
> y$genes <- anno #加入基因注釋
(5)移除表達(dá)量非常低的基因
> #保留在至少在2個(gè)樣品中表達(dá)量大于min.CPM的基因
> keep <- rowSums(cpm(y)>1) >= 2
> yf <- y[keep,,keep.lib.sizes=FALSE] #重置了庫(kù)的大小lib.size
> y$samples$lib.size - yf$samples$lib.size
[1] 26499 39607 80456 25102 125345 105390 63741 24067 46952
[10] 37984 44858 35573 34754
> nrow(y) - nrow(yf) #減少的基因數(shù)
[1] 11844
> #另一種過(guò)濾方法filterByExpr
> keep <- filterByExpr(y,group=Group)
> yf <- y[keep,,keep.lib.sizes=FALSE]
> y$samples$lib.size - yf$samples$lib.size
[1] 11532 22890 58275 10496 88993 68069 35694 10642 24187 14905
[11] 22093 16030 19544
> nrow(y) - nrow(yf)
[1] 9521
用法:
filterByExpr(y, design = NULL, group = NULL, lib.size = NULL, ...)
filterByExpr(y, design = NULL, group = NULL, lib.size = NULL, min.count = 10, min.total.count = 15, large.n = 10, min.prop = 0.7, ...)
- y:DGEList對(duì)象
- design:設(shè)計(jì)矩陣几莽,如果group不等于NULL時(shí)design會(huì)被忽略
- group:表示分組的向量或因子
- lib.size:默認(rèn)為colSums(y)
- min.count:選取某基因表達(dá)量最小的樣本,這個(gè)樣品的最低計(jì)數(shù)要求宅静。
- min.total.count:最小的總計(jì)數(shù)要求章蚣。
- large.n:每一個(gè)group中,樣本數(shù)多少會(huì)被認(rèn)為是大樣本量姨夹。
- min.prop:最小分組中樣本數(shù)量的比例纤垂。
該方法保留在n個(gè)樣本中有大于k的CPM的基因矾策。k由min.count 和lib.size決定,n由設(shè)計(jì)矩陣決定峭沦。n是最小的分組的樣本量贾虽。如果所有的分組樣本量都大于large.n,這就會(huì)稍微放松吼鱼,但是n總是大于最小分組的min.prop(70%)蓬豁。此外,每個(gè)保留的基因都需要在所有樣本中至少有min.total.count的計(jì)數(shù)菇肃。
(6)計(jì)算CPM和logCPM
> #cpmyf <- cpm(yf)
> lcpmyf <- cpm(yf,log=T)
用法:
cpm(y,normalized.lib.sizes=T,log=F,prior.count=2)
- normalized.lib.sizes=T:使用標(biāo)準(zhǔn)化后的library sizes地粪。
effective library size = y$samples$lib.size * y$samples$norm.factors
- log=T:計(jì)算log2CPM
- prior.count=2:取對(duì)數(shù)前在每個(gè)觀測(cè)值上加一個(gè)數(shù),避免對(duì)0取對(duì)數(shù)巷送。這里使用prior count可以降低low counts取對(duì)數(shù)的方差驶忌。
- rpkm():會(huì)尋找
y$genes$Length或length
(7)箱線(xiàn)圖
> col.group <- Group
> levels(col.group) <- brewer.pal(nlevels(col.group),"Set1")
> col.group <- as.character(col.group)
> col.group
[1] "#E41A1C" "#E41A1C" "#E41A1C" "#377EB8" "#377EB8" "#377EB8"
[7] "#4DAF4A" "#4DAF4A" "#4DAF4A" "#4DAF4A" "#4DAF4A" "#984EA3"
[13] "#984EA3"
> boxplot(lcpmyf,las=2,col=col.group)
(8)TMM標(biāo)準(zhǔn)化(trimmed mean of M-values)
> yf <- calcNormFactors(yf,method = "TMM") #計(jì)算歸一化系數(shù)
> #標(biāo)準(zhǔn)化后raw read counts值并沒(méi)改變,只有norm.factors變了
> yf$samples$norm.factors
[1] 0.9027210 1.3505878 1.3102042 0.9202737 1.3636074 1.4155932
[7] 0.9726943 1.0979219 0.9623963 0.9108200 0.7308098 0.7473150
[13] 0.6892828
> ##獲得標(biāo)準(zhǔn)化后的數(shù)據(jù)
> #cpmyf <- cpm(yf) #已經(jīng)考慮歸一化系數(shù)了笑跛,等于cpmyf/norm.factors
> #t(t(yf$counts)/yf$samples$lib.size/yf$samples$norm.factors)*1000000
> lcpmyf <- cpm(yf,log=T)
> boxplot(lcpmyf,las=2,col=col.group)
三付魔、設(shè)計(jì)矩陣和對(duì)比矩陣
需要兩種矩陣:
- 設(shè)計(jì)矩陣(design matrix),表明了每個(gè)陣列使用了哪些RNA樣本飞蹂。
- 對(duì)比矩陣(contrast matrix)几苍,指定了希望在RNA樣本之間進(jìn)行哪些比較。對(duì)于非常簡(jiǎn)單的實(shí)驗(yàn)陈哑,可能不需要指定對(duì)比矩陣妻坝。
有兩種方法創(chuàng)建設(shè)計(jì)矩陣:
- create a design matrix which includes a coefficient for the mutant vs wild type difference, 將對(duì)照組作為截距,有一個(gè)系數(shù)表示實(shí)驗(yàn)組與對(duì)照組的差異惊窖。
- create a design matrix which includes separate coefficients for wild type and mutant mice and then extract the difference as a contrast.截距為0刽宪,對(duì)照組和實(shí)驗(yàn)組分別有自己的系數(shù)。
(1)對(duì)照組作為截距
> Group
[1] WTF WTF WTF WTM WTM WTM MF MF MF MF MF MMF MMF
Levels: WTF WTM MF MMF
> design <- model.matrix(~Group)
> colnames(design) <- c("WTF","WTMvsWTF","MFvsWTF","MMFvsWTF")
> design
WTF WTMvsWTF MFvsWTF MMFvsWTF
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 1 1 0 0
5 1 1 0 0
6 1 1 0 0
7 1 0 1 0
8 1 0 1 0
9 1 0 1 0
10 1 0 1 0
11 1 0 1 0
12 1 0 0 1
13 1 0 0 1
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Group
[1] "contr.treatment"
(2)截距為0界酒,對(duì)照組和實(shí)驗(yàn)組分別有自己的系數(shù)
> design <- model.matrix(~0+Group)
> colnames(design) <- c("WTF","WTM","MF","MMF")
> design
WTF WTM MF MMF
1 1 0 0 0
2 1 0 0 0
3 1 0 0 0
4 0 1 0 0
5 0 1 0 0
6 0 1 0 0
7 0 0 1 0
8 0 0 1 0
9 0 0 1 0
10 0 0 1 0
11 0 0 1 0
12 0 0 0 1
13 0 0 0 1
attr(,"assign")
[1] 1 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Group
[1] "contr.treatment"
(3)對(duì)比矩陣
> cont.matrix <- makeContrasts(WTMvsWTF = WTM-WTF, MFvsWTF = MF-WTF,
+ MMFvsWTF = MMF-WTF, MMFvsMF = MMF-MF,
+ levels = colnames(design))
> cont.matrix
Contrasts
Levels WTMvsWTF MFvsWTF MMFvsWTF MMFvsMF
WTF -1 -1 -1 0
WTM 1 0 0 0
MF 0 1 0 -1
MMF 0 0 1 1
用法:
makeContrasts(..., contrasts=NULL, levels)
- contrasts:指定對(duì)比的字符型向量圣拄。
- levels:字符向量或因子,給出所需對(duì)比的參數(shù)的名稱(chēng)names毁欣,或設(shè)計(jì)矩陣等以參數(shù)名稱(chēng)為列名的對(duì)象庇谆。參數(shù)通常是線(xiàn)性模型擬合的系數(shù)。
- 此函數(shù)的輸出通常用作contrasts.fit的輸入凭疮。
> makeContrasts(B-A,C-B,C-A,levels=c("A","B","C"))
Contrasts
Levels B - A C - B C - A
A -1 0 -1
B 1 -1 0
C 0 1 1
(4)成對(duì)樣本的比較
成對(duì)樣本的比較出現(xiàn)在我們比較兩種處理?xiàng)l件饭耳,并且每個(gè)樣本給予一個(gè)處理自然地與一個(gè)給予另一種處理的特定的樣本配對(duì)的情況。與這種情況相關(guān)的經(jīng)典檢驗(yàn)是配對(duì)t檢驗(yàn)执解。Paired samples occur when we compare two treatments and each sample given one treatment is naturally paired with a particular sample given the other treatment. This is a special case of blocking with blocks of size two. The classical test associated with this situation is the paired t-test.
上述對(duì)于成對(duì)樣本的比較也可以用于存在batch effects或?qū)嶒?yàn)從不同的blocks進(jìn)行的情況寞肖。處理?xiàng)l件僅在每一個(gè)模塊內(nèi)進(jìn)行比較。
The above approach used for paired samples can be applied in any situation where there are batch effects or where the experiment has been conducted in blocks. The treatments can be adjusted for differences between the blocks by using a model formula of the form:
> design <- model.matrix(~Block+Treatment)
In this type of analysis, the treatments are compared only within each block.
四、差異表達(dá)分析
limma最初設(shè)計(jì)用于分析微陣列數(shù)據(jù)新蟆,但目前已擴(kuò)展到RNA-seq數(shù)據(jù)耕姊、DNA甲基化等。limma應(yīng)用于RNA-seq數(shù)據(jù)栅葡,也是基于raw count的定量方式,但是它并不提供歸一化算法尤泽,在官方手冊(cè)中推薦采用edgeR的TMM歸一化算法欣簇。read counts被轉(zhuǎn)換為log2-counts-per-million(logCPM),并假設(shè)log-CPM值符合正態(tài)分布坯约⌒苎剩可以有兩種方式調(diào)整均值-方差的關(guān)系(mean-variance relationship),從而對(duì)log-CPM值進(jìn)行線(xiàn)性建模:
- voom:精確權(quán)重(precision weights)闹丐;
- limma-trend:經(jīng)驗(yàn)貝葉斯先驗(yàn)趨勢(shì)(empirical Bayes prior trend)横殴。
在這兩種情況下,RNA-seq數(shù)據(jù)可以被作為微陣列數(shù)據(jù)(microarray data)進(jìn)行分析卿拴。這意味著limma包中的任何線(xiàn)性建纳缆兀或基因集測(cè)試方法都可以應(yīng)用于RNA-seq數(shù)據(jù)。
如果測(cè)序深度在RNA樣本之間差異不大(最大library size和最小library size相差3倍以?xún)?nèi))堕花,那么使用limma-trend是最簡(jiǎn)單和最穩(wěn)健的方法文狱。將計(jì)數(shù)數(shù)據(jù)轉(zhuǎn)化為logCPM,然后logCPM值可以在任何標(biāo)準(zhǔn)的limma pipeline中進(jìn)行分析缘挽。在運(yùn)行eBayes或treat時(shí)使用trend=TRUE瞄崇。
當(dāng)library sizes在樣本之間差距較大時(shí),voom方法在理論上比limma-trend更有效壕曼。
(1)limma-trend
- 數(shù)據(jù)轉(zhuǎn)換為logCPM
> lcpmyf <- cpm(yf,log=T)
- 設(shè)計(jì)矩陣
> design <- model.matrix(~0+Group)
> colnames(design) <- gsub("Group","",colnames(design))
> rownames(design) <- colnames(yf)
- 對(duì)比矩陣
> cont.matrix <- makeContrasts(WTMvsWTF = WTM-WTF, MFvsWTF = MF-WTF,
+ MMFvsWTF = MMF-WTF, MMFvsMF = MMF-MF,
+ levels = colnames(design))
> cont.matrix
Contrasts
Levels WTMvsWTF MFvsWTF MMFvsWTF MMFvsMF
WTF -1 -1 -1 0
WTM 1 0 0 0
MF 0 1 0 -1
MMF 0 0 1 1
- 擬合線(xiàn)性模型lmFit
> fit <- lmFit(lcpmyf, design)
- 轉(zhuǎn)換為對(duì)比模型contrasts.fit
> fit2 <- contrasts.fit(fit, cont.matrix)
- 經(jīng)驗(yàn)貝葉斯平滑eBayes
> fit3 <- eBayes(fit2, trend=TRUE)
- 提取某組比較的結(jié)果
> topTable(fit3, coef=ncol(design))
logFC AveExpr t P.Value adj.P.Val B
rna23260 9.733111 -2.401474 18.96688 6.770586e-10 1.525278e-05 7.506843
rna4289 8.671071 -2.564865 17.44008 1.691543e-09 1.905354e-05 7.248625
rna6574 7.687889 -2.716123 15.91767 4.555986e-09 2.608897e-05 6.933027
- 或者treat在基因排序中給予fold-changes更多權(quán)重
> fit4 <- treat(fit2, lfc=log2(1.2),trend=TRUE)
- 提取有fold-changes閾值的某組比較的結(jié)果
> topTreat(fit4, coef=ncol(design))
logFC AveExpr t P.Value adj.P.Val
rna23260 9.733111 -2.40147385 21.44573 2.000093e-10 4.505809e-06
rna4289 8.671071 -2.56486452 19.03239 7.024639e-10 7.912553e-06
rna34624 8.412221 -2.49050322 17.32005 1.908439e-09 1.433110e-05
- 提取所有比較結(jié)果
> dt <- decideTests(fit3)
> head(dt,3)
TestResults matrix
Contrasts
WTMvsWTF MFvsWTF MMFvsWTF MMFvsMF
gene17427 0 0 0 0
gene17428 0 0 0 0
gene17507 0 0 0 0
> summary(dt)
WTMvsWTF MFvsWTF MMFvsWTF MMFvsMF
Down 90 14 23 16
NotSig 20700 22477 22452 22481
Up 1738 37 53 31
- 保存為文件
> write.fit(fit3, dt, file="results.txt")
(2)voom
- voom轉(zhuǎn)換應(yīng)用于過(guò)濾后并標(biāo)準(zhǔn)化的DGEList對(duì)象
> v <- voom(yf, design, plot=TRUE)
> v #EList對(duì)象
- 在此之后苏研,可以使用通常的limma差異表達(dá)分析工作流程
> fit <- lmFit(v, design)
> fit2 <- contrasts.fit(fit, cont.matrix)
> fit3 <- eBayes(fit2)
- 繪制log2殘差標(biāo)準(zhǔn)差與log-CPM均值的關(guān)系
> plotSA(fit3, main="Final model: Mean-variance trend")
- 提取結(jié)果
> restop <- topTable(fit3, coef=ncol(design), n=Inf)
五、limma用法
(1)voom——Transform RNA-Seq Data Ready for Linear Modelling
voom(counts, design = NULL, lib.size = NULL, normalize.method = "none", block = NULL, correlation = NULL, weights = NULL, span = 0.5, plot = FALSE, save.plot = FALSE)
- counts:是一個(gè)包含raw counts的數(shù)值矩陣腮郊、或ExpressionSet摹蘑、DGEList對(duì)象。counts必須是非負(fù)數(shù)伴榔,并且不含有NAs纹蝴。
- design:設(shè)計(jì)矩陣。
- lib.size:每一個(gè)樣本庫(kù)的大小踪少。默認(rèn)為DGEList對(duì)象的normalized (effective) library sizes塘安。
-
plot:是否繪制mean-variance trend圖。
voom用法
(2)lmFit
lmFit(object, design=NULL, ndups=1, spacing=1, block=NULL, correlation, weights=NULL, method="ls", ...)
- object:一種類(lèi)似于矩陣的數(shù)據(jù)對(duì)象援奢,其中包含log-expression values兼犯。
- design:設(shè)計(jì)矩陣。
- method:擬合方法,"ls"表示最小二乘切黔,"robust"表示穩(wěn)健回歸砸脊。
(3)contrasts.fit
contrasts.fit(fit, contrasts=NULL, coefficients=NULL)
- fit:是lmFit的輸出。包含coefficients和stdev.unscaled纬霞。
- contrasts:數(shù)值矩陣凌埂,行對(duì)應(yīng)于擬合中的系數(shù),列包含對(duì)比诗芜。如果只有一個(gè)對(duì)比瞳抓,可能是一個(gè)向量》郑可以是makeContrasts得到的對(duì)比矩陣孩哑。
- coefficients:向量,指示哪些系數(shù)將保留在修改后的擬合對(duì)象中翠桦。 指定對(duì)比的另一種方法横蜒。
- contrasts.fit之后會(huì)得到coefficients、stdev.unscaled销凑、cov.coefficients丛晌,fit里面大多數(shù)組成成分會(huì)被保留,但是t闻鉴、p.value茵乱、lods、F孟岛、F.p.value會(huì)被移除瓶竭。
(4)eBayes
eBayes(fit, proportion = 0.01, stdev.coef.lim = c(0.1,4), trend = FALSE, robust = FALSE, winsor.tail.p = c(0.05,0.1))
treat(fit, lfc = log2(1.2), trend = FALSE, robust = FALSE, winsor.tail.p = c(0.05,0.1))
- 給定一個(gè)微陣列線(xiàn)性模型擬合,計(jì)算moderated t-statistics渠羞、moderated F-statistic和差異表達(dá)的logodds斤贰,通過(guò)經(jīng)驗(yàn)Bayes將標(biāo)準(zhǔn)誤差調(diào)整到一個(gè)共同的值。
- fit:lmFit或contrasts.fit得到的擬合模型次询。
- proportion:0-1之間的數(shù)荧恍,差異表達(dá)基因的假定比例。
- trend:對(duì)于先驗(yàn)方差是否允許一個(gè)intensity-trend屯吊。默認(rèn)先驗(yàn)方差是常數(shù)送巡。
- robust:df.prior和var.prior的估計(jì)是否穩(wěn)健地反對(duì)離群樣本方差(be robustified against outlier sample variances)。
- lfc:被認(rèn)為具有科學(xué)意義的最小log2倍數(shù)變化盒卸。
(5)toptable
topTable(fit, coef=NULL, number=10, genelist=fit
genes, adjust.method="BH", sort.by="F", p.value=1, lfc=0)
topTreat(fit, coef=1, sort.by="p", resort.by=NULL, ...)
- fit:由lmFit和eBayes/treat產(chǎn)生骗爆。
- coef:列號(hào)或列名,指定線(xiàn)性模型的哪個(gè)系數(shù)或?qū)Ρ仁歉信d趣的蔽介。對(duì)于topTable摘投,也可以是列下標(biāo)的向量煮寡,在這種情況下,基因排序是按照這組對(duì)比的F-statistic犀呼。
- number:列出的基因的最大數(shù)量幸撕。
- sort.by:按照什么對(duì)基因進(jìn)行排序。對(duì)于topTable可以是"logFC"/"M", "AveExpr"/"A"/"Amean", "t"/"T", "P"/"p", "B" or "none"外臂。對(duì)于topTableF可以是"F" or "none"坐儿。對(duì)于topTreat,與topTable相同宋光,除了"B"挑童。
- p.value:校正后的p值的閾值,只有小于這個(gè)值的基因才會(huì)被列出跃须。
- lfc:要求的最小absolute log2-fold-change。topTable和topTableF只列出log-fold-changes大于設(shè)置的lfc值的基因娃兽。在topTreat這個(gè)argument不可用菇民,不會(huì)去除基因,但根據(jù)它們的對(duì)數(shù)變化超過(guò)lfc的證據(jù)對(duì)基因進(jìn)行排序投储。
- confint:是否輸出logFC的95%置信區(qū)間第练。或者在0-1之間取值作為置信度玛荞。
(6)decideTests
decideTests(object, method = "separate", adjust.method = "BH", p.value = 0.05, lfc = 0, ...)
- object:一個(gè)p值的數(shù)值矩陣娇掏,或eBayes和treat產(chǎn)生的對(duì)象,從中可以提取p值和t統(tǒng)計(jì)量勋眯。
- method:指定如何在多個(gè)測(cè)試方案中組合基因和對(duì)比婴梧。可以是"separate", "global", "hierarchical" or "nestedF"客蹋。
- method="separate"將對(duì)p值的每一列分別應(yīng)用多重測(cè)試調(diào)整塞蹭。設(shè)置method="separate"等價(jià)于對(duì)線(xiàn)性模型擬合中的每個(gè)系數(shù)分別使用topTable的方法。method="global"將把t統(tǒng)計(jì)量的整個(gè)矩陣看作一個(gè)不相關(guān)檢驗(yàn)的單一向量讶坯。當(dāng)需要在所有對(duì)比中有相同的t統(tǒng)計(jì)量閾值時(shí)番电,method="global"是有用的。
- adjust.method:可以是"none", "BH", "fdr" (equivalent to "BH"), "BY" and "holm"辆琅。
- p.value:0-1的數(shù)值漱办,給出要求的family-wise error rate or false discovery rate。
- lfc:要求的最小absolute log2-fold-change婉烟。
- 結(jié)果用-1,0,1表示significantly negative, not significant, significantly positive娩井。
(7)write.fit
write.fit(fit, results = NULL, file, digits = NULL, adjust = "none", method = "separate", F.adjust = "none", quote = FALSE, sep = "\t", row.names = TRUE, ...)
This function writes a tab-delimited text file containing for each gene (1) the average log2-intensity, (2) the coefficients or contrasts (log2-fold-changes), (3) moderated t-statistics, (4) t-statistic P-values, (5) F-statistic if available, (6) F-statistic P-values if available, (7) classification if available and (8) gene names and annotation.
> dt <- decideTests(fit3)
> write.fit(fit3, dt, file="resultsvoom.txt")
> restxt <- read.table("resultsvoom.txt")
> colnames(restxt)
[1] "A" "Coef.WTMvsWTF"
[3] "Coef.MFvsWTF" "Coef.MMFvsWTF"
[5] "Coef.MMFvsMF" "t.WTMvsWTF"
[7] "t.MFvsWTF" "t.MMFvsWTF"
[9] "t.MMFvsMF" "p.value.WTMvsWTF"
[11] "p.value.MFvsWTF" "p.value.MMFvsWTF"
[13] "p.value.MMFvsMF" "F"
[15] "F.p.value" "Res.WTMvsWTF"
[17] "Res.MFvsWTF" "Res.MMFvsWTF"
[19] "Res.MMFvsMF" "Genes.chromosome"
[21] "Genes.ref_gene_name"