Installation
HOMER [installation guide]http://homer.ucsd.edu/homer/introduction/install.html
Install with conda:
conda info -e # list the existing environments first
conda create -n chipseq # create the chipseq environment
conda activate chipseq # switch to that environment
conda install -c bioconda homer # install directly with conda, or follow the installation guide linked above
Data preparation
Save the up-regulated or down-regulated differentially expressed genes as a txt file, with the gene name in the first column.
Gene identifier types that HOMER accepts:
NCBI Entrez Gene IDs
NCBI Unigene IDs
NCBI Refseq IDs (mRNA, protein)
Ensembl Gene IDs
Gene Symbols (i.e. official gene names, like "Nfkb1")
Popular Affymetrix probe IDs (MOE430, U133plus, U95, U75A)
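If the differential expression results are in a tab-separated table, the gene list can be cut out with awk. The following is only a sketch: the file name deg_results.tsv, its column layout (gene, log2FoldChange, padj) and the thresholds are assumptions made for illustration, so adjust them to your own table.

```shell
# Hypothetical DE table, created here only so the example is self-contained.
printf 'gene\tlog2FoldChange\tpadj\n'  > deg_results.tsv
printf 'Nfkb1\t2.1\t0.001\n'          >> deg_results.tsv
printf 'Stat3\t-1.8\t0.004\n'         >> deg_results.tsv
printf 'Irf8\t1.5\t0.2\n'             >> deg_results.tsv
printf 'Spi1\t3.0\t0.0001\n'          >> deg_results.tsv

# Up-regulated: log2FC > 1 and padj < 0.05 (assumed cutoffs); skip the header, one gene per line.
awk -F'\t' 'NR > 1 && $2 > 1 && $3 < 0.05 { print $1 }' deg_results.tsv > upregulated_gene.txt

# Down-regulated: log2FC < -1 and padj < 0.05.
awk -F'\t' 'NR > 1 && $2 < -1 && $3 < 0.05 { print $1 }' deg_results.tsv > downregulated_gene.txt
```

Each output file then has one gene symbol per line, which is the form findMotifs.pl expects.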
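Running findMotifs.pl
[Reference]http://homer.ucsd.edu/homer/microarray/index.html
findMotifs.pl takes three inputs: (1) the gene list from the previous step (a txt file with one gene per line), (2) the organism name (mouse/human), and (3) the name of the output directory (created automatically in the current directory when the program runs).
# Run
# cd to the directory that contains upregulated_gene.txt, then run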
findMotifs.pl upregulated_gene.txt mouse up_homer_res/ -start -400 -end 100 -len 8,10 -p 4
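# This will search for motifs of length 8 and 10 from -400 to +100 relative to the TSS, using 4 threads (i.e. 4 CPUs).
# The options -start -400 -end 100 -len 8,10 -p 4 can be omitted, but note that they are not HOMER's defaults:
# according to the HOMER documentation the defaults are -start -300 -end 50 -len 8,10,12 on a single CPU.
# The main results are in the HTML files (homerResults.html and knownResults.html)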
Writing a summary function in R
[Reference]https://blog.csdn.net/jaychouwong/article/details/119827776
# Here we mainly look at the homerResults folder, which holds the predicted potential TF-binding motifs
# Split each string on `sep` and return piece `idx`;
# with sep = NA, return the characters at positions `idx` instead.
subString <- function(strings, idx, sep = NA){
  strings = as.character(strings)
  if(is.na(sep)){
    res = as.character(lapply(strings, function(x) paste(strsplit(x, "")[[1]][idx], collapse = "")))
  } else{
    res = sapply(strsplit(strings, sep), function(x) x[idx])
  }
  return(res)
}
# Summarise the de novo motifs in <outFolder>/homerResults as a data frame.
summaryHomer <- function(outFolder){
  homerFolder = paste0(outFolder, "/homerResults")
  xFiles = list.files(homerFolder, "\\.motif$")
  # Drop the "similar" and reverse-complement ("RV") files.
  # grepl is used because x[-grep(...)] returns an empty vector when nothing matches.
  xFiles = xFiles[!grepl("similar|RV", xFiles)]
  xFiles = xFiles[order(as.numeric(gsub("\\.", "", gsub("motif", "", xFiles))))]
  # Only the first (header) line of each motif file is needed
  texts = lapply(paste0(homerFolder, "/", xFiles), readLines, n = 1)
  chunks = lapply(texts, function(x) strsplit(x, "\t")[[1]])
  motif = sapply(chunks, function(x) subString(x[1], 2, ">"))
  match = sapply(chunks, function(x) subString(subString(x[2], 2, "BestGuess:"), 1, "/"))
  score = sapply(chunks, function(x) rev(strsplit(x[2], "[()]")[[1]])[1])
  count = sapply(chunks, function(x) subString(x[6], 3, "[T:()]"))
  ratio = sapply(chunks, function(x) subString(x[6], 2, "[()]"))
  p_value = sapply(chunks, function(x) subString(x[6], 2, "P:"))
  xresT = data.frame(motif,
                     match,
                     score = as.numeric(score),
                     count = as.numeric(count),
                     ratio_perc = as.numeric(gsub("%", "", ratio)),
                     p_value = as.numeric(p_value),
                     stringsAsFactors = FALSE
  )
  # Row names are the motif file names without the .motif extension
  rownames(xresT) = sub("\\.motif$", "", xFiles)
  return(xresT)
}
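upregulated_gene_res <- summaryHomer('~/project/homer/up_homer_res') # only the HOMER output directory has to be supplied
The columns of the returned data frame:
motif: consensus sequence of the predicted motif (plus strand)
match: the best-matching known TF
score: similarity to the known motif; 1 is a perfect match
count: number of input genes that contain the motif
ratio (column ratio_perc): percentage of all input genes that contain the motif
p_value: enrichment p-value (smaller means more significant)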
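For a quick look without R, the same header fields can be pulled straight from the shell. This is a rough sketch, not part of HOMER: the demo_res directory and the header line written into it are illustrative stand-ins for a real output folder, laid out the way the R function above expects (tab-separated, with the best guess in field 2 and the statistics in field 6).

```shell
# Illustrative motif file; in practice point the loop at your own <output>/homerResults.
mkdir -p demo_res/homerResults
printf '>ASTTCCTCTT\t1-ASTTCCTCTT,BestGuess:PU.1(ETS)/ThioMac-PU.1-ChIP-Seq(GSE21512)/Homer(0.948)\t8.059752\t-23791.535714\t0\tT:17311.0(44.36%%),B:2181.5(5.80%%),P:1e-10317\n' \
  > demo_res/homerResults/motif1.motif

# For each de novo motif file, print: consensus, best-guess TF, p-value (tab-separated).
for f in demo_res/homerResults/motif[0-9]*.motif; do
  # Skip the "similar" and reverse-complement files, as the R function does.
  case "$f" in *similar*|*RV*) continue ;; esac
  awk -F'\t' 'NR == 1 {
    motif = substr($1, 2)                     # drop the leading ">"
    split($2, a, "BestGuess:"); split(a[2], b, "/")
    n = split($6, c, "P:")                    # p-value is the text after "P:"
    print motif "\t" b[1] "\t" c[n]
  }' "$f"
done > motif_summary.tsv
```

The p-value is kept as text here. That matters for very strong motifs: a value far below 1e-300 becomes 0 once R's as.numeric converts it, so a p_value of 0 in the data frame should be read as "too small to represent" rather than as exactly zero.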
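Downloading HOMER packages
[Link]http://homer.ucsd.edu/homer/introduction/configure.html
findMotifs.pl needs an organism to be specified for motif analysis; this worked above because the promoter data package had already been downloaded. HOMER is installed under ~/programm/homer. configureHomer.pl is the script that installs HOMER and its data packages, and it can download the HOMER software and all data packages (here it has already been downloaded and sits in the HOMER installation directory). [configureHomer.pl]http://homer.ucsd.edu/homer/configureHomer.pl
# cd to the HOMER installation directory before running the following
perl ~/programm/homer/configureHomer.pl -list # list the data packages available for download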
#To install packages, simply use the -install option and the name(s) of the package(s).
perl ~/programm/homer/configureHomer.pl -install mouse-p # download the mouse promoter set
perl ~/programm/homer/configureHomer.pl -install mm8 # download the mm8 version of the mouse genome
perl ~/programm/homer/configureHomer.pl -install hg19 # download the hg19 version of the human genome
#Updating Homer. To update Homer, simply type:
perl ~/programm/homer/configureHomer.pl -update
#Or, alternatively you can simply force the reinstallation of the basic software...
perl ~/programm/homer/configureHomer.pl -install homer # run configureHomer.pl from the directory where HOMER is to be installed
# Add PATH=$PATH:/Users/chucknorris/homer/bin/ to ~/.bashrc (substitute the bin/ directory of your own HOMER installation)
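One way to add the PATH entry without duplicating it on repeated runs is sketched below. It assumes the install location used in this post (~/programm/homer) and a bash login setup that reads ~/.bashrc; both are assumptions to adapt.

```shell
# Line to add; single quotes keep $PATH and $HOME unexpanded until the shell starts.
line='export PATH="$PATH:$HOME/programm/homer/bin"'
rc="$HOME/.bashrc"

# Append only if the exact line is not already present.
grep -qxF "$line" "$rc" 2>/dev/null || printf '%s\n' "$line" >> "$rc"
```

After opening a new shell (or running source ~/.bashrc), command -v findMotifs.pl should print the script's path if the directory is correct.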