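There are quite a few ways to handle replicates when calling peaks on epigenomic data. Here is a brief record of two of them.
1 Call peaks first, then take the intersection of the peaks
You can use IDR to identify peaks that are consistent between replicates, then combine the peaks with bedtools intersect.
For installing idr, see the reference link.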
The IDR (Irreproducible Discovery Rate) framework is a unified approach to measure the reproducibility of findings identified from replicate experiments and provide highly stable thresholds based on reproducibility.
Example
echo "idr --samples A${id}K4_peaks.broadPeak C${id}K4_peaks.broadPeak --input-file-type broadPeak --output-file ACK4-${id} --plot --rank p.value ">>ACK4.sh
# This produces a plot and a file of consistent peaks
NC_045731.1 18482515 18484118 . 1000 . -1 261.55000 -1 5.000000 5.000000 18482520 18483972 261.55000 18482515 18484118 677.33400
NW_022587827.1 45181 47485 . 1000 . -1 177.63400 -1 5.000000 5.000000 45181 46596 177.63400 45193 47485 414.37100
NC_045731.1 18515047 18516017 . 1000 . -1 134.81900 -1 5.000000 5.000000 18515068 18515901 134.81900 18515047 18516017 391.59900
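The fifth column of the idr output is a scaled IDR score, min(int(-125 * log2(IDR)), 1000), so a score of 540 or higher corresponds to IDR <= 0.05. A minimal sketch of filtering on that column with awk is given here. The file name ACK4-demo and the last record (score 200) are hypothetical and only serve to show a peak being dropped.

```shell
#!/usr/bin/env bash
# Sketch: keep only peaks passing IDR <= 0.05 from an idr output file.
# Column 5 is the scaled IDR score; 540 corresponds to IDR = 0.05.
# The file name and the third record (chrDemo, score 200) are hypothetical.
idr_out="ACK4-demo"
printf '%s\t%s\t%s\t.\t%s\t.\n' \
  NC_045731.1 18482515 18484118 1000 \
  NW_022587827.1 45181 47485 1000 \
  chrDemo 100 900 200 > "$idr_out"

# Write chrom, start, end (BED3) for records whose score is at least 540.
awk -v OFS='\t' '$5 >= 540 {print $1, $2, $3}' "$idr_out" > "${idr_out}.idr0.05.bed"

# Count the peaks that passed the threshold.
n_pass=$(wc -l < "${idr_out}.idr0.05.bed" | tr -d ' ')
echo "$n_pass peaks pass IDR <= 0.05"
```

A stricter cutoff such as IDR <= 0.01 would use a score of 830 instead of 540.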
Peaks with good consistency can then be combined with bedtools intersect.
bedtools intersect [OPTIONS] -a <FILE> \
-b <FILE1, FILE2, ..., FILEN>
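As a concrete instance of the usage above, the sketch below follows the same echo-into-a-script pattern as the idr example and queues one bedtools intersect command. The ids (1 and 2), the output name ACK4.shared.bed, and the 50% reciprocal overlap are assumptions for illustration, not settings taken from the original analysis.

```shell
#!/usr/bin/env bash
# Sketch: queue a bedtools intersect command in a script file, to be run
# later with `bash intersect.sh`. All file names here are hypothetical.
a="ACK4-1"   # idr output for one pair of replicates (assumed name)
b="ACK4-2"   # idr output for another pair of replicates (assumed name)
script="intersect.sh"
: > "$script"   # start from an empty script

# -u: report each peak in -a once if it overlaps any peak in -b.
# -f 0.5 -r: require a reciprocal overlap of at least 50% (arbitrary choice).
echo "bedtools intersect -u -f 0.5 -r -a $a -b $b > ACK4.shared.bed" >> "$script"

# Show what was queued.
cat "$script"
```

Note that with several files after -b, -u keeps a peak that overlaps any one of them, so a strict intersection across more than two files needs the command to be chained file by file.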
2 Merge the BAM files first, then call peaks
First, use multiBamSummary from deepTools to compute correlations between the BAM files of the biological replicates.
multiBamSummary computes the read coverages for genomic regions for typically two or more BAM files. The analysis can be performed for the entire genome by running the program in ‘bins’ mode. If you want to count the read coverage for specific regions only, use the BED-file mode instead. The standard output of multiBamSummary is a compressed numpy array (.npz). It can be directly used to calculate and visualize pairwise correlation values between the read coverages using the tool ‘plotCorrelation’.
multiBamSummary bins --bamfiles file1.bam file2.bam -o results.npz
## The resulting npz file can be used for principal component analysis and for plotCorrelation analysis
plotCorrelation -in x.npz --skipZeros --corMethod pearson --whatToPlot heatmap --colorMap RdYlBu_r --plotNumbers -o x.pdf --outFileCorMatrix x.tab
Replicates with a good correlation coefficient can then have their BAM files merged.
samtools merge [options] -o <out.bam> [options] <in1.bam> ... <inN.bam>
samtools merge [options] <out.bam> <in1.bam> ... <inN.bam>
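To make the usage lines concrete, here is a minimal sketch that queues a merge and an index command for each pair of replicates, again using the echo-into-a-script pattern. The ids (1 and 2), the BAM names A${id}K4.bam and C${id}K4.bam, the output names, and the thread count are assumptions. The input BAM files are assumed to be coordinate-sorted already.

```shell
#!/usr/bin/env bash
# Sketch: queue samtools merge + index commands for each replicate pair,
# to be run later with `bash merge.sh`. Ids and file names are hypothetical.
script="merge.sh"
: > "$script"   # start from an empty script

for id in 1 2; do
  out="ACK4-${id}.merged.bam"
  # -@ 4: use 4 additional threads; -o: name of the merged output file.
  # Indexing only runs if the merge succeeded (&&).
  echo "samtools merge -@ 4 -o $out A${id}K4.bam C${id}K4.bam && samtools index $out" >> "$script"
done

# Show what was queued.
cat "$script"
```

Peaks can then be called once on each merged BAM file instead of once per replicate.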
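Which method is better depends on the sequencing depth and library quality of your own data. You can call peaks first to check the number of peaks, run idr to check their consistency, and then decide.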