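I just got access to the server that Jimmy set up. Whether downloading data or running the pipeline, everything feels dramatically faster, and I am very grateful.
Back to the topic. Here is the new pipeline.
Articles (two):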
① Cancers | Free Full-Text | Targeting Palbociclib-Resistant Estrogen Receptor-Positive Breast Cancer Cells via Oncolytic Virotherapy
https://www.mdpi.com/2072-6694/11/5/684/htm
② Genes | Free Full-Text | Transcriptomic Profiling Identifies Differentially Expressed Genes in Palbociclib-Resistant ER+ MCF7 Breast Cancer Cells
https://www.mdpi.com/2073-4425/11/4/467
Main focus: the differentially expressed genes and the pathways involved in the development of palbociclib resistance (breast cancer).
Pipeline:
1. Download the data:
conda create -n rnaseq python=2 bwa
source activate rnaseq
#install aspera
wget http://d3gcli72yxqn2z.cloudfront.net/connect/bin/aspera-connect-3.5.1.92523-linux-64.tar.gz
tar zxf aspera-connect-3.5.1.92523-linux-64.tar.gz
bash aspera-connect-3.5.1.92523-linux-64.sh
echo 'PATH=$PATH:~/.aspera/connect/bin/' >> ~/.bashrc
source ~/.bashrc
ascp --help
#Note: run the installer with bash, not sh. Running it with sh earlier produced many errors.
############
#Download the data with aspera. It should be considerably faster than prefetch.
cat 'SRR_Acc_List (1).txt'|while read id
do
x=$(echo $id | cut -b1-6)
y=$(echo $id | cut -b10-10)
echo $id
ascp -QT -l 300m -P33001 -i \
~/.aspera/connect/etc/asperaweb_id_dsa.openssh \
era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/$x/00$y/$id/ ./
done
gzip -d SRR*gz
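The loop above hard-codes the `00$y` directory, which matches 10-character accessions such as SRR8984523. As a rough sketch of the general ENA layout as I understand it (the function name `ena_fastq_dir` is made up here for illustration and is not part of aspera or ENA tooling), the remote directory can be derived from the accession length:

```shell
# Sketch: build the ENA fastq directory for a run accession.
# Assumption: accessions are 9 to 12 characters long (3 letters + 6 to 9 digits).
# ena_fastq_dir is a hypothetical helper, not an ENA or aspera command.
ena_fastq_dir() {
    local id=$1
    local prefix suffix
    prefix=$(printf '%s' "$id" | cut -c1-6)   # first six characters, e.g. SRR898
    suffix=$(printf '%s' "$id" | cut -c10-)   # digits after the ninth character
    case ${#id} in
        9)  printf '/vol1/fastq/%s/%s\n' "$prefix" "$id" ;;
        10) printf '/vol1/fastq/%s/00%s/%s\n' "$prefix" "$suffix" "$id" ;;
        11) printf '/vol1/fastq/%s/0%s/%s\n' "$prefix" "$suffix" "$id" ;;
        12) printf '/vol1/fastq/%s/%s/%s\n' "$prefix" "$suffix" "$id" ;;
        *)  echo "unexpected accession length: $id" >&2; return 1 ;;
    esac
}

ena_fastq_dir SRR8984523   # prints /vol1/fastq/SRR898/003/SRR8984523
```

For the twelve runs used in this post the result is the same as the `cut -b` version in the loop; the helper only matters if the accession list mixes shorter or longer IDs.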
2. Quality control
#install software:
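conda install -y sra-tools
conda install -c bioconda multiqc
#fastqc is called below; install it too if it is not already available
conda install -y -c bioconda fastqc
#run QC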
ls *fastq|xargs fastqc -t 10
multiqc .
Open multiqc_report.html in a browser.
3. Alignment:
#use hisat2 as the aligner
conda install -y hisat2
conda install -y samtools
########
#alignment
for ((i=23;i<=34;i++));
do
hisat2 -p 6 -x /home/data/server/reference/index/hisat/hg38/genome \
-U /home/data/gmb29/data/chip_seq/SRR89845${i}.fastq \
-S SRR89845${i}.sam ;
done
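An optional variation on the loop above: hisat2 writes SAM to stdout when `-S` is omitted, so it can be piped straight into `samtools sort` and the intermediate .sam files are never written. The sketch below only prints the commands (a dry run) so they can be checked first. The helper name `align_cmd`, the variables `INDEX` and `FASTQ_DIR`, and the `.sorted.bam` output name are my own choices for illustration, not something from the papers.

```shell
# Sketch only: print one "hisat2 | samtools sort" command per run, without executing it.
# INDEX and FASTQ_DIR reuse the paths from the loop above; align_cmd is a hypothetical helper.
INDEX=/home/data/server/reference/index/hisat/hg38/genome
FASTQ_DIR=/home/data/gmb29/data/chip_seq

align_cmd() {
    local id=$1
    # The trailing "-" tells samtools sort to read the alignments from stdin.
    printf 'hisat2 -p 6 -x %s -U %s/%s.fastq | samtools sort -@ 6 -o %s.sorted.bam -\n' \
        "$INDEX" "$FASTQ_DIR" "$id" "$id"
}

for i in $(seq 23 34); do
    align_cmd "SRR89845${i}"
done
```

If the printed commands look right, piping the loop output into `bash` would run them. The separate SAM-to-BAM and sorting steps would then be unnecessary, and the later counting step would need to match `*.sorted.bam` instead.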
4. Convert SAM to BAM, sort the BAM files, count reads against the GTF
#convert SAM to BAM
for ((i=23;i<=34;i++));
do
samtools view -@ 6 -bS -h SRR89845${i}.sam > SRR89845${i}.bam ;
done
###################
#sort the BAM files
for ((i=23;i<=34;i++));
do
samtools sort -@ 6 SRR89845${i}.bam -o SRR89845${i}.sort ;
done
#count reads against the GTF
featureCounts -T 6 \
-t exon -g gene_id \
-a /home/data/server/reference/gtf/ensembl/Homo_sapiens.GRCh38.98.chr.gtf.gz -o all.id.txt *.sort
This produces the per-gene read counts in all.id.txt.
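The featureCounts table starts with a `#` comment line recording the command, followed by a header and six annotation columns (Geneid, Chr, Start, End, Strand, Length) before the per-sample counts. A small sketch for reducing it to a plain gene-by-sample matrix for downstream differential expression tools follows; `counts_matrix` is a made-up helper name, and the output file name in the usage comment is only an example.

```shell
# Sketch: keep Geneid plus the per-sample count columns from a featureCounts table.
# counts_matrix is a hypothetical helper, not part of the Subread package.
counts_matrix() {
    # Drop the leading "# Program:..." comment line, then keep column 1 (Geneid)
    # and columns 7 onward (one count column per input BAM).
    grep -v '^#' "$1" | cut -f1,7-
}

# Example usage (assumes all.id.txt exists in the current directory):
# counts_matrix all.id.txt > all.id.counts.tsv
```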
The original papers used a different pipeline: tophat2 followed by cufflinks.