linux學(xué)習(xí)100篇101:轉(zhuǎn)錄組分析用軟件及安裝subread

安裝

(rnaseq) root 12:08:22 ~
$ conda install -y subread
Collecting package metadata (current_repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.9.2
  latest version: 4.10.1

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /root/miniconda3/envs/rnaseq

  added / updated specs:
    - subread


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    subread-2.0.1              |       h5bf99c6_1        22.8 MB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda
    ------------------------------------------------------------
                                           Total:        22.8 MB

The following NEW packages will be INSTALLED:

  subread            anaconda/cloud/bioconda/linux-64::subread-2.0.1-h5bf99c6_1



Downloading and Extracting Packages
subread-2.0.1        | 22.8 MB   | ######################################## | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(rnaseq) root 12:09:51 ~

查看

$ featureCounts

Version 2.0.1

Usage: featureCounts [options] -a <annotation_file> -o <output_file> input_file1 [input_file2] ... 

## Mandatory arguments:

  -a <string>         Name of an annotation file. GTF/GFF format by default. See
                      -F option for more format information. Inbuilt annotations
                      (SAF format) is available in 'annotation' directory of the
                      package. Gzipped file is also accepted.

  -o <string>         Name of output file including read counts. A separate file
                      including summary statistics of counting results is also
                      included in the output ('<string>.summary'). Both files
                      are in tab delimited format.

  input_file1 [input_file2] ...   A list of SAM or BAM format files. They can be
                      either name or location sorted. If no files provided,
                      <stdin> input is expected. Location-sorted paired-end reads
                      are automatically sorted by read names.

## Optional arguments:
# Annotation

  -F <string>         Specify format of the provided annotation file. Acceptable
                      formats include 'GTF' (or compatible GFF format) and
                      'SAF'. 'GTF' by default.  For SAF format, please refer to
                      Users Guide.

  -t <string>         Specify feature type(s) in a GTF annotation. If multiple
                      types are provided, they should be separated by ',' with
                      no space in between. 'exon' by default. Rows in the
                      annotation with a matched feature will be extracted and
                      used for read mapping. 

  -g <string>         Specify attribute type in GTF annotation. 'gene_id' by 
                      default. Meta-features used for read counting will be 
                      extracted from annotation using the provided value.

  --extraAttributes   Extract extra attribute types from the provided GTF
                      annotation and include them in the counting output. These
                      attribute types will not be used to group features. If
                      more than one attribute type is provided they should be
                      separated by comma.

  -A <string>         Provide a chromosome name alias file to match chr names in
                      annotation with those in the reads. This should be a two-
                      column comma-delimited text file. Its first column should
                      include chr names in the annotation and its second column
                      should include chr names in the reads. Chr names are case
                      sensitive. No column header should be included in the
                      file.

# Level of summarization

  -f                  Perform read counting at feature level (eg. counting 
                      reads for exons rather than genes).

# Overlap between reads and features

  -O                  Assign reads to all their overlapping meta-features (or 
                      features if -f is specified).

  --minOverlap <int>  Minimum number of overlapping bases in a read that is
                      required for read assignment. 1 by default. Number of
                      overlapping bases is counted from both reads if paired
                      end. If a negative value is provided, then a gap of up
                      to specified size will be allowed between read and the
                      feature that the read is assigned to.

  --fracOverlap <float> Minimum fraction of overlapping bases in a read that is
                      required for read assignment. Value should be within range
                      [0,1]. 0 by default. Number of overlapping bases is
                      counted from both reads if paired end. Both this option
                      and '--minOverlap' option need to be satisfied for read
                      assignment.

  --fracOverlapFeature <float> Minimum fraction of overlapping bases in a
                      feature that is required for read assignment. Value
                      should be within range [0,1]. 0 by default.

  --largestOverlap    Assign reads to a meta-feature/feature that has the 
                      largest number of overlapping bases.

  --nonOverlap <int>  Maximum number of non-overlapping bases in a read (or a
                      read pair) that is allowed when being assigned to a
                      feature. No limit is set by default.

  --nonOverlapFeature <int> Maximum number of non-overlapping bases in a feature
                      that is allowed in read assignment. No limit is set by
                      default.

  --readExtension5 <int> Reads are extended upstream by <int> bases from their
                      5' end.

  --readExtension3 <int> Reads are extended upstream by <int> bases from their
                      3' end.

  --read2pos <5:3>    Reduce reads to their 5' most base or 3' most base. Read
                      counting is then performed based on the single base the 
                      read is reduced to.

# Multi-mapping reads

  -M                  Multi-mapping reads will also be counted. For a multi-
                      mapping read, all its reported alignments will be 
                      counted. The 'NH' tag in BAM/SAM input is used to detect 
                      multi-mapping reads.

# Fractional counting

  --fraction          Assign fractional counts to features. This option must
                      be used together with '-M' or '-O' or both. When '-M' is
                      specified, each reported alignment from a multi-mapping
                      read (identified via 'NH' tag) will carry a fractional
                      count of 1/x, instead of 1 (one), where x is the total
                      number of alignments reported for the same read. When '-O'
                      is specified, each overlapping feature will receive a
                      fractional count of 1/y, where y is the total number of
                      features overlapping with the read. When both '-M' and
                      '-O' are specified, each alignment will carry a fractional
                      count of 1/(x*y).

# Read filtering

  -Q <int>            The minimum mapping quality score a read must satisfy in
                      order to be counted. For paired-end reads, at least one
                      end should satisfy this criteria. 0 by default.

  --splitOnly         Count split alignments only (ie. alignments with CIGAR
                      string containing 'N'). An example of split alignments is
                      exon-spanning reads in RNA-seq data.

  --nonSplitOnly      If specified, only non-split alignments (CIGAR strings do
                      not contain letter 'N') will be counted. All the other
                      alignments will be ignored.

  --primary           Count primary alignments only. Primary alignments are 
                      identified using bit 0x100 in SAM/BAM FLAG field.

  --ignoreDup         Ignore duplicate reads in read counting. Duplicate reads 
                      are identified using bit Ox400 in BAM/SAM FLAG field. The 
                      whole read pair is ignored if one of the reads is a 
                      duplicate read for paired end data.

# Strandness

  -s <int or string>  Perform strand-specific read counting. A single integer
                      value (applied to all input files) or a string of comma-
                      separated values (applied to each corresponding input
                      file) should be provided. Possible values include:
                      0 (unstranded), 1 (stranded) and 2 (reversely stranded).
                      Default value is 0 (ie. unstranded read counting carried
                      out for all input files).

# Exon-exon junctions

  -J                  Count number of reads supporting each exon-exon junction.
                      Junctions were identified from those exon-spanning reads
                      in the input (containing 'N' in CIGAR string). Counting
                      results are saved to a file named '<output_file>.jcounts'

  -G <string>         Provide the name of a FASTA-format file that contains the
                      reference sequences used in read mapping that produced the
                      provided SAM/BAM files. This optional argument can be used
                      with '-J' option to improve read counting for junctions.

# Parameters specific to paired end reads

  -p                  If specified, fragments (or templates) will be counted
                      instead of reads. This option is only applicable for
                      paired-end reads; single-end reads are always counted as
                      reads.

  -B                  Only count read pairs that have both ends aligned.

  -P                  Check validity of paired-end distance when counting read 
                      pairs. Use -d and -D to set thresholds.

  -d <int>            Minimum fragment/template length, 50 by default.

  -D <int>            Maximum fragment/template length, 600 by default.

  -C                  Do not count read pairs that have their two ends mapping 
                      to different chromosomes or mapping to same chromosome 
                      but on different strands.

  --donotsort         Do not sort reads in BAM/SAM input. Note that reads from 
                      the same pair are required to be located next to each 
                      other in the input.

# Number of CPU threads

  -T <int>            Number of the threads. 1 by default.

# Read groups

  --byReadGroup       Assign reads by read group. "RG" tag is required to be
                      present in the input BAM/SAM files.
                      

# Long reads

  -L                  Count long reads such as Nanopore and PacBio reads. Long
                      read counting can only run in one thread and only reads
                      (not read-pairs) can be counted. There is no limitation on
                      the number of 'M' operations allowed in a CIGAR string in
                      long read counting.

# Assignment results for each read

  -R <format>         Output detailed assignment results for each read or read-
                      pair. Results are saved to a file that is in one of the
                      following formats: CORE, SAM and BAM. See Users Guide for
                      more info about these formats.

  --Rpath <string>    Specify a directory to save the detailed assignment
                      results. If unspecified, the directory where counting
                      results are saved is used.

# Miscellaneous

  --tmpDir <string>   Directory under which intermediate files are saved (later
                      removed). By default, intermediate files will be saved to
                      the directory specified in '-o' argument.

  --maxMOp <int>      Maximum number of 'M' operations allowed in a CIGAR
                      string. 10 by default. Both 'X' and '=' are treated as 'M'
                      and adjacent 'M' operations are merged in the CIGAR
                      string.

  --verbose           Output verbose information for debugging, such as un-
                      matched chromosome/contig names.

  -v                  Output version of the program.

(rnaseq) root 12:25:45 ~

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
  • 序言:七十年代末碉钠,一起剝皮案震驚了整個(gè)濱河市棉钧,隨后出現(xiàn)的幾起案子的圆,更是在濱河造成了極大的恐慌,老刑警劉巖竣付,帶你破解...
    沈念sama閱讀 206,839評(píng)論 6 482
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件惧互,死亡現(xiàn)場(chǎng)離奇詭異碱呼,居然都是意外死亡忽你,警方通過(guò)查閱死者的電腦和手機(jī),發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 88,543評(píng)論 2 382
  • 文/潘曉璐 我一進(jìn)店門(mén)确镊,熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái)士骤,“玉大人,你說(shuō)我怎么就攤上這事蕾域】郊。” “怎么了?”我有些...
    開(kāi)封第一講書(shū)人閱讀 153,116評(píng)論 0 344
  • 文/不壞的土叔 我叫張陵,是天一觀的道長(zhǎng)巨缘。 經(jīng)常有香客問(wèn)我厢绝,道長(zhǎng),這世上最難降的妖魔是什么带猴? 我笑而不...
    開(kāi)封第一講書(shū)人閱讀 55,371評(píng)論 1 279
  • 正文 為了忘掉前任咸作,我火速辦了婚禮陨晶,結(jié)果婚禮上辞友,老公的妹妹穿的比我還像新娘躲舌。我一直安慰自己,他們只是感情好口予,可當(dāng)我...
    茶點(diǎn)故事閱讀 64,384評(píng)論 5 374
  • 文/花漫 我一把揭開(kāi)白布娄周。 她就那樣靜靜地躺著,像睡著了一般沪停。 火紅的嫁衣襯著肌膚如雪煤辨。 梳的紋絲不亂的頭發(fā)上,一...
    開(kāi)封第一講書(shū)人閱讀 49,111評(píng)論 1 285
  • 那天木张,我揣著相機(jī)與錄音众辨,去河邊找鬼。 笑死舷礼,一個(gè)胖子當(dāng)著我的面吹牛鹃彻,可吹牛的內(nèi)容都是我干的。 我是一名探鬼主播妻献,決...
    沈念sama閱讀 38,416評(píng)論 3 400
  • 文/蒼蘭香墨 我猛地睜開(kāi)眼蛛株,長(zhǎng)吁一口氣:“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼!你這毒婦竟也來(lái)了育拨?” 一聲冷哼從身側(cè)響起谨履,我...
    開(kāi)封第一講書(shū)人閱讀 37,053評(píng)論 0 259
  • 序言:老撾萬(wàn)榮一對(duì)情侶失蹤,失蹤者是張志新(化名)和其女友劉穎熬丧,沒(méi)想到半個(gè)月后笋粟,有當(dāng)?shù)厝嗽跇?shù)林里發(fā)現(xiàn)了一具尸體,經(jīng)...
    沈念sama閱讀 43,558評(píng)論 1 300
  • 正文 獨(dú)居荒郊野嶺守林人離奇死亡锹引,尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點(diǎn)故事閱讀 36,007評(píng)論 2 325
  • 正文 我和宋清朗相戀三年矗钟,在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了。 大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片嫌变。...
    茶點(diǎn)故事閱讀 38,117評(píng)論 1 334
  • 序言:一個(gè)原本活蹦亂跳的男人離奇死亡,死狀恐怖躬它,靈堂內(nèi)的尸體忽然破棺而出腾啥,到底是詐尸還是另有隱情,我是刑警寧澤,帶...
    沈念sama閱讀 33,756評(píng)論 4 324
  • 正文 年R本政府宣布倘待,位于F島的核電站疮跑,受9級(jí)特大地震影響,放射性物質(zhì)發(fā)生泄漏凸舵。R本人自食惡果不足惜祖娘,卻給世界環(huán)境...
    茶點(diǎn)故事閱讀 39,324評(píng)論 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一處隱蔽的房頂上張望啊奄。 院中可真熱鬧渐苏,春花似錦、人聲如沸菇夸。這莊子的主人今日做“春日...
    開(kāi)封第一講書(shū)人閱讀 30,315評(píng)論 0 19
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽(yáng)庄新。三九已至鞠眉,卻和暖如春,著一層夾襖步出監(jiān)牢的瞬間择诈,已是汗流浹背械蹋。 一陣腳步聲響...
    開(kāi)封第一講書(shū)人閱讀 31,539評(píng)論 1 262
  • 我被黑心中介騙來(lái)泰國(guó)打工, 沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留羞芍,地道東北人朝蜘。 一個(gè)月前我還...
    沈念sama閱讀 45,578評(píng)論 2 355
  • 正文 我出身青樓,卻偏偏與公主長(zhǎng)得像涩金,于是被迫代替她去往敵國(guó)和親谱醇。 傳聞我的和親對(duì)象是個(gè)殘疾皇子,可洞房花燭夜當(dāng)晚...
    茶點(diǎn)故事閱讀 42,877評(píng)論 2 345

推薦閱讀更多精彩內(nèi)容