Author: ligc
Date: 19/5/15
Downloading the reference genome
Our experimental subject is the mouse (Mus musculus), so go to the UCSC website and download the mouse genome file.
UCSC assembly: mm10
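The download step can be scripted. The sketch below is a minimal example, assuming the mm10 FASTA is served from UCSC's `goldenPath` bulk-download area under `bigZips/mm10.fa.gz`; the base URL and the helper names `genome_url` and `download` are illustrative assumptions, so check the path on the UCSC downloads page before relying on it.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumed base of UCSC's bulk-download server; verify on the UCSC site.
UCSC_BASE = "https://hgdownload.soe.ucsc.edu/goldenPath"


def genome_url(assembly: str = "mm10") -> str:
    """Build the assumed URL of the gzipped whole-genome FASTA for an assembly."""
    return f"{UCSC_BASE}/{assembly}/bigZips/{assembly}.fa.gz"


def download(url: str, dest_dir: str = ".") -> Path:
    """Stream a remote file into dest_dir and return the local path."""
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)  # copies in chunks, not all at once
    return dest


if __name__ == "__main__":
    # Only print the URL here; call download(genome_url()) to fetch the file,
    # which is large (several hundred megabytes compressed).
    print(genome_url())
```

Building the URL separately from the download keeps the assembly name in one place, so switching to another assembly only changes the argument.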
Downloading the genome annotation file
Go to the GENCODE website (https://www.gencodegenes.org/) and download the GTF or GFF file that corresponds to the mouse genome.
GTF
GTF (General Transfer Format) has the following columns:
- seqname - name of the chromosome or scaffold; chromosome names can be given with or without the 'chr' prefix. Important note: the seqname must be one used within Ensembl, i.e. a standard chromosome name or an Ensembl identifier such as a scaffold ID, without any additional content such as species or assembly.
- source - name of the program that generated this feature, or the data source (database or project name)
- feature - feature type name, e.g. Gene, Variation, Similarity
- start - Start position of the feature, with sequence numbering starting at 1.
- end - End position of the feature, with sequence numbering starting at 1.
- score - A floating point value.
- strand - defined as + (forward) or - (reverse).
- frame - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
- attribute - A semicolon-separated list of tag-value pairs, providing additional information about each feature.
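To make the nine columns concrete, here is a minimal sketch of a GTF line parser. It assumes tab-separated columns and attributes written as `tag "value";` pairs, as in GENCODE files. The names `GtfRecord`, `parse_gtf_attributes` and `parse_gtf_line`, and the example line itself, are hypothetical and for illustration only.

```python
from typing import Dict, NamedTuple, Optional


class GtfRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int                 # 1-based, inclusive
    end: int                   # 1-based, inclusive
    score: Optional[float]     # None when the column is "."
    strand: str
    frame: Optional[int]       # None when the column is "."
    attributes: Dict[str, str]


def parse_gtf_attributes(text: str) -> Dict[str, str]:
    """Parse 'tag "value"; tag2 "value2";' into a dict.

    Simplifications: a repeated tag keeps only its last value, and a
    semicolon inside a quoted value is not handled.
    """
    attrs: Dict[str, str] = {}
    for item in text.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def parse_gtf_line(line: str) -> GtfRecord:
    """Split one non-comment GTF line into typed fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqname, source, feature, start, end, score, strand, frame, attrs = cols
    return GtfRecord(
        seqname, source, feature, int(start), int(end),
        None if score == "." else float(score),
        strand,
        None if frame == "." else int(frame),
        parse_gtf_attributes(attrs),
    )


# Hypothetical example line, not taken from a real annotation file.
EXAMPLE_LINE = (
    'chr1\tEXAMPLE\texon\t1000\t1200\t.\t+\t.\t'
    'gene_id "GENE0001"; gene_name "ExampleGene";'
)

if __name__ == "__main__":
    rec = parse_gtf_line(EXAMPLE_LINE)
    # Both coordinates are 1-based and inclusive, so length = end - start + 1.
    print(rec.feature, rec.end - rec.start + 1, rec.attributes["gene_id"])
```

Because start and end are both inclusive, the feature length is `end - start + 1`, which is easy to get wrong when converting to 0-based formats such as BED.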
GFF3
GFF3 (General Feature Format, version 3) has the following columns:
- seqid - name of the chromosome or scaffold; chromosome names can be given with or without the 'chr' prefix. Important note: the seq ID must be one used within Ensembl, i.e. a standard chromosome name or an Ensembl identifier such as a scaffold ID, without any additional content such as species or assembly.
- source - name of the program that generated this feature, or the data source (database or project name)
- type - type of feature. Must be a term or accession from the SOFA sequence ontology.
- start - Start position of the feature, with sequence numbering starting at 1.
- end - End position of the feature, with sequence numbering starting at 1.
- score - A floating point value.
- strand - defined as + (forward) or - (reverse).
- phase - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.
- attributes - A semicolon-separated list of tag-value pairs, providing additional information about each feature. Some of these tags are predefined, e.g. ID, Name, Alias, Parent - see the GFF documentation for more details.
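The main practical difference from GTF is the attributes column: GFF3 writes `tag=value` pairs, allows several comma-separated values per tag (for example multiple `Parent` entries), and percent-encodes reserved characters. The sketch below shows one way to parse that column; the function name `parse_gff3_attributes` and the example string are hypothetical.

```python
from typing import Dict, List
from urllib.parse import unquote


def parse_gff3_attributes(text: str) -> Dict[str, List[str]]:
    """Parse 'ID=x;Parent=a,b;Name=some%20name' into a dict of value lists.

    Values are split on commas before percent-decoding, so an encoded
    comma (%2C) stays inside a single value.
    """
    attrs: Dict[str, List[str]] = {}
    for item in text.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        attrs[unquote(key)] = [unquote(v) for v in value.split(",")]
    return attrs


# Hypothetical attribute string, for illustration only.
EXAMPLE_ATTRS = "ID=exon00001;Parent=mRNA00001,mRNA00002;Name=first%20exon"

if __name__ == "__main__":
    parsed = parse_gff3_attributes(EXAMPLE_ATTRS)
    print(parsed["Parent"])  # the two parent transcripts of this exon
```

Returning a list for every tag keeps the access pattern uniform, whether a tag has one value (`ID`) or several (`Parent`).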
Integrative Genomics Viewer (IGV)
https://software.broadinstitute.org/software/igv/
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.