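Column meanings of the refseq file used by ANNOVAR for annotation (BED-style coordinates: left-closed, right-open, with 0-based starts). I keep forgetting the last few columns, so I am recording them here (if anyone reads this, please excuse the rough English; as long as it is understandable):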
    1: bin, indexing field to speed up chromosome range queries
    2: name (NM ID)
    3: chr
    4: strand
    5: transcription start
    6: transcription end
    7: translation (CDS) start
    8: translation (CDS) end
    9: number of exons
    10: start position of every exon (comma-separated)
    11: end position of every exon (comma-separated)
    12: score
    13: name2 (gene name)
    14: cdsStartStat, enum('none','unk','incmpl','cmpl')
    15: cdsEndStat, enum('none','unk','incmpl','cmpl')
    16: exonFrames, exon frame {0, 1, 2}, or -1 if no frame for exon
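The 16 columns above can be read into named fields with a few lines of Python. This is only a minimal sketch: it assumes tab-separated lines, and the record name `RefGeneRecord`, the helper `parse_refgene_line`, and the example line (`NM_EXAMPLE`, `GENE1`) are made up for illustration rather than taken from ANNOVAR.

```python
from typing import List, NamedTuple


class RefGeneRecord(NamedTuple):
    """One line of the refseq file, fields in column order (hypothetical names)."""
    bin_id: int
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exon_count: int
    exon_starts: List[int]
    exon_ends: List[int]
    score: int
    name2: str
    cds_start_stat: str
    cds_end_stat: str
    exon_frames: List[int]


def _int_list(field: str) -> List[int]:
    # List columns typically end with a trailing comma, e.g. "100,250,400,"
    return [int(x) for x in field.split(",") if x]


def parse_refgene_line(line: str) -> RefGeneRecord:
    f = line.rstrip("\r\n").split("\t")
    if len(f) != 16:
        raise ValueError(f"expected 16 columns, got {len(f)}")
    return RefGeneRecord(
        bin_id=int(f[0]), name=f[1], chrom=f[2], strand=f[3],
        tx_start=int(f[4]), tx_end=int(f[5]),
        cds_start=int(f[6]), cds_end=int(f[7]),
        exon_count=int(f[8]),
        exon_starts=_int_list(f[9]), exon_ends=_int_list(f[10]),
        score=int(f[11]), name2=f[12],
        cds_start_stat=f[13], cds_end_stat=f[14],
        exon_frames=_int_list(f[15]),
    )


# Made-up example line, not a real transcript.
example = "\t".join([
    "585", "NM_EXAMPLE", "chr1", "+", "100", "500", "150", "450", "3",
    "100,250,400,", "200,300,500,", "0", "GENE1", "cmpl", "cmpl", "0,2,1,",
])
record = parse_refgene_line(example)
print(record.name2, record.exon_starts, record.exon_frames)
```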
Meaning of 'none', 'unk', 'incmpl', 'cmpl' in columns 14 and 15:
    "none" - No CDS (non-coding)
    "unk" - CDS is unknown (coding, but not known)
    "incmpl" - CDS is not complete at this end
    "cmpl" - CDS is complete at this end
Meaning of the numbers in column 16, with an example (note that transcripts can lie on the forward or reverse strand; the example below ignores strand):
exonFrames: the exonFrames field tells you how two adjacent exons join together.
-1 means that the exon is entirely UTR. When the nucleotides of two exons are
required to form an amino acid together, the value is the number of
nucleotides contributed by the first exon. Because one amino acid is encoded
by 3 nucleotides, the first exon can contribute at most 2.
exonFrames example: there is 1 nucleotide at the end of exon1 that joins
with the first two nucleotides at the start of exon2. This means that exon2
picks up one nucleotide from exon1 to make the amino acid, so the value
for exon2 is 1.
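To restate: this column describes how different exons combine to form amino acids. When an exon is entirely UTR, the value is -1. When an exon contains CDS sequence and bases from different exons have to be combined to form one amino acid, the value is the number of bases that amino acid takes from the preceding exon. For example, if the first two bases of exon 2 combine with the last base of exon 1 to form one amino acid, the value is 1. Since one amino acid is encoded by 3 bases, the largest contribution across an exon boundary is 2, so the value is at most 2.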
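The frame values can be reproduced by counting the CDS bases that come before each exon and taking the remainder modulo 3. The sketch below does this for the simple case only: like the example above, it ignores strand (it effectively assumes a forward-strand transcript), and the function name `exon_frames_forward` is made up for illustration.

```python
from typing import List


def exon_frames_forward(exon_starts: List[int], exon_ends: List[int],
                        cds_start: int, cds_end: int) -> List[int]:
    """Compute exon frames for a forward-strand transcript.

    Coordinates are 0-based and half-open. Reverse-strand transcripts
    are not handled here.
    """
    frames = []
    coding_bases = 0  # CDS bases seen in earlier exons
    for start, end in zip(exon_starts, exon_ends):
        # Clip the exon to the CDS interval.
        c_start = max(start, cds_start)
        c_end = min(end, cds_end)
        if c_start >= c_end:
            frames.append(-1)  # exon is entirely UTR
            continue
        # Bases left over from the previous exon(s) in an unfinished codon.
        frames.append(coding_bases % 3)
        coding_bases += c_end - c_start
    return frames


# Exon 1 holds 4 CDS bases: one full codon plus 1 base that joins
# the first two bases of exon 2, so exon 2 has frame 1.
print(exon_frames_forward([10, 20], [14, 30], 10, 28))
```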
tips: hg19_refgene.txt uses BED-style coordinates, meaning each interval is closed on the left and open on the right, and a start value is 1 less than the actual 1-based coordinate on the reference genome.
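As a small illustration of the tip above, converting a 0-based, left-closed, right-open interval to the 1-based, fully closed form only shifts the start. The helper name `to_one_based_closed` is made up for this sketch.

```python
from typing import Tuple


def to_one_based_closed(start0: int, end0: int) -> Tuple[int, int]:
    """Convert a 0-based half-open interval [start0, end0) to 1-based closed."""
    # The start moves up by 1; the exclusive end already equals
    # the 1-based inclusive end, so it is unchanged.
    return start0 + 1, end0


start1, end1 = to_one_based_closed(100, 200)
print(start1, end1)          # interval covering the same 100 bases
print(end1 - start1 + 1)     # length in 1-based closed form
```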
References: