What are variants, alleles and haplotypes?
What are variants?
In the field of genetic variation, the term variant is used to refer to a specific region of the genome which differs between two genomes.
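What we usually call a variant is defined relative to a reference genome. Without a reference genome there is nothing to call a variant against. By this definition, a variant is simply a region where two genomes differ. A single-nucleotide variant, then, is a difference at a single base between individuals. It can be called an SNV (single-nucleotide variant), but it cannot yet be called a SNP (single-nucleotide polymorphism).
For the difference between SNV and SNP, search the WeChat official account for the post "理清SNP、SNV、CNV等一些概念".
SNP and SNV are often used interchangeably. The emphasis of SNP is on polymorphism, which is a population-level concept: the single-site variant is present within a population, although its frequency can be anything.
SNV describes a variant at a site, relative to the reference genome, that is seen in only a few individuals.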
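To make the population-level distinction concrete, here is a minimal Python sketch that labels one site from the bases observed across a set of individuals. The function names and the 1% minor-allele-frequency cutoff are assumptions for illustration (1% is a common convention, not something stated in these notes), and real variant callers do far more than this.

```python
from collections import Counter


def allele_frequencies(bases):
    """Return {base: frequency} for the bases observed at one site."""
    counts = Counter(bases)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


def classify_site(ref_base, bases, maf_threshold=0.01):
    """Label a site as 'invariant', 'SNV' or 'SNP'.

    maf_threshold is an assumed cutoff for calling a site polymorphic.
    """
    freqs = allele_frequencies(bases)
    if set(freqs) == {ref_base}:
        # Every individual carries the reference base: no variant here.
        return "invariant"
    # Minor allele frequency = frequency of the second most common allele.
    ranked = sorted(freqs.values(), reverse=True)
    minor_freq = ranked[1] if len(ranked) > 1 else 0.0
    return "SNP" if minor_freq >= maf_threshold else "SNV"


# Hypothetical data: one base per sampled chromosome at a single site.
rare = ["C"] * 999 + ["T"]        # T seen once in 1000 chromosomes
common = ["C"] * 60 + ["T"] * 40  # T seen in 40% of chromosomes

print(classify_site("C", rare))    # SNV
print(classify_site("C", common))  # SNP
```

The same base change is reported as an SNV when it is rare in the sample and as a SNP once its frequency passes the chosen cutoff.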
What are alleles?
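With SNPs understood, the next concept is the allele.
For an introduction to the allele concept, search the WeChat official account for "一葉知因丨基因科普微視頻之Allele".
Put simply, alleles are the pair of genes at the same locus on homologous chromosomes, one inherited from the father and one from the mother. The two alleles may be identical, or part of their sequence may differ.
In variant calling, however, the term is used a little differently from the allele concept taught in genetics textbooks.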
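After resequencing a population, we call SNPs.
According to the EBI description below, at a given SNP position the base in the reference genome is the reference allele, and the base found at that position in your resequenced individual is the alternative allele.
This leads to another concept: biallelic SNPs.
Although there are four possible bases (A, C, G and T), any given position in one genome carries a single defined base, so a particular SNP usually comes in just two forms, for example C/T: a chromosome carries either one or the other.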
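With the concept of SNP alleles in hand, haplotypes are easy to understand. Simply put, a haplotype is a run of consecutive SNP alleles on a single chromosome. The figure below shows this clearly: AGT, GTA and AGA are three different haplotypes.
So the question is: why do we need to distinguish different haplotypes?
Image from: https://sg.idtdna.com/pages/education/decoded/article/genotyping-terms-to-know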
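The haplotype idea described above can be sketched in a few lines of Python. The sequences and SNP positions below are made up for illustration, chosen so that they reproduce the AGT, GTA and AGA haplotypes; the sketch assumes the chromosomes are already phased, which in practice is a separate and harder step.

```python
from collections import Counter


def haplotype(chromosome, snp_positions):
    """Join the bases found at the given 0-based SNP positions."""
    return "".join(chromosome[p] for p in snp_positions)


# Hypothetical phased sequences; SNPs sit at positions 1, 4 and 7.
snp_positions = [1, 4, 7]
chromosomes = [
    "CAATGCTTA",
    "CGATTCTAA",
    "CAATGCTAA",
    "CAATGCTTA",
]

haplotypes = [haplotype(c, snp_positions) for c in chromosomes]
print(haplotypes)           # ['AGT', 'GTA', 'AGA', 'AGT']
print(Counter(haplotypes))  # how often each haplotype occurs
```

Only the variable positions are kept, so chromosomes that are identical at every SNP collapse into the same haplotype.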
Different versions of the same variant are called alleles. For example, a SNP may have two alternative bases, or alleles, C and T.
When working with genome scale data the term reference allele refers to the base that is found in the reference genome. Since the reference is just somebody’s genome, it is not always the major allele. In contrast, the alternative allele refers to any base, other than the reference, that is found at that locus. The alternative allele is not necessarily the minor allele and it may, or may not, be linked to a phenotype. There can be more than one alternative allele per variant.
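As a small illustration of the paragraph above, the following Python sketch separates the reference allele from the alternative alleles at one locus and checks whether the reference happens to be the major allele. The function name and the observed bases are hypothetical.

```python
from collections import Counter


def describe_locus(ref_base, observed_bases):
    """Summarise the alleles seen at one locus against the reference base."""
    counts = Counter(observed_bases)
    # Any observed base other than the reference is an alternative allele.
    alt_alleles = sorted(b for b in counts if b != ref_base)
    # The major allele is the most frequent one in the sample.
    major_allele = counts.most_common(1)[0][0]
    return {
        "reference": ref_base,
        "alternatives": alt_alleles,
        "major": major_allele,
        "reference_is_major": major_allele == ref_base,
    }


# Hypothetical locus: the reference carries C, but most samples carry T.
observed = ["T"] * 7 + ["C"] * 2 + ["G"]
print(describe_locus("C", observed))
```

Here the locus has two alternative alleles (G and T), and the reference allele C is not the major allele, which matches the point that the reference is just one genome.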
What is linkage disequilibrium?
In the genome, alleles at variants close together on the same chromosome tend to occur together more often than is expected by chance. These blocks of alleles are called haplotypes. Linkage disequilibrium (LD) is a measure of how often two alleles or specific sequences are inherited together, with alleles that are always co-inherited said to be in linkage disequilibrium.
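One common way to put a number on LD between two biallelic sites is the coefficient D and the squared correlation r². The sketch below uses the standard textbook definitions, which are not taken from the text above, and assumes phased two-site haplotypes written as two-character strings; the function name and data are hypothetical.

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r^2) for allele_a at site 1 and allele_b at site 2.

    haplotypes: list of 2-character strings, one per chromosome,
    where the first character is site 1 and the second is site 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h == allele_a + allele_b for h in haplotypes) / n
    # D: observed haplotype frequency minus the frequency expected
    # if the two alleles were inherited independently.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either site is not variable in the sample.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# A and G always travel together: complete LD.
linked = ["AG"] * 5 + ["CT"] * 5
print(linkage_disequilibrium(linked, "A", "G"))    # (0.25, 1.0)

# All four combinations equally common: no LD.
unlinked = ["AG", "AT", "CG", "CT"]
print(linkage_disequilibrium(unlinked, "A", "G"))  # (0.0, 0.0)
```

An r² of 1 means that knowing the allele at one site tells you the allele at the other, while an r² of 0 means the two sites vary independently in the sample.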
I am not sure my understanding here is fully correct. I will revisit this later and add some figures.
To be continued.