GWAS, SNPs, and Disease

Three ways to look up SNP information

Source: http://www.bio-info-trainee.com/2100.html#more-2100

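Studies have reported that rs7574865 in STAT4 and rs9275319 in HLA-DQ are genetic susceptibility loci for hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC) in the population.

In other words, variation at these two sites is associated with susceptibility to HBV-related hepatocellular carcinoma. Each rsID identifies one variant site (once variant sites have been discovered, they are annotated with VEP or SnpEff). Given an rsID, we can therefore find the position of that site in the genome. dbSNP can be used to look up the genomic coordinates of an rsID.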

Method 1:
Download the file All_20160601.vcf.gz (a very large file):

mkdir -p ~/annotation/variation/human/dbSNP
cd ~/annotation/variation/human/dbSNP
## https://www.ncbi.nlm.nih.gov/projects/SNP/
## ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh38p2/
## ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/
nohup wget ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160601.vcf.gz &
wget ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b147_GRCh37p13/VCF/All_20160601.vcf.gz.tbi

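Running this produced an error: No such directory ‘snp/organisms/human_9606_b147_GRCh37p13/VCF’. This suggests the directory is no longer available at that path on the NCBI FTP server.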
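Once a dbSNP VCF is available, the coordinates for a handful of rsIDs can be pulled out by scanning the ID column. The following is a minimal sketch, assuming the standard VCF column order (CHROM, POS, ID, REF, ALT). The function names `find_rsids` and `find_rsids_in_file` are hypothetical, and the demo records are made-up placeholders, not real dbSNP coordinates.

```python
import gzip
from typing import Dict, Iterable, Set, Tuple


def find_rsids(lines: Iterable[str], wanted: Set[str]) -> Dict[str, Tuple[str, int, str, str]]:
    """Return {rsID: (chrom, pos, ref, alt)} for the requested rsIDs."""
    hits = {}
    for line in lines:
        if line.startswith("#"):          # skip VCF header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:               # ignore malformed records
            continue
        chrom, pos, rsid, ref, alt = fields[:5]
        if rsid in wanted:
            hits[rsid] = (chrom, int(pos), ref, alt)
            if len(hits) == len(wanted):  # stop early once everything is found
                break
    return hits


def find_rsids_in_file(path: str, wanted: Set[str]):
    """Same lookup, streaming a (possibly gzip-compressed) VCF from disk."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return find_rsids(handle, wanted)


# Made-up records for illustration only (not real dbSNP positions).
demo = [
    "##fileformat=VCFv4.0\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\trs111\tA\tG\t.\t.\t.\n",
    "2\t2000\trs222\tC\tT\t.\t.\t.\n",
]
result = find_rsids(demo, {"rs222"})
print(result)
```

On the real file this would be called as `find_rsids_in_file("All_20160601.vcf.gz", {"rs7574865", "rs9275319"})`. The .tbi index is keyed by position rather than by ID, so it does not speed up this kind of lookup and a linear scan of the whole file is needed.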

Method 2:
You can also use the web version of the database and edit the URL directly (for a small number of lookups):
https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=7574865
https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs9275319

Method 3:
SNPedia: edit the URL directly (advantage: it collects links to a great many other databases)
https://www.snpedia.com/index.php/Rs7574865
https://www.snpedia.com/index.php/Rs9275319
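Since both web lookups only differ in the rsID at the end of the URL, the links for a list of SNPs can be generated mechanically. This is a small sketch that follows the URL patterns shown above; the helper names `dbsnp_url` and `snpedia_url` are hypothetical, and whether the sites keep serving these patterns is an assumption.

```python
def _rs_number(rsid: str) -> str:
    """Strip an optional 'rs' prefix and check that digits remain."""
    text = rsid.strip().lower()
    if text.startswith("rs"):
        text = text[2:]
    if not text.isdigit():
        raise ValueError(f"not an rsID: {rsid!r}")
    return text


def dbsnp_url(rsid: str) -> str:
    """dbSNP web page for one rsID, using the URL pattern from Method 2."""
    return f"https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs={_rs_number(rsid)}"


def snpedia_url(rsid: str) -> str:
    """SNPedia page for one rsID, using the URL pattern from Method 3."""
    return f"https://www.snpedia.com/index.php/Rs{_rs_number(rsid)}"


for rsid in ["rs7574865", "rs9275319"]:
    print(dbsnp_url(rsid))
    print(snpedia_url(rsid))
```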


Extension: how to run a GWAS analysis

Method 1:
Analysis with plink
The plink website: https://www.cog-genomics.org/plink2/
SNP filtering and GWAS with plink
GWAS analysis with plink

Method 2:
Analysis with an R package (drawing Manhattan plots)
Postgwas: Advanced GWAS Interpretation in R
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0071775
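A Manhattan plot places each SNP at its genomic position on the x-axis and at -log10(p) on the y-axis, so the strongest associations stand out as the tallest points. As a sketch of the data preparation only (no plotting library is assumed, and this is not how Postgwas itself is implemented), the coordinates can be computed like this. The function name `manhattan_points` and the demo records are hypothetical.

```python
import math
from typing import List, Tuple


def manhattan_points(results: List[Tuple[str, int, float]],
                     chrom_order: List[str]) -> List[Tuple[int, float]]:
    """Convert (chrom, pos, p_value) records into (x, y) plot coordinates.

    x is a cumulative position so chromosomes sit side by side;
    y is -log10(p), so smaller p-values plot higher.
    Every chromosome in `results` must appear in `chrom_order`.
    """
    # The largest position seen on each chromosome approximates its length.
    max_pos = {chrom: 0 for chrom in chrom_order}
    for chrom, pos, _ in results:
        max_pos[chrom] = max(max_pos[chrom], pos)

    # Offset of each chromosome = summed lengths of the ones before it.
    offsets, running = {}, 0
    for chrom in chrom_order:
        offsets[chrom] = running
        running += max_pos[chrom]

    return [(offsets[chrom] + pos, -math.log10(p)) for chrom, pos, p in results]


# Made-up association results: (chromosome, position, p-value).
demo = [("1", 100, 0.01), ("1", 500, 1e-8), ("2", 50, 0.001)]
points = manhattan_points(demo, ["1", "2"])
for x, y in points:
    print(x, round(y, 2))
```

The resulting (x, y) pairs can be passed to any scatter-plot function, with points coloured by chromosome.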


How to call SNPs and indels

Reference: http://blog.sina.com.cn/s/blog_83f77c940102w2eb.html


How to filter SNPs

Source: http://blog.sina.com.cn/s/blog_83f77c940102w2eg.html

  1. Missing rates
    GENO > 0.05

Shortly we will apply more stringent criteria, such that GENO > 0.05. In this case, 0.05*89 = 4.45 samples, meaning that if a SNP is missing in 4.45 or more samples, that SNP will be removed from the dataset.

Here 89 is the total number of samples, and 89 × GENO gives a threshold of 4.45. A called SNP that is missing in 4 or fewer samples is therefore kept, while one that is missing in 5 or more samples is removed.
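The arithmetic above can be checked directly. This is a minimal sketch of the missing-rate rule with the 89 samples and the 0.05 cutoff from the text; the function name `passes_geno_filter` is hypothetical.

```python
def passes_geno_filter(n_missing: int, n_samples: int, geno: float = 0.05) -> bool:
    """Keep a SNP only if its missing-call rate does not exceed `geno`."""
    return n_missing / n_samples <= geno


n_samples = 89
threshold = 0.05 * n_samples   # number of samples corresponding to the cutoff

# Which missing counts (0..7 samples) survive the filter?
kept = [m for m in range(0, 8) if passes_geno_filter(m, n_samples)]
print(threshold, kept)
```

Because the number of missing samples is an integer, a cutoff of 4.45 samples is the same as "at most 4 samples missing".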

  2. Minor allele frequencies
    Tip: MAF < 0.03; if there are many SNPs, MAF < 0.05 can be used instead

MAF is the Minor Allele Frequency. It can be used to exclude SNPs which are not informative because they show little variation in the sample set being analyzed. For instance, if a SNP shows variation in only 1 of the 89 individuals, it is not useful statistically and should be removed.

In other words, if the minor allele of a SNP is seen only a few times (minor allele count below MAF × 2 × total number of samples, since each diploid sample carries two alleles), the SNP should be removed.
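To make the MAF rule concrete, here is a small sketch that computes the minor allele frequency from per-sample genotypes coded as the number of alternate alleles (0, 1, or 2), assuming diploid samples. The function name `minor_allele_frequency` and the genotype coding are assumptions for illustration.

```python
from typing import Optional, Sequence


def minor_allele_frequency(dosages: Sequence[Optional[int]]) -> float:
    """MAF from per-sample alternate-allele counts (0, 1, 2; None = missing)."""
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))   # two alleles per diploid sample
    return min(alt_freq, 1 - alt_freq)           # the rarer allele defines the MAF


# One heterozygous carrier among 89 samples, as in the example above.
dosages = [1] + [0] * 88
maf = minor_allele_frequency(dosages)
keep = maf >= 0.03    # SNPs with MAF below the cutoff are removed
print(maf, keep)
```

With a single heterozygote among 89 samples the minor allele is seen once in 178 alleles, far below either the 0.03 or the 0.05 cutoff.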

  3. Removing SNPs out of Hardy-Weinberg equilibrium (HWE; p-value below a cutoff of 10^-6 to 10^-4)

Population genetic theory suggests that under ‘normal’ conditions, there is a predictable relationship between allele frequencies and genotype frequencies. In cases where the genotype distribution is different from what one would expect based on the allele frequencies, one potential explanation for this is genotyping error. Natural selection is another explanation. For this reason, we typically check for deviation from Hardy-Weinberg equilibrium in the controls for a case-control study. For a quantitative trait, PLINK just uses everyone. PLINK's --hardy option generates p-values for deviation from HWE for each SNP. Low p-values indicate that a SNP is out of HWE.
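The relationship between allele and genotype frequencies can be illustrated with a short calculation. PLINK reports an exact test for HWE; the sketch below swaps in the simpler chi-square goodness-of-fit approximation (1 degree of freedom) purely for illustration, and the function name `hwe_chi2_pvalue` is hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)     # frequency of allele A
    q = 1 - p                           # frequency of allele B
    if p == 0 or q == 0:
        return 1.0                      # monomorphic site: nothing to test
    # Expected genotype counts under HWE: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


in_equilibrium = hwe_chi2_pvalue(25, 50, 25)   # matches p^2 : 2pq : q^2 exactly
all_hets = hwe_chi2_pvalue(0, 100, 0)          # far too many heterozygotes
print(in_equilibrium, all_hets)
```

A SNP like the second example, with a p-value far below 10^-4, would be dropped by the HWE filter, while the first would be kept.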

  4. Filtering SNPs starting from a VCF file
    Use vcftools to convert the VCF into plink input format (this writes .ped and .map files), then use those files as input for filtering
vcftools --vcf my.vcf --plink --out plink

plink --noweb --file plink --geno 0.05 --maf 0.05 --hwe 0.0001 --make-bed --out QC
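If the thresholds need to be varied, the two commands above can be assembled programmatically. This is only a sketch that builds the same argument lists with the flags shown above; the function name `build_qc_commands` is hypothetical, and actually running the commands assumes vcftools and plink are installed and on the PATH.

```python
import shlex
from typing import List


def build_qc_commands(vcf: str, prefix: str = "plink", out: str = "QC",
                      geno: float = 0.05, maf: float = 0.05,
                      hwe: float = 0.0001) -> List[List[str]]:
    """Return the vcftools conversion and plink QC commands as argv lists."""
    convert = ["vcftools", "--vcf", vcf, "--plink", "--out", prefix]
    qc = ["plink", "--noweb", "--file", prefix,
          "--geno", str(geno), "--maf", str(maf), "--hwe", str(hwe),
          "--make-bed", "--out", out]
    return [convert, qc]


commands = build_qc_commands("my.vcf")
for argv in commands:
    print(shlex.join(argv))   # show each command as a shell line
```

Each list can be handed to `subprocess.run(argv, check=True)` to execute the steps in order.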

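If you still don't know what GWAS is, or what a SNP is, here are the definitions:

Source: http://www.biotrainee.com:8080/thread-1487-1-1.html
Genome-wide association studies (GWAS) examine sequence variants across the whole human genome, namely single nucleotide polymorphisms (SNPs), and screen them for SNPs associated with disease.

  • Which diseases are related to SNPs?
    In recent years, genome-wide association studies (GWAS), using large cohorts and high-density SNP (single nucleotide polymorphism) markers, have mapped thousands of SNP loci associated with complex diseases, and these association signals have proven highly reproducible across studies. Examples include common human diseases such as obesity, diabetes, and schizophrenia.
  • What are the sources of error for SNPs?
    Because of sampling error from random sampling (unavoidable in practice) and the complex linkage disequilibrium (LD) among SNPs, the SNP loci mapped by GWAS are usually not the causal variants themselves.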
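Linkage disequilibrium is commonly summarized as r², the squared correlation between two SNPs. A marker in high LD with a causal variant carries almost the same genotypes, which is why GWAS can flag the marker rather than the causal site. The sketch below uses the squared Pearson correlation of genotype dosages (0/1/2), a common approximation to haplotype-based r²; the function name `ld_r2` and the genotype vectors are made up for illustration.

```python
from typing import Sequence


def ld_r2(a: Sequence[int], b: Sequence[int]) -> float:
    """Squared Pearson correlation between two SNPs' genotype dosages."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0                      # a monomorphic SNP carries no LD signal
    return cov * cov / (var_a * var_b)


causal = [0, 0, 2, 2]       # hypothetical causal variant
tag = [0, 0, 2, 2]          # marker that always travels with it
unlinked = [0, 2, 0, 2]     # marker that varies independently of it

print(ld_r2(causal, tag), ld_r2(causal, unlinked))
```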

An article published in PLOS ONE in 2016 covers SNPs and osteoarthritis.
It is not a top-tier journal, but the article itself is of good quality.

Functional Characterization of the Osteoarthritis Susceptibility Mapping to CHST11—A Bioinformatics and Molecular Study

As the title shows, this is a study of osteoarthritis, and the target gene is CHST11. Carbohydrate sulfotransferase 11 is an enzyme that in humans is encoded by the CHST11 gene. The gene is located at chr12: 104,455,295-104,762,014 (GRCh38). As for functional studies of CHST11, the Sanger Institute in Cambridge, UK has generated a knockout mouse for this gene, Chst11^tm1a(KOMP)Wtsi. The gene is mainly related to bone and cartilage phenotypes. An abnormality was found in the mouse phenotyping: Homozygous viability at P14

A 2012 article in The Lancet also linked variants in this gene to osteoarthritis; that journal's standing needs no introduction.

Identification of new susceptibility loci for osteoarthritis (arcOGEN): a genome-wide association

Next, let's look at these two articles in turn, along with this gene, its SNPs, and the research and discussion on its functional analysis.

(I) Background on osteoarthritis:

What is OA?

(1) Osteoarthritis (OA) is a common disease of older individuals that is characterized by the focal (localized to lesion sites) loss of articular cartilage. This loss usually occurs gradually over many years and typically results in chronic pain and severely impaired joint function by the sixth or seventh decade of life.

(2)Osteoarthritis is the most common form of arthritis worldwide and is a major cause of pain and disability in elderly people.

What are the genetic characteristics of OA?

(1) OA is polygenic and unlike many other common arthritic diseases, there are no OA risk-conferring loci of large singular impact.
(2)It is a complex disease of the musculoskeletal system with both genetic and environmental risk factors. From the results of heritability studies in twins, sibling pairs, and families, genetic factors are estimated to account for about 50% of the risk of developing osteoarthritis in the hip or knee, although precise estimates vary according to sex, affected site, and severity of disease.

(II) Methods:

(1) Focus on functional analysis

  • Identification of SNPs in LD with rs835487
  • Identification of Sequences Homologous to the Enhancer in Non-Human Mammals
  • Cloning of pGL3-Promoter Luciferase Reporter Plasmids
  • Transfection of Cell Lines
  • Electrophoretic Mobility Shift Assays (EMSAs)
  • Ethics Statement, Cartilage Collection and Nucleic Acid Extraction
  • Gene Expression, Genotyping and AEI Analysis
  • Chondrogenic Differentiation of MSCs

(2) Focus on statistical analysis (GWAS)

  • We undertook a large genome-wide association study (GWAS) in 7,410 unrelated and retrospectively and prospectively selected patients with severe osteoarthritis in the arcOGEN study, 80% of whom had undergone total joint replacement, and 11,009 unrelated controls from the UK. We replicated the most promising signals in an independent set of up to 7,473 cases and 42,938 controls, from studies in Iceland, Estonia, the Netherlands, and the UK. All patients and controls were of European descent.
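At the level of a single SNP, a case-control GWAS like the one described above compares allele counts between cases and controls. A minimal sketch of such an allelic test is shown below, using a 2x2 chi-square test (1 degree of freedom) and an odds ratio. This is a generic textbook test, not necessarily the model used in the arcOGEN analysis; the function name `allelic_association` and all counts are hypothetical.

```python
import math
from typing import Tuple


def allelic_association(case_alt: int, case_ref: int,
                        ctrl_alt: int, ctrl_ref: int) -> Tuple[float, float, float]:
    """Return (odds_ratio, chi2, p_value) for a 2x2 table of allele counts."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Made-up allele counts: risk allele enriched in cases vs. no difference.
associated = allelic_association(300, 700, 200, 800)
null = allelic_association(100, 100, 100, 100)
print(associated)
print(null)
```

In a real GWAS this test is repeated for every SNP, so the p-value cutoff has to account for the very large number of tests.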

(III) Conclusions

(1)rs835487 (allele G; THR) located within intron two of CHST11 is associated with hip OA
