HRD score = LOH + TAI + LST
Reference: Sztupinszki et al., Migrating the SNP array-based homologous recombination deficiency measures to next generation sequencing data of breast cancer, npj Breast Cancer, https://www.nature.com/articles/s41523-018-0066-6.
R package: scarHRD
https://github.com/sztup/scarHRD#introduction
Step 1 is the most critical: obtaining the input file.
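As a rough sketch of what that input file could look like when it does not come from Sequenza: as far as I recall from the scarHRD README, the package also accepts a tab-delimited allele-specific segmentation table. The column names below are my recollection of that README and should be checked against it; all segment values are hypothetical toy data.

```r
# Hypothetical allele-specific segments (toy values, not real data).
segs <- data.frame(
  SampleID       = "sample1",
  Chromosome     = c("chr1", "chr1", "chr2"),
  Start_position = c(1, 50000001, 1),
  End_position   = c(50000000, 120000000, 90000000),
  A_cn           = c(1, 2, 1),
  B_cn           = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Total copy number is the sum of the two allele-specific copy numbers.
segs$total_cn <- segs$A_cn + segs$B_cn

# Sample ploidy approximated as the length-weighted mean total copy number.
seg_len <- segs$End_position - segs$Start_position + 1
segs$ploidy <- round(weighted.mean(segs$total_cn, seg_len), 2)

# Column order assumed from the scarHRD README (verify before use).
segs <- segs[, c("SampleID", "Chromosome", "Start_position", "End_position",
                 "total_cn", "A_cn", "B_cn", "ploidy")]

# Write a tab-delimited file that could then be handed to scarHRD.
input_file <- tempfile(fileext = ".txt")
write.table(segs, input_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

If this layout is right, the file would then be passed to scarHRD's `scar_score()` with `seqz = FALSE`; I have not verified that call here.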
1. Trying Sequenza
According to the Sequenza user guide, BAM files are required, and those are fairly hard to obtain. It also requires Python, which I don't know.
Pages for reference:
- Sequenza User Guide
https://rdrr.io/cran/sequenza/f/vignettes/sequenza.Rmd
- TCGA RNAseq BAM File
http://seqanswers.com/forums/showthread.php?t=65176
- TCGA_bam_splicer
https://freesoft.dev/program/131953985
- BAM file format
https://blog.csdn.net/qq_36608036/article/details/104630366
2. Trying ASCAT
Reference: ASCAT (Van Loo et al. 2010)
https://github.com/VanLoo-lab/ascat
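First, run the ExampleData that ships with the package.
library(ASCAT)
# Load tumor and germline LogR/BAF tracks from the example files
ascat.bc = ascat.loadData("Tumor_LogR.txt","Tumor_BAF.txt","Germline_LogR.txt","Germline_BAF.txt")
# Plot the raw (unsegmented) LogR and BAF data
ascat.plotRawData(ascat.bc)
# Segment LogR and BAF jointly (ASPCF), then plot the segmented data
ascat.bc = ascat.aspcf(ascat.bc)
ascat.plotSegmentedData(ascat.bc)
# Fit ploidy and aberrant cell fraction, and derive allele-specific copy numbers
ascat.output = ascat.runAscat(ascat.bc)
# Copy number of each allele per probe
ascat.output$nA
ascat.output$nB
# Estimated tumor ploidy and fraction of aberrant cells per sample
ascat.output$ploidy
ascat.output$aberrantcellfraction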
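Goal: produce the data behind the figure below.
Unfortunately, the README on GitHub is not very detailed and manual.pdf is missing, so the only option is to read the original paper, ASCAT (Van Loo et al. 2010), to work out what the parameters mean.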
ASCAT profiles: genome-wide allele-specific copy number profiles
Left panel: ASCAT first determines the ploidy of the tumor cells and the fraction of aberrant cells. It evaluates the goodness of fit for a grid of possible values for both parameters (blue = good solution) and selects the best solution, marked by the green cross. For example, on the left of panel A the green cross corresponds to ploidy = 1.77 and fraction of aberrant cells = 80%.
Upper right panel: the x axis shows genomic location and the y axis shows copy number (green is the allele with the lowest copy number, red is the allele with the highest copy number).
Lower right panel: an aberration reliability score.
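To make the grid search over ploidy and aberrant cell fraction more concrete, here is a simplified sketch. It uses the LogR/BAF model from the ASCAT paper, inverts it to get copy numbers for each candidate pair, and scores the pair by how far those copy numbers are from whole numbers. The toy segments, the grid ranges, the constant gamma = 0.55 and the unweighted distance are my own assumptions; this is not ASCAT's actual implementation.

```r
gamma <- 0.55  # assumed platform constant (not taken from these notes)

# Hypothetical truth: allele-specific copy numbers of 8 equal-length segments
nA_true  <- c(1, 2, 1, 2, 1, 1, 2, 1)
nB_true  <- c(1, 1, 0, 2, 1, 1, 0, 1)
rho_true <- 0.8    # fraction of aberrant cells
psi_true <- 2.25   # tumor ploidy (mean of nA_true + nB_true)

# Simulate noise-free LogR and BAF from the toy truth
n_true <- nA_true + nB_true
logr <- gamma * log2((2 * (1 - rho_true) + rho_true * n_true) /
                     (2 * (1 - rho_true) + rho_true * psi_true))
baf  <- (1 - rho_true + rho_true * nB_true) /
        (2 - 2 * rho_true + rho_true * n_true)

# Goodness of fit for one (rho, psi) pair: squared distance of the
# implied copy numbers to the nearest integers (smaller is better).
fit_distance <- function(rho, psi) {
  scale <- 2^(logr / gamma) * (2 * (1 - rho) + rho * psi)
  nA <- (rho - 1 + (1 - baf) * scale) / rho
  nB <- (rho - 1 + baf * scale) / rho
  # Negative copy numbers are not biologically possible: reject the pair
  if (any(nA < -0.5 | nB < -0.5)) return(Inf)
  sum((nA - round(nA))^2 + (nB - round(nB))^2)
}

# Evaluate every combination on the grid and pick the best one
grid <- expand.grid(rho = seq(0.2, 1, by = 0.05),
                    psi = seq(1.5, 4, by = 0.05))
grid$distance <- mapply(fit_distance, grid$rho, grid$psi)
best <- grid[which.min(grid$distance), ]
print(best)
```

On this noise-free toy data the best grid point recovers the simulated ploidy and aberrant cell fraction. The check for negative copy numbers matters: without it, other grid points can also yield whole-number copy numbers.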
- What is the fit?
(A) Frequency of LOH across the genome. Probes are shown in genomic order along the x axis, from chromosome 1 to chromosome X, where different chromosomes are delimited by gray lines.
(B) Frequency of copy number neutral events across the genome. For diploid tumors, copy number-neutral events correspond to a subset of LOH (copy number-neutral LOH), but for, for example, tetraploid tumors, a copy number neutral event can also be three copies of A and one copy of B.
- What is LOH?
- What is a copy number neutral event?
LOH: The loss of heterozygosity (LOH) score was defined as the number of chromosomal LOH regions shorter than a whole chromosome and longer than 15 Mb.
Copy number neutral event: the copy number is normal, but there is allelic bias.
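The two definitions above can be sketched on a small segment table. The segments and chromosome lengths below are hypothetical toy values, and the copy number neutral flag assumes a diploid baseline (total copy number 2); this illustrates the definitions only and is not the scarHRD code.

```r
# Hypothetical segments with allele-specific copy numbers (toy data)
segs <- data.frame(
  chr   = c("1", "1", "2", "2", "3"),
  start = c(1, 60e6 + 1, 1, 30e6 + 1, 1),
  end   = c(60e6, 100e6, 30e6, 40e6, 80e6),
  nA    = c(1, 2, 1, 2, 2),
  nB    = c(1, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)
# Hypothetical chromosome lengths in bp (not real genome coordinates)
chr_len <- c("1" = 100e6, "2" = 40e6, "3" = 80e6)

seg_len <- segs$end - segs$start + 1

# LOH: one allele is lost completely while the other is still present
is_loh <- pmin(segs$nA, segs$nB) == 0 & pmax(segs$nA, segs$nB) > 0

# A segment that spans its entire chromosome is not counted
whole_chr <- seg_len >= chr_len[segs$chr]

# LOH score: LOH regions longer than 15 Mb but shorter than the chromosome
loh_count <- sum(is_loh & seg_len > 15e6 & !whole_chr)

# Copy number neutral event (diploid baseline assumed):
# total copy number is 2, but the two alleles are unbalanced
cn_neutral <- (segs$nA + segs$nB) == 2 & segs$nA != segs$nB

print(loh_count)
print(cn_neutral)
```

Here only the 40 Mb LOH segment on chromosome 1 is counted: the one on chromosome 2 is shorter than 15 Mb, and the one on chromosome 3 covers the whole chromosome.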
Illumina SNP arrays deliver two output tracks: Log R, a measure of total signal intensity, and B allele frequency (BAF), a measure of allelic contrast.
The Log R track is similar to the output given by common array-CGH platforms and quantifies the (total) copy number of each genomic locus.
The BAF track shows the relative presence of each of the two alternative nucleotides (called “A” and “B”) at each SNP locus profiled.
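To see how the two tracks relate to allele-specific copy numbers, the sketch below computes the Log R and BAF expected at a germline-heterozygous SNP for given copy numbers nA and nB, aberrant cell fraction rho and tumor ploidy psi, following the model in the ASCAT paper. The function name `expected_tracks` and the input values are hypothetical, and gamma = 0.55 is an assumed platform constant.

```r
# Expected LogR and BAF for given allele-specific copy numbers.
# rho: fraction of aberrant cells, psi: tumor ploidy,
# gamma: assumed platform-dependent scaling constant.
expected_tracks <- function(nA, nB, rho, psi, gamma = 0.55) {
  n_total <- nA + nB
  # LogR: signal of the tumor/normal mixture relative to the average signal
  logr <- gamma * log2((2 * (1 - rho) + rho * n_total) /
                       (2 * (1 - rho) + rho * psi))
  # BAF: share of B alleles in the mixture (normal cells contribute 1 of 2)
  baf <- (1 - rho + rho * nB) / (2 - 2 * rho + rho * n_total)
  data.frame(nA = nA, nB = nB, logr = logr, baf = baf)
}

# Hypothetical states: normal (1+1), loss with LOH (1+0), gain (2+1)
tracks <- expected_tracks(nA = c(1, 1, 2), nB = c(1, 0, 1),
                          rho = 0.8, psi = 2)
print(tracks)
```

The balanced state sits at Log R = 0 and BAF = 0.5, the loss pulls Log R below zero, and the gain pushes it above. Because 20% normal cells are mixed in, the BAF of the LOH state does not drop all the way to 0.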
- To get LRR and BAF, is there really no way around processing the CEL files?
-end-