May, week 4 literature reading: Concept and benchmarks for assessing narrow-sense validity of genetic risk score values
Concept and benchmarks for assessing narrow-sense validity of genetic risk score values
Abstract
-
Background: While a higher genetic risk score (GRS) has been statistically associated with increased disease risk (broad-sense validity), the concept and tools for assessing the validity of GRS values reported from tests (narrow-sense validity) are underdeveloped.
-
Methods: We propose two benchmarks for assessing the narrow-sense validity of GRS.
-
The baseline benchmark requires that the mean GRS value in a general population approximates 1.0.
-
The calibration benchmark assesses the agreement between observed risks and estimated risks (GRS values).
-
We assessed benchmark performance for three prostate cancer (PCa) GRS tests, derived from three SNP panels with increasing stringency of selection criteria, in a PCa chemoprevention trial where 714 of 3225 men were diagnosed with PCa during the 4-year follow-up.
-
Results: GRS from Panels 1, 2, and 3 were all statistically associated with PCa risk: P = 5.58 × 10^-3, P = 1 × 10^-3, and P = 1.5 × 10^-13, respectively (broad-sense validity).
-
For narrow-sense validity, the mean GRS value among men without PCa was 1.33, 1.09, and 0.98 for Panels 1, 2, and 3, respectively (baseline benchmark).
-
For assessing the calibration benchmark, observed risks were calculated for seven groups of men with GRS values <0.3, 0.3-0.79, 0.8-1.19, 1.2-1.49, 1.5-1.99, 2-2.99, and ≥3.
-
The calibration slope (higher is better) was 0.15, 0.12, and 0.60, and the bias score (lower is better) between the observed risks and GRS values was 0.08, 0.08, and 0.02 for Panels 1, 2, and 3, respectively.
-
Conclusion: Performance differed considerably among GRS tests.
-
We recommend that all GRS tests be evaluated using the two benchmarks before clinical implementation for individual risk assessment.
INTRODUCTION
-
Genome-wide association studies (GWAS) have identified thousands of risk-associated SNPs for many common diseases, including cancer.
-
Individually, these SNPs have a moderate effect on disease risk, with odds ratios (OR) typically ranging from 1.1 to 1.5.
-
However, when more than one risk-associated SNP is inherited, they have a cumulative and clinically significant effect on disease risk.
-
Polygenic risk score (PRS) is a method to measure the cumulative effect of multiple risk-associated SNPs.
-
PRS can be calculated in various ways, including a direct risk allele count, an OR-weighted risk allele count, or the latter approach with population standardization, typically termed genetic risk score (GRS).
-
The mean score from the first two methods varies with the number of risk-associated SNPs used in the calculation.
-
In contrast, because GRS is a product over all SNPs and each SNP is standardized against the general population, its expected mean in the general population will always be 1.0 regardless of the number of SNPs used in the calculation.
-
Furthermore, GRS values can be simply interpreted as a risk relative to the general population.
-
These two important features of GRS make it easy to implement for individual risk assessment.
(Note. GRS: an OR-weighted risk allele count with population standardization. Because it is a product over all SNPs, each standardized against the general population, its expected mean in the general population is always 1.0 regardless of the number of SNPs, and its values can be read directly as risk relative to the general population. These two features make GRS easy to use for individual risk assessment.)
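The expected mean of 1.0 follows from how each SNP term is normalized. The derivation below is a sketch that assumes Hardy-Weinberg proportions, independent SNPs, and the commonly used multiplicative form of the score. The symbols g_i (risk allele count, 0 to 2), f_i (risk allele frequency), OR_i, and W_i are notation introduced here for illustration, not taken from the paper.

```latex
\mathrm{GRS} = \prod_{i=1}^{n} \frac{\mathrm{OR}_i^{\,g_i}}{W_i},
\qquad
W_i = \mathbb{E}\!\left[\mathrm{OR}_i^{\,g_i}\right]
    = f_i^{2}\,\mathrm{OR}_i^{2} + 2 f_i (1 - f_i)\,\mathrm{OR}_i + (1 - f_i)^{2}

\mathbb{E}[\mathrm{GRS}]
  = \prod_{i=1}^{n} \frac{\mathbb{E}\!\left[\mathrm{OR}_i^{\,g_i}\right]}{W_i}
  = 1
```

Each factor has expectation 1 by construction, so under independence the product does too, whatever the number of SNPs n. If the allele frequencies or the independence assumption are off, the mean drifts away from 1.0.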
-
Higher GRS values have been consistently associated with an increased risk for many common diseases, including cancer and cardiovascular diseases.
-
In prostate cancer (PCa), for example, a significant dose-response effect between GRS percentiles (quartiles, quintiles, or deciles) and disease risk has been consistently observed in many study populations, including large case-control studies, retrospective analyses of prospective studies, prostate biopsy cohorts, and prospective studies.
-
These statistical associations provide an important basis for risk assessment, which we refer to as broad-sense validity.
-
Broad-sense validity is necessary but insufficient to warrant GRS testing for individual risk assessment.
(Note. Properties and shortcoming of GRS: broad-sense validity is necessary but insufficient to warrant GRS testing for individual risk assessment.)
-
To offer GRS testing at an individual level, the validity of the GRS values reported from tests (which we refer to as narrow-sense validity) must be established, for several reasons.
-
First, in individual testing, test subjects receive their GRS values, not GRS percentiles, which are determined on the basis of a study cohort.
-
Second, GRS values, not percentiles, are used directly to estimate an individual's relative and absolute disease risk, including lifetime risk.
-
Third, the validity of reported GRS values is uncertain because they can be affected by many factors in the test design, including which SNPs are used in calculating GRS, independence among SNPs, the assumption of their additive effect, and estimates of their OR and allele frequency.
(Note. Requirements a GRS must meet to be used for individual risk assessment.)
-
To date, the concept of narrow-sense validity has been under-appreciated and not widely pursued.
-
Furthermore, although methods for measuring calibration of prediction models are well developed, they have not been adopted for assessing the narrow-sense validity of GRS (or other PRSs).
-
Most existing calibration methods assess the agreement between observed risk and predicted probabilities (or risks) derived from regression models of GRS in study populations, not the absolute predicted risks (GRS values) per se.
-
Following the well-established framework for assessing the performance of a prediction model (ref. 14), we propose two benchmarks that specifically assess the performance of GRS values reported from tests.
(Note. Building on the established framework for assessing prediction-model performance, two benchmarks are proposed for the GRS values reported from tests; together they assess the narrow-sense validity of GRS.)
-
The first benchmark requires that the mean GRS in a general population approximates 1.0 (baseline benchmark).
-
This is a theoretical expectation on the basis of the GRS calculation and is a minimum requirement for GRS as a valid risk measurement tool.
-
The second benchmark assesses the agreement between observed risks and reported GRS values (calibration benchmark).
-
The performance of this benchmark can be assessed using a calibration plot (observed risk against predicted risk expressed as GRS values) and two measurements of calibration (correlation and agreement).
-
The correlation measurement can be estimated using a calibration slope, and the agreement measurement can be estimated using a bias score.
-
A higher calibration slope and a lower bias score between the observed risk and reported GRS values indicate better calibration.
(Note. Detailed description of the two benchmarks.)
-
As a demonstration, we assessed both the baseline and calibration benchmarks for three different PCa GRS tests, derived from three SNP panels, in an existing clinical trial population, REduction by DUtasteride of PCa Events (REDUCE).
-
We showed that some GRS tests (Panels 1 and 2) had poor calibration; the observed risks differed considerably from estimated risks.
-
If such GRS tests were used for risk assessment, the PCa risk in many men would be incorrectly estimated, which could result in inappropriate recommendations for the need, timing, and frequency of PCa screening.
(Note. Applying the benchmarks to existing GRS tests reveals the discrepancies.)
METHODS
Study subjects
-
The REDUCE trial was a 4-year, randomized, double-blind, placebo-controlled study evaluating the safety and efficacy of dutasteride for reducing PCa risk.
-
All participants had a negative prostate biopsy within 6 months of study enrollment and underwent protocol-required biopsies at years 2 and 4, with additional biopsies when clinically indicated.
-
Genotyping using the Illumina HumanOmni Express BeadChip was performed for Caucasian subjects who consented to genetic studies.
-
Imputation was performed using the 1000 Genomes Project.
-
The study was approved by the Wake Forest Institutional Review Board (000011435).
(Note. Description of the study subjects.)
SNP panels
-
We sought to compare the performance of the two analytical benchmarks for three PCa GRS panels, all on the basis of PCa risk-associated SNPs reported before July 1st, 2018 (refs. 4, 8, 17-36).
-
The first panel includes 115 PCa risk-associated SNPs listed in the GWAS catalog that are available in the REDUCE study.
-
The second panel, including 96 SNPs, is a subset of the first panel that met the GWAS significance level (P < 5 × 10^-8).
-
The third panel includes 110 SNPs that were curated from our evidence review of original papers and met the following criteria: (1) discovered in GWAS of Caucasian subjects, with at least 1000 cases and 1000 controls in the first stage;
-
(2) confirmed in additional stages with combined P < 5 × 10^-8; and (3) independent, with a linkage disequilibrium (LD) measure of r^2 < 0.2 between any pair of SNPs.
-
Among the 110 SNPs included in Panel 3, 69 SNPs overlap with Panel 1, of which 60 also overlap with Panel 2 (Figure 1).
GRS calculation
-
Population-standardized GRS was computed using allelic ORs obtained from external studies and allele frequencies from gnomAD (non-Finnish European, NFE, population).
-
Briefly, GRS was calculated by multiplying the per-allele OR for each SNP and normalizing the risk by the average risk expected in the population (W).
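The calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the commonly used multiplicative form in which each SNP contributes OR raised to the risk allele count, divided by W computed under Hardy-Weinberg proportions. The function names and the three-SNP panel values are hypothetical.

```python
from math import prod


def snp_population_mean_risk(odds_ratio: float, risk_allele_freq: float) -> float:
    """Average relative risk W for one SNP, assuming Hardy-Weinberg proportions."""
    f = risk_allele_freq
    return (
        f * f * odds_ratio ** 2           # two risk alleles
        + 2 * f * (1 - f) * odds_ratio    # one risk allele
        + (1 - f) ** 2                    # no risk allele
    )


def genetic_risk_score(genotypes, odds_ratios, risk_allele_freqs) -> float:
    """Population-standardized GRS: product over SNPs of OR**g / W."""
    return prod(
        (o ** g) / snp_population_mean_risk(o, f)
        for g, o, f in zip(genotypes, odds_ratios, risk_allele_freqs)
    )


# Hypothetical three-SNP panel (illustrative values, not from the paper).
ors = [1.20, 1.10, 1.35]
freqs = [0.30, 0.50, 0.10]

print(genetic_risk_score([0, 0, 0], ors, freqs))  # no risk alleles: below 1
print(genetic_risk_score([2, 1, 1], ors, freqs))  # several risk alleles: above 1
```

Because each factor is divided by its own population average, averaging the score over all genotype combinations weighted by their Hardy-Weinberg probabilities gives 1.0, which is the property the baseline benchmark checks.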
Assessment of baseline and calibration benchmarks
-
The baseline benchmark requires that the mean GRS in a general population approximates 1.0; in this study it was calculated among men without a PCa diagnosis.
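As a rough sketch of the baseline check, the snippet below computes the mean GRS with a normal-approximation 95% confidence interval and reports whether the interval covers 1.0. The interval method is an assumption (the notes do not say how the CI was obtained), and the GRS values are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def baseline_benchmark(grs_values, z=1.96):
    """Return the mean GRS and a normal-approximation confidence interval."""
    m = mean(grs_values)
    half_width = z * stdev(grs_values) / sqrt(len(grs_values))
    return m, (m - half_width, m + half_width)


# Hypothetical GRS values for men without a PCa diagnosis.
grs_unaffected = [0.42, 0.77, 0.95, 1.01, 1.10, 1.24, 0.66, 1.58, 0.89, 1.31]

m, (lo, hi) = baseline_benchmark(grs_unaffected)
print(f"mean GRS = {m:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
print("CI covers 1.0:", lo <= 1.0 <= hi)
```

GRS distributions are typically right-skewed, so a bootstrap interval may be a better choice for small samples.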
-
The calibration benchmark assesses the agreement between observed risks and GRS values and was assessed using a calibration plot, a calibration slope, and a bias score between observed risk and GRS values.
-
For the calibration plot, subjects were grouped into seven bins on the basis of their GRS values (<0.3, 0.3-0.79, 0.8-1.19, 1.2-1.49, 1.5-1.99, 2-2.99, and ≥3).
-
These bins of GRS values were chosen on the basis of three considerations: representation of a broad spectrum of GRS values, the practical meaning of GRS values (for example, GRS values of 0.8-1.19 as average risk), and possible cutoffs for defining risk categories (for example, GRS values ≥3 for high risk).
-
Similar results were obtained using other subgroups.
-
The observed risk in each group was its OR for PCa compared with the entire cohort, and it was plotted against the median GRS value of that group.
-
The calibration slope was the regression coefficient estimated from the seven data points.
-
The bias score was the average of the absolute difference between observed risks and GRS values at the seven data points.
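The binning, observed-risk, slope, and bias-score steps can be sketched as follows. This follows the definitions as stated in the text and makes several assumptions that the notes do not settle: the slope is an unweighted least-squares fit with an intercept, bins without both cases and non-cases are skipped because their OR is undefined, and all function names and data are hypothetical.

```python
from bisect import bisect_right
from statistics import median

# Lower edges of bins 2..7: <0.3, 0.3-0.79, 0.8-1.19, 1.2-1.49, 1.5-1.99, 2-2.99, >=3
BIN_EDGES = [0.3, 0.8, 1.2, 1.5, 2.0, 3.0]


def odds(cases, total):
    return cases / (total - cases)


def calibration_points(grs, is_case):
    """Return (median GRS, observed OR vs. the whole cohort) for each usable bin."""
    bins = {}
    for score, case in zip(grs, is_case):
        bins.setdefault(bisect_right(BIN_EDGES, score), []).append((score, case))
    cohort_odds = odds(sum(is_case), len(is_case))
    points = []
    for key in sorted(bins):
        scores = [s for s, _ in bins[key]]
        cases = sum(c for _, c in bins[key])
        if 0 < cases < len(scores):  # OR is undefined without cases and non-cases
            points.append((median(scores), odds(cases, len(scores)) / cohort_odds))
    return points


def calibration_slope(points):
    """Least-squares slope of observed risk (y) on median GRS (x)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    return sxy / sxx


def bias_score(points):
    """Mean absolute difference between observed risk and GRS."""
    return sum(abs(y - x) for x, y in points) / len(points)


# Toy cohort: two men per bin, one case in each (far too small for real estimates).
grs = [0.2, 0.25, 0.5, 0.6, 1.0, 1.1, 1.3, 1.4, 1.6, 1.9, 2.2, 2.8, 3.2, 3.4]
is_case = [0, 1] * 7

points = calibration_points(grs, is_case)
print(len(points), "bins used")
print("slope:", calibration_slope(points))
print("bias score:", bias_score(points))
```

In this toy cohort every bin has the same observed risk, so the slope is flat even though the GRS values span a wide range, which is the kind of miscalibration the benchmark is meant to expose.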
RESULTS
-
Among the 3225 Caucasian subjects included in this study, 714 (22.1%) men were diagnosed with PCa during the 4-year follow-up.
-
The GRS values calculated from each of the three SNP panels were significantly associated with PCa risk: P = 5.58 × 10^-3, P = 1 × 10^-3, and P = 1.5 × 10^-13 for Panels 1, 2, and 3, respectively.
-
The associations remained significant after adjusting for age, baseline serum PSA, and family history of PCa: P = 9.2 × 10^-3, P = 1.7 × 10^-3, and P = 5.48 × 10^-14 for Panels 1, 2, and 3, respectively.
-
The performance of the baseline benchmark is depicted in Figure 2.
-
The mean GRS value (95% confidence interval [CI]) was 1.33 (1.12-1.55), 1.09 (0.98-1.20), and 0.98 (0.95-1.01) for Panels 1, 2, and 3, respectively.
-
Panel 3 had the best performance for this benchmark, with the mean GRS closest to 1.0.
-
The performance of the calibration benchmark is depicted in Figure 3, where the observed risks are plotted for seven groups of men with GRS values of <0.3, 0.3-0.79, 0.8-1.19, 1.2-1.49, 1.5-1.99, 2-2.99, and ≥3 (Table 1).
-
The agreement was poor for GRS derived from Panels 1 (Figure 3A) and 2 (Figure 3B), and markedly improved for Panel 3 (Figure 3C).
-
The calibration slope was 0.15, 0.12, and 0.60 for Panels 1, 2, and 3, respectively.
-
The total bias score between the observed risks and GRS values was 0.08, 0.08, and 0.02 for Panels 1, 2, and 3, respectively.
DISCUSSION
-
In this study, we propose a novel concept of narrow-sense validity of GRS values reported from tests, which differs from the broad-sense validity of GRS concerning its overall statistical association with disease risk.
-
We also propose two benchmarks to objectively assess the validity of reported GRS values.
-
As a demonstration, we compared the benchmark performance of three GRS tests (three different panels of PCa risk-associated SNPs reported from GWAS) in a PCa chemoprevention trial.
-
We demonstrated that although all three SNP panels met the broad-sense validity (statistically significant associations), these SNP panels differ considerably in the benchmark performance of reported GRS values.
-
Only the SNP panel that was based on a rigorous evidence-based review (Panel 3) performed well for both the baseline and calibration benchmarks.
-
A fundamental feature of narrow-sense validity is its emphasis on the validity of GRS values reported from tests, rather than an overall statistical association of GRS in study populations (broad-sense validity).
-
The validity of reported GRS values is essential for genetic testing because they are used directly for estimating individuals' disease risk, such as lifetime risk.
-
For example, if a GRS value of 2.1 was reported to a 61-year-old Caucasian man, we would interpret this as a 2.1-fold increased risk for PCa compared with the general population and a 31.1% remaining lifetime risk, on the basis of his GRS value, current age, and race.
-
The narrow-sense validity proposed in the study addresses a practical and critical question in individual risk assessment, i.e., how do we know the scores we report to patients are reliable and valid?
-
This question is difficult to address directly for individual test subjects because GRS is a likelihood measurement of a future event.
-
However, it can be addressed indirectly using the calibration benchmark in existing study cohorts where disease status is known.
-
For example, we would have better confidence in the reported GRS values if groups of subjects with GRS of 2-2.99 in an existing study cohort demonstrated similar observed risks (observed OR between 2 and 2.99).
-
Our calibration benchmark differs from other commonly used calibration methodologies in several important aspects.
-
First, most calibration methods rely on the agreement between an observed probability (Y-axis) and a predicted probability or risk (X-axis), or between an observed OR (Y-axis) and a predicted OR (X-axis).
-
Both the predicted probability and the predicted OR in these calibration methods are not the original GRS values reported to test subjects but are derived from a regression model of GRS fitted to all subjects in a study population.
-
In contrast, the X-axis of our calibration method is the reported GRS values of test subjects and does not rely on any regression model in a cohort.
-
Second, most calibration plots typically stratify subjects into deciles of predicted probability/OR derived from regression models.
-
These deciles are relative to other subjects in a cohort and are less meaningful because they are not directly used to estimate individual risks of test subjects.
-
In contrast, the binning of subjects in our calibration benchmark is on the basis of reported GRS values that are directly used to estimate individual risks of test subjects, and is therefore practically meaningful.
-
Finally, most calibration methods rely on the calibration slope (correlation) alone (refs. 38, 39).
-
In comparison, we use both the calibration slope and the bias score to assess calibration.
-
Correlation and agreement are two important but different measurements of calibration, and the latter provides additional critical information that is not captured by correlation alone.
-
A perfect calibration slope (β = 1.0) does not necessarily imply a good bias score.
-
For example, a large but symmetric difference between observed and expected risks can have a good calibration slope but a poor bias score.
-
However, a perfect bias score of 0 always implies a perfect calibration slope.
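A small numeric example makes the asymmetry between the two measures concrete. The data are invented: the observed risks deviate from the GRS values by up to 0.5 in a pattern that is symmetric around the center, so a least-squares fit (assumed here to be unweighted, with an intercept) still returns a slope of 1.

```python
def calibration_slope(points):
    """Least-squares slope of observed risk (y) on GRS (x)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    return sxy / sxx


def bias_score(points):
    """Mean absolute difference between observed risk and GRS."""
    return sum(abs(y - x) for x, y in points) / len(points)


# Hypothetical (GRS, observed risk) pairs with symmetric deviations.
symmetric = [(0.5, 1.0), (1.0, 0.5), (1.5, 1.5), (2.0, 1.5), (2.5, 3.0)]
# Perfect agreement: observed risk equals GRS in every group.
perfect = [(x, x) for x, _ in symmetric]

print(f"symmetric: slope = {calibration_slope(symmetric):.2f}, bias = {bias_score(symmetric):.2f}")
print(f"perfect:   slope = {calibration_slope(perfect):.2f}, bias = {bias_score(perfect):.2f}")
# symmetric: slope = 1.00, bias = 0.40
# perfect:   slope = 1.00, bias = 0.00
```

The reverse direction holds by construction: a bias score of 0 means every observed risk equals its GRS value, so the points lie exactly on the identity line.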
-
Noticeably, the calibration slope was considerably smaller than 1 in all three SNP panels (β = 0.15, 0.12, and 0.60 for Panels 1, 2, and 3, respectively), revealing overestimated risk for subjects with GRS values >1.
-
Even in the best test (Panel 3), for example, the observed risk was only 1.6 for subjects with GRS values between 2 and 2.99 (median of 2.3).
-
The smaller slope is likely due, in part, to overestimated ORs of the individual SNPs from the external data ("winner's curse").
-
These results highlight the informative nature of the calibration benchmark and the need for further adjustment of GRS values.
-
One approach is to adjust the OR estimate of individual SNPs to reduce the effect of the "winner's curse".
-
For example, if we apply a 10% correction to the reported OR of each SNP in Panel 3, the calibration slope increases from 0.60 to 1.00 and the bias score decreases from 0.03 to ~0.00.
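The notes do not say how the 10% correction is applied. The sketch below assumes one plausible form, shrinking each OR toward 1.0 by 10% on the log scale, and shows how that pulls a high GRS value back toward 1. The shrinkage form, the function names, and the panel values are all assumptions for illustration, not the authors' procedure.

```python
from math import prod


def shrink_or(odds_ratio, correction=0.10):
    """Shrink an OR toward 1.0 on the log scale (assumed form of the correction)."""
    return odds_ratio ** (1.0 - correction)


def genetic_risk_score(genotypes, odds_ratios, freqs):
    """Population-standardized GRS, with W = (f * OR + 1 - f) ** 2 under Hardy-Weinberg."""
    return prod(
        o ** g / (f * o + 1 - f) ** 2
        for g, o, f in zip(genotypes, odds_ratios, freqs)
    )


# Hypothetical three-SNP panel and a carrier of two risk alleles at every SNP.
ors = [1.20, 1.10, 1.35]
freqs = [0.30, 0.50, 0.10]
genotypes = [2, 2, 2]

raw = genetic_risk_score(genotypes, ors, freqs)
adjusted = genetic_risk_score(genotypes, [shrink_or(o) for o in ors], freqs)
print(f"reported ORs: GRS = {raw:.2f}")
print(f"shrunken ORs: GRS = {adjusted:.2f}")
```

The population average W is recomputed from the shrunken ORs, so the adjusted score keeps an expected mean of 1.0 while its spread narrows.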
-
Another approach is to perform a regression analysis to systematically reduce the bias from all potential sources.
-
However, such adjustment may only be applicable to a specific study population and requires further validation in independent populations for broad applications.
-
Similar results were found when subjects were stratified into 5, 10, and 20 equally distributed groups on the basis of their GRS values.
-
For example, on the basis of the 10 equally distributed groups, the calibration slope was 0.13, 0.16, and 0.60, and the bias score was 0.08, 0.09, and 0.02 for Panels 1, 2, and 3, respectively.
-
Panel 3 performed the best for both calibration slope and bias score.
-
Several factors can influence the narrow-sense validity of GRS.
-
Poor performance on the baseline benchmark (a mean deviating considerably from 1.0) indicates inaccurate estimates of the allele frequencies used in the GRS calculation and/or LD between SNPs.
-
The calibration benchmark can be affected by multiple factors, including (1) whether all SNPs are truly risk-associated, (2) accuracy of estimates for OR and allele frequency, (3) assumption of the additive effect of risk-associated SNPs, and (4) independence of SNPs.
-
There are several limitations to this study.
-
First, benchmark performance was assessed in only a single study cohort.
-
The performance could be different in other study cohorts with different characteristics.
-
Because of the specific inclusion criteria of the REDUCE study (men who had an initial negative prostate biopsy), the present results may not be applicable to men in the general population.
-
Benchmark performance in multiple study cohorts that represent the general population is preferred.
-
Second, the relatively small sample size of the REDUCE cohort limits our ability to stratify subjects into groups with a narrower range of GRS values.
-
Ideally, observed risk should be estimated for subjects at a resolution of one-tenth of a GRS unit because GRS is typically reported at such resolution.
-
Finally, it is recognized that the current study is a retrospective analysis of a prospective study.
-
However, we feel it is a valid approach considering that many typical biases of retrospective studies are unlikely in this study.
-
Self-reporting bias is not applicable because GRS is an objective measurement and because of the prospective study design.
-
Observer bias is minimized because GRS is unknown to test subjects and investigators (practically blinded).
-
Furthermore, as a germline marker, GRS always precedes any phenotype and therefore avoids temporal ambiguity.
-
Nevertheless, GRS may be susceptible to other biases, such as competing risk bias and selective survival bias.
-
Risk assessment is essential for developing personalized prevention and intervention strategies for individuals.
-
The potential benefit of personalized strategies relies on the validity of GRS.
-
Misclassification of risk from an unreliable risk score, polygenic or otherwise, may lead to inappropriate and possibly harmful actions.
-
Results from this study demonstrate the feasibility and importance of the benchmarks for assessing the validity of GRS values reported from tests.
-
Not all GRS tests met the benchmarks; thus not all reported GRS values would be expected to perform reliably in the clinical setting.
-
For example, if Panel 1 was used for risk assessment, 4.2% of men in the REDUCE study would receive high GRS results (GRS = 2-2.99).
-
The observed risk in these men, however, was only 0.92.
-
These "low-risk" men could receive unnecessary recommendations for earlier and more frequent PCa screening.
-
We recommend that all GRS tests intended for clinical use, or already in use in the clinic, be evaluated using the two benchmarks before being implemented for individual risk assessment.
-
The concept of narrow-sense validity and the proposed benchmarks are highly relevant and timely considering that PRS are currently available from commercial providers and are being clinically evaluated by academia.
-
Furthermore, numerous consortia and academic institutions are contemplating clinical studies using these scores.
-
Having an objective assessment of the validity of reported GRS values will have a positive impact on the development and successful translation of PRS to the clinic in areas including cardiovascular disease, oncology, obesity, neurology, and diabetes.
-
Finally, it is important to note that statistical methods for assessing the benchmarks described in the study are in the early stages of development.
-
The concept for assessing narrow-sense validity and the proposed benchmarks originated from practical experience in translating GRS into the clinic.
-
They are meant to serve as a stepping-stone for initiating this important discussion.
-
We believe a multidisciplinary collaboration among researchers from risk modeling, epidemiology, and genomic translational research will further improve the methodology for assessing the narrow-sense validity of GRS.