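In mass spectrometry data acquired in profile mode, the signal of a given ion is usually distributed around the ion's true m/z value. The accuracy of this signal depends on the instrument's resolution and settings. Profile-mode data can be processed into centroid data, which keeps a single representative value per peak, usually the local maximum of the distribution of data points. Some algorithms require centroid-mode data, for example the centWave function used for chromatographic peak detection in the xcms package for LC-MS experiments, and the proteomics search engines that match MS2 spectra to peptides.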
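As a minimal sketch of what centroiding means, the base R snippet below simulates a profile-mode signal for one ion and keeps only its local maximum. The m/z value, peak width and intensity are made-up numbers for illustration, and this is not how MSConvert or MSnbase implement peak picking.

```r
# Simulated profile-mode signal for one ion (all values are hypothetical)
true_mz <- 118.0865
mz <- seq(118.05, 118.12, by = 0.0005)
intensity <- 1e5 * exp(-(mz - true_mz)^2 / (2 * 0.004^2))

# Simplest possible centroiding: keep only the local maximum of the distribution
apex <- which.max(intensity)
centroid <- c(mz = mz[apex], intensity = intensity[apex])
centroid
```

The many profile points collapse into one (m/z, intensity) pair, which is why centroided files are much smaller than profile files.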
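MSConvert can be used to convert data to centroid mode.
Different versions of MSConvert give slightly different conversion results.
However, conversion with MSConvert is often very slow and sometimes fails. Alternatively, the conversion can be done with the pickPeaks function of the MSnbase package, which performs peak picking on a single spectrum (a Spectrum instance) or on a whole experiment (an MSnExp instance) to create centroided spectra.
Centroiding the data also results in more MS2 spectra being detected.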
library(xcms)
library(MSnbase)   # provides readMSData, pickPeaks and smooth
library(magrittr)  # provides the %>% pipe
1 Load the data
data_raw <- readMSData("pos_20211-fa-51.mzML", mode = "onDisk")
Check whether the data are already in centroid mode
data_raw@featureData@data$centroided
data_raw@featureData@data$smoothed
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#[8] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
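2 Convert to centroid mode
This is done mainly with the pickPeaks function.
The refineMz argument has three options: "kNeighbors", "descendPeak" and "none" (the default).
kNeighbors refines the m/z as an intensity-weighted average of the apex and its neighbouring data points, so that it lies closer to the true m/z;
descendPeak defines the peak region by descending from the centroid (apex) on both sides until the measured signal increases again. Within that region, all measurements whose intensity is at least a given percentage of the centroid intensity are used to calculate the refined m/z.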
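The two refinement strategies can be sketched in base R as follows. The helper names refine_knn and refine_descend, the toy peak, and the default values k = 2 and signal_fraction = 0.33 are assumptions made for this illustration; the code only mirrors the description above and is not MSnbase's own implementation.

```r
# Toy profile peak (hypothetical values), with a heavier right shoulder
mz  <- c(118.080, 118.082, 118.084, 118.086, 118.088, 118.090, 118.092)
int <- c(50, 400, 2500, 9000, 8200, 1800, 120)
apex <- which.max(int)

# kNeighbors-style: intensity-weighted mean of the apex and k points per side
refine_knn <- function(mz, int, apex, k = 2L) {
  idx <- max(1L, apex - k):min(length(mz), apex + k)
  weighted.mean(mz[idx], int[idx])
}

# descendPeak-style: walk down both flanks until the signal rises again,
# then average only the points at or above a fraction of the apex intensity
refine_descend <- function(mz, int, apex, signal_fraction = 0.33) {
  left <- apex
  while (left > 1L && int[left - 1L] < int[left]) left <- left - 1L
  right <- apex
  while (right < length(int) && int[right + 1L] < int[right]) right <- right + 1L
  idx <- left:right
  keep <- idx[int[idx] >= signal_fraction * int[apex]]
  weighted.mean(mz[keep], int[keep])
}

c(none = mz[apex],
  kNeighbors = refine_knn(mz, int, apex),
  descendPeak = refine_descend(mz, int, apex))
```

With "none" the reported m/z is simply the grid point of the apex, whereas both refinements shift it toward the heavier shoulder of the peak.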
2.1 Compare three approaches
# (a) centroiding only
data_cent <- data_raw %>%
    pickPeaks(refineMz = "descendPeak")
# (b) smooth the profile data first, then centroid
data_sc <- data_raw %>%
    smooth(method = "SavitzkyGolay", halfWindowSize = 4L) %>%
    pickPeaks(refineMz = "descendPeak")
# (c) centroid first, then smooth
data_cs <- data_raw %>%
    pickPeaks(refineMz = "descendPeak") %>%
    smooth(method = "SavitzkyGolay", halfWindowSize = 4L)
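To give a feel for why smoothing before peak picking can matter, the base R sketch below adds noise to a simulated profile peak and counts local maxima before and after a 5-point moving average. The simulated signal, the noise level and the helper count_maxima are assumptions for illustration; a moving average is used here for simplicity in place of the Savitzky-Golay filter above, and the code does not call MSnbase.

```r
set.seed(1)  # reproducible noise

# Simulated noisy profile signal (hypothetical values)
mz <- seq(118.00, 118.20, length.out = 2001)
signal <- 1e4 * exp(-(mz - 118.0865)^2 / (2 * 0.004^2))
profile <- signal + rnorm(length(mz), sd = 200)

# Moving average with half window 2, i.e. a 5-point window
half_window <- 2L
kernel <- rep(1 / (2 * half_window + 1), 2 * half_window + 1)
smoothed <- as.numeric(stats::filter(profile, kernel, sides = 2))

# Count local maxima: points that are higher than both neighbours
count_maxima <- function(x) sum(diff(sign(diff(x))) == -2, na.rm = TRUE)

c(raw = count_maxima(profile), smoothed = count_maxima(smoothed))
```

Every local maximum is a candidate centroid, so fewer noise-driven maxima in the smoothed trace means fewer spurious centroids after peak picking.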
3 Extract the XIC
Set the retention time and m/z ranges
## urea
#rtr <- c(250, 290)
rtr <- c(570, 590)
#mzr <- c(60.5, 61.5)
mzr <- c(118.0, 118.2)
Plot
# raw profile data
data_raw %>%
    filterRt(rt = rtr) %>%
    filterMz(mz = mzr) %>%
    plot(type = "XIC")
# centroided
data_cent %>%
    filterRt(rt = rtr) %>%
    filterMz(mz = mzr) %>%
    plot(type = "XIC")
# smoothed, then centroided
data_sc %>%
    filterRt(rt = rtr) %>%
    filterMz(mz = mzr) %>%
    plot(type = "XIC")
# centroided, then smoothed
data_cs %>%
    filterRt(rt = rtr) %>%
    filterMz(mz = mzr) %>%
    plot(type = "XIC")
After centroiding, the background noise is greatly reduced.
4 Detect peaks (features)
cwp <- CentWaveParam(snthresh = 5, noise = 100, ppm = 14,
peakwidth = c(1, 30))
peak1 <- findChromPeaks(data_raw, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 13498 regions of interest ... OK: 4124 found.
peak2 <- findChromPeaks(data_cent, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 1996 regions of interest ... OK: 298 found.
peak3 <- findChromPeaks(data_sc, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 1412 regions of interest ... OK: 364 found.
peak4 <- findChromPeaks(data_cs, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 1828 regions of interest ... OK: 202 found.
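Centroiding greatly reduces the number of spurious peaks detected: from 4124 on the raw profile data to 298 with centroiding alone, 364 with smoothing followed by centroiding (about 91% fewer), and 202 with centroiding followed by smoothing.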
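The reduction can be checked directly from the peak counts printed by findChromPeaks above; the vector names are labels chosen here for illustration.

```r
# Chromatographic peak counts copied from the findChromPeaks output above
peaks <- c(raw = 4124, cent = 298, smooth_then_pick = 364, pick_then_smooth = 202)

# Percentage reduction relative to the raw profile data
reduction <- round(100 * (1 - peaks[-1] / peaks["raw"]), 1)
reduction
# cent = 92.8, smooth_then_pick = 91.2, pick_then_smooth = 95.1
```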
Compare the effect on an MS1 spectrum
par(mar=c(6,3,6,3))
plot(data_raw[[3737]],data_cent[[3737]])
Compare the effect on an MS2 spectrum
plot(data_raw[[3739]],data_cent[[3739]])
After conversion to centroid mode, spurious peaks are markedly reduced in both MS1 and MS2 spectra. Centroiding is in effect a way of slimming down the mass spectrometry data. The converted data can be used for downstream analysis or saved to disk.
writeMSData(data_cent, file = "dda_data.mzML")
If you do not know whether your data are in centroid mode, check featureData@data[["centroided"]].
Multiple samples
data_cent <- data_raw %>%
    pickPeaks(refineMz = "descendPeak")
or
data_cent <- data_raw %>%
    smooth(method = "MovingAverage", halfWindowSize = 2L) %>%
    pickPeaks(refineMz = "descendPeak")
References:
Bioconductor - MSnbase
MSnbase: centroiding of profile-mode MS data (bioconductor.org)
MSnbase: MS data processing, visualisation and quantification • MSnbase (lgatto.github.io)