簡介
??目前麦锯,單細胞技術(shù)發(fā)展迅速恕稠,單細胞轉(zhuǎn)錄組測序可以幫助我們獲得組織中每個細胞的基因表達水平,但卻無法獲得對應(yīng)空間位置上的信息扶欣《煳。空間轉(zhuǎn)錄組技術(shù)(spatial transcriptomics technologies)使我們能夠利用空間定位信息對組織位置進行基因表達譜分析千扶,從而表征空間組織結(jié)構(gòu),更好的幫助我們?nèi)ダ斫饨M織異質(zhì)性骆捧。然而澎羞,基于高通量測序的空轉(zhuǎn)技術(shù)還達不到單細胞轉(zhuǎn)錄組的分辨率,測得的每個孔(spot)可能包含10個以上的細胞敛苇,這會導(dǎo)致我們收集的特定位置的基因表達是同質(zhì)或者異質(zhì)細胞類型的混合狀態(tài)妆绞。因此,需要使用細胞類型去卷積(deconvolution)的分析工具去解析每個孔中的細胞類型組成枫攀,從而實現(xiàn)對細胞類型的空間定位±ㄈ模現(xiàn)在此類分析工具出來很多,像SPOTlight来涨、RCTD图焰、Cell2location、STdeconvolve蹦掐。也有文章對此類工具進行了測試比較技羔,感興趣的可以看看這篇文章:A comprehensive comparison on cell-type composition inference for spatial transcriptomics data,今天我們給大家介紹一款發(fā)表在Nature Biotechnology雜志上的軟件——CARD卧抗,它是由美國密歇根大學藤滥、生物統(tǒng)計系周翔課題組開發(fā)的,是一種對空間轉(zhuǎn)錄組數(shù)據(jù)實現(xiàn)細胞類型去卷積分析的方法社裆。
軟件介紹
??CARD設(shè)計用于去卷積空間轉(zhuǎn)錄組數(shù)據(jù)超陆,并基于參考scRNA-seq數(shù)據(jù)推斷每個空間位置上的細胞類型組成,如下圖所示浦马。CARD需要帶有細胞類型特異性基因表達信息的scRNA-seq數(shù)據(jù)(左圖)以及帶有定位信息的空間轉(zhuǎn)錄組學數(shù)據(jù)(右圖)时呀。有了這兩個數(shù)據(jù)輸入,CARD通過一個非負矩陣執(zhí)行去卷積分解框架晶默,并輸出跨空間位置的估計細胞類型組成谨娜。CARD的一個獨特之處在于,它能夠通過CAR模型解釋跨空間位置的細胞類型組成的空間相關(guān)性磺陡。通過計算空間位置上細胞類型組成的空間相關(guān)性趴梢,CARD還能夠在原始研究中沒有測量到的位置上輸入細胞類型組成和基因表達水平,從而促進在組織上構(gòu)建精細的高分辨率空間地圖币他。此外坞靶,如果相應(yīng)單細胞轉(zhuǎn)錄組數(shù)據(jù)缺失,無法構(gòu)建參考矩陣時蝴悉,我們可以輸入marker基因彰阴,CARD 可以進行無細胞類型特異性參考矩陣的去卷積分析。
軟件實操
??下面我們使用代碼來進行CARD軟件分析拍冠,具體示例代碼見CARD分析流程尿这。
數(shù)據(jù)準備
??我們首先需要準備兩個數(shù)據(jù)簇抵,一個是單細胞轉(zhuǎn)錄組數(shù)據(jù)(原始counts表達矩陣+每個細胞的注釋矩陣),還有一個空間轉(zhuǎn)錄組數(shù)據(jù)(原始counts表達矩陣+每個spot的空間位置矩陣)射众。示例數(shù)據(jù)獲取方式:
wget https://github.com/YingMa0107/CARD/raw/master/data/spatial_location.RData ###空轉(zhuǎn)的空間位置矩陣
wget https://github.com/YingMa0107/CARD/raw/master/data/spatial_count.RData ###空轉(zhuǎn)的counts表達矩陣
wget https://github.com/YingMa0107/CARD/raw/master/data/sc_meta.RData ###單細胞轉(zhuǎn)錄組的細胞注釋矩陣
wget https://github.com/YingMa0107/CARD/raw/master/data/sc_count.RData ###細胞轉(zhuǎn)錄組的counts表達矩陣
??接下來我們安裝和加載CARD包碟摆,導(dǎo)入所需數(shù)據(jù):
### 安裝CARD包
devtools::install_github('YingMa0107/CARD')
###加載CARD包
library(CARD)
###導(dǎo)入數(shù)據(jù)
load("./spatial_count.RData")###導(dǎo)入空轉(zhuǎn)的counts表達矩陣
spatial_count[1:4,1:4]
4 x 4 sparse Matrix of class "dgCMatrix"
10x10 10x13 10x14 10x15
X5S_rRNA . . . .
X5_8S_rRNA . . . .
X7SK . . . .
A1BG.AS1 . . . .
load("./spatial_location.RData")###導(dǎo)入空轉(zhuǎn)的空間位置矩陣
spatial_location[1:4,]
x y
10x10 10 10
10x13 10 13
10x14 10 14
10x15 10 15
load("./sc_count.RData")
sc_count[1:4,1:4]###導(dǎo)入單細胞轉(zhuǎn)錄組的counts表達矩陣
4 x 4 sparse Matrix of class "dgCMatrix"
Cell1 Cell2 Cell3 Cell4
A1BG . . . .
A1CF . . . 1
A2M . . . .
A2ML1 . . . .
load("./sc_meta.RData")###導(dǎo)入單細胞轉(zhuǎn)錄組的細胞注釋矩陣
sc_meta[1:4,]
cellID cellType sampleInfo
Cell1 Cell1 Acinar_cells sample1
Cell2 Cell2 Ductal_terminal_ductal_like sample1
Cell3 Cell3 Ductal_terminal_ductal_like sample1
Cell4 Cell4 Ductal_CRISP3_high-centroacinar_like sample1
##創(chuàng)建對象
??然后我們用導(dǎo)入的數(shù)據(jù)創(chuàng)建CARD對象,創(chuàng)建完成后空轉(zhuǎn)數(shù)據(jù)儲存在CARD_obj@spatial_countMat和CARD_obj@spatial_location叨橱,單細胞轉(zhuǎn)錄組數(shù)據(jù)儲存在CARD_obj@sc_eset中:
CARD_obj = createCARDObject(
sc_count = sc_count,
sc_meta = sc_meta,
spatial_count = spatial_count,
spatial_location = spatial_location,
ct.varname = "cellType",
ct.select = unique(sc_meta$cellType),
sample.varname = "sampleInfo",
minCountGene = 100,
minCountSpot = 5)
## QC on scRNASeq dataset! ...
## QC on spatially-resolved dataset! ..
##去卷積分析
??現(xiàn)在我們把所有東西都存儲在CARD對象中典蜕,我們可以使用CARD去卷積空間轉(zhuǎn)錄組數(shù)據(jù),完成后結(jié)果儲存在CARD_obj@Proportion_CARD中罗洗,獲得每個spot中各種細胞類型的組成占比愉舔。
CARD_obj = CARD_deconvolution(CARD_object = CARD_obj)
## create reference matrix from scRNASeq...
## Select Informative Genes! ...
## Deconvolution Starts! ...
## Deconvolution Finish! ...
print(CARD_obj@Proportion_CARD[1:2,])
Acinar_cells Ductal_terminal_ductal_like
10x10 6.370860e-02 0.02391251
10x13 7.818278e-08 0.02996182
Ductal_CRISP3_high-centroacinar_like Cancer_clone_A Ductal_MHC_Class_II
10x10 0.1801753 4.229928e-04 0.021557706
10x13 0.9620440 1.910394e-07 0.006437371
Cancer_clone_B mDCs_A Ductal_APOL1_high-hypoxic Tuft_cells
10x10 4.339480e-05 0.011136707 4.967235e-04 2.025089e-03
10x13 2.319262e-05 0.001013068 3.864917e-06 1.796433e-06
mDCs_B pDCs Endocrine_cells Endothelial_cells Macrophages_A
10x10 0.0792525811 7.432979e-07 6.848627e-03 1.722855e-01 9.909662e-02
10x13 0.0004695018 1.566377e-11 3.925412e-11 4.198468e-11 2.766705e-05
Mast_cells Macrophages_B T_cells_&_NK_cells Monocytes RBCs
10x10 7.411147e-11 3.090675e-02 4.976256e-06 2.663846e-06 3.84187e-10
10x13 2.387132e-11 9.499908e-06 1.172387e-11 2.000116e-06 1.00967e-06
Fibroblasts
10x10 3.081225e-01
10x13 4.898874e-06
可視化
??CARD軟件提供兩種比較好的可視化函數(shù),CARD.visualize.pie和CARD.visualize.prop栖博,前者是在空轉(zhuǎn)位置上繪制每個spot的細胞類型組成拼圖屑宠;后者是在空轉(zhuǎn)位置上繪制每種細胞類型在所有spot中的占比情況厢洞,用顏色代表占比的高低仇让。
colors = c("#FFD92F","#4DAF4A","#FCCDE5","#D9D9D9","#377EB8","#7FC97F","#BEAED4",
"#FDC086","#FFFF99","#386CB0","#F0027F","#BF5B17","#666666","#1B9E77","#D95F02",
"#7570B3","#E7298A","#66A61E","#E6AB02","#A6761D")#可以自定義顏色
p1 <- CARD.visualize.pie(proportion = CARD_obj@Proportion_CARD,spatial_location = CARD_obj@spatial_location, colors = colors)
print(p1)
??這個函數(shù)不能設(shè)置每個spot餅圖的大小,如果想自己調(diào)整躺翻,可以手動更改函數(shù)內(nèi)部再繪圖丧叽,如下所示:
CARD.visualize.pie2 <- function(proportion,spatial_location,colors = NULL,round_size){
res_CARD = as.data.frame(proportion)
res_CARD = res_CARD[,mixedsort(colnames(res_CARD))]
location = as.data.frame(spatial_location)
if(sum(rownames(res_CARD)==rownames(location))!= nrow(res_CARD)){
stop("The rownames of proportion data does not match with the rownames of spatial location data")
}
colorCandidate = c("#1e77b4","#ff7d0b","#ceaaa3","#2c9f2c","#babc22","#d52828","#9267bc",
"#8b544c","#e277c1","#d42728","#adc6e8","#97df89","#fe9795","#4381bd","#f2941f","#5aa43a","#cc4d2e","#9f83c8","#91675a",
"#da8ec8","#929292","#c3c237","#b4e0ea","#bacceb","#f7c685",
"#dcf0d0","#f4a99f","#c8bad8",
"#F56867", "#FEB915", "#C798EE", "#59BE86", "#7495D3",
"#D1D1D1", "#6D1A9C", "#15821E", "#3A84E6", "#997273",
"#787878", "#DB4C6C", "#9E7A7A", "#554236", "#AF5F3C",
"#93796C", "#F9BD3F", "#DAB370", "#877F6C", "#268785")
if(is.null(colors)){
#colors = brewer.pal(11, "Spectral")
if(ncol(res_CARD) > length(colorCandidate)){
colors = colorRampPalette(colorCandidate)(ncol(res_CARD))
}else{
colors = colorCandidate[sample(1:length(colorCandidate),ncol(res_CARD))]
}
}else{
colors = colors
}
data = cbind(res_CARD,location)
ct.select = colnames(res_CARD)
p = suppressMessages(ggplot() + geom_scatterpie(aes(x=x, y=y,r = round_size),data=data, ###r默認是0.52
cols=ct.select,color=NA) + coord_fixed(ratio = 1) +
scale_fill_manual(values = colors)+
theme(plot.margin = margin(0.1, 0.1, 0.1, 0.1, "cm"),
panel.background = element_blank(),
plot.background = element_blank(),
panel.border = element_rect(colour = "grey89", fill=NA, size=0.5),
axis.text =element_blank(),
axis.ticks =element_blank(),
axis.title =element_blank(),
legend.title=element_text(size = 16,face="bold"),
legend.text=element_text(size = 15),
legend.key = element_rect(colour = "transparent", fill = "white"),
legend.key.size = unit(0.45, 'cm'),
strip.text = element_text(size = 16,face="bold"),
legend.position="bottom")+
guides(fill=guide_legend(title="Cell Type")))
return(p)
}
p1 <- CARD.visualize.pie2(proportion = CARD_obj@Proportion_CARD,spatial_location = CARD_obj@spatial_location, colors = colors, round_size=5)
??我們同時可以選擇一些感興趣的細胞類型分別進行可視化,查看細胞類型比例的空間分布:
ct.visualize = c("Acinar_cells","Cancer_clone_A","Cancer_clone_B","Ductal_terminal_ductal_like","Ductal_CRISP3_high-centroacinar_like","Ductal_MHC_Class_II","Ductal_APOL1_high-hypoxic","Fibroblasts")
p2 <- CARD.visualize.prop(
proportion = CARD_obj@Proportion_CARD,
spatial_location = CARD_obj@spatial_location,
ct.visualize = ct.visualize, ### 選擇要可視化的細胞類型
colors = c("lightblue","lightyellow","red"), ###可以手動設(shè)置顏色
NumCols = 4) ###圖中展示的細胞類型的列數(shù)
print(p2)
精細化空間圖譜
??同時公你,CARD還可以將細胞類型組成和基因表達水平歸到未測量的組織位置上踊淳,構(gòu)建更精細的空間組織圖。也就是如果空轉(zhuǎn)上沒測到的孔陕靠,可以用單細胞轉(zhuǎn)錄組數(shù)據(jù)將其還原迂尝,分析結(jié)果儲存在CARD_obj@refined_prop和CARD_obj@refined_expression中。
CARD_obj = CARD.imputation(CARD_obj,NumGrids = 2000,ineibor = 10,exclude = NULL)
## The rownames of locations are matched ...
## Make grids on new spatial locations ...
p5 <- CARD.visualize.prop(
proportion = CARD_obj@refined_prop,
spatial_location = location_imputation,
ct.visualize = ct.visualize,
colors = c("lightblue","lightyellow","red"),
NumCols = 4)
print(p5)
??CARD還可以提供在增強分辨率下的細胞類型比例后剪芥,預(yù)測增強分辨率下的空間基因表達垄开,與原始分辨率相比,CARD有助于構(gòu)建精細的空間組織圖税肪,其分辨率遠遠高于原始研究中測量到的分辨率溉躲。
###增強分辨率下的基因表達
p6 <- CARD.visualize.gene(
spatial_expression = CARD_obj@refined_expression,
spatial_location = location_imputation,
gene.visualize = c("Tm4sf1","S100a4","Tff3","Apol1","Crisp3","CD248"),
colors = NULL,
NumCols = 6)
print(p6)
###原始的基因表達
p7 <- CARD.visualize.gene(
spatial_expression = CARD_obj@spatial_countMat,
spatial_location = CARD_obj@spatial_location,
gene.visualize = c("Tm4sf1","S100a4","Tff3","Apol1","Crisp3","CD248"),
colors = NULL,
NumCols = 6)
print(p7)
不使用單細胞轉(zhuǎn)錄組數(shù)據(jù)進行去卷積
??開發(fā)者同時擴展了CARD用以支持無參考細胞類型的去卷積分析,簡稱為CARDfree益兄。CARDfree不再需要外部單細胞參考數(shù)據(jù)集锻梳,只需要用戶輸入之前已知的細胞類型標記的基因名稱列表。
wget https://github.com/YingMa0107/CARD/raw/master/data/markerList.RData#軟件提供的參考marker基因列表净捅,可以自行提供
###導(dǎo)入?yún)⒖糾arker基因列表
load("./markerList.RData")
CARDfree_obj = createCARDfreeObject(
markerList = markerList,
spatial_count = spatial_count,
spatial_location = spatial_location,
minCountGene = 100,
minCountSpot =5) ###創(chuàng)建CARDfree對象
CARDfree_obj = CARD_refFree(CARDfree_obj)
colors = c("#FFD92F","#4DAF4A","#FCCDE5","#D9D9D9","#377EB8","#7FC97F","#BEAED4","#FDC086","#FFFF99","#386CB0","#F0027F","#BF5B17","#666666","#1B9E77","#D95F02","#7570B3","#E7298A","#66A61E","#E6AB02","#A6761D")
CARDfree_obj@Proportion_CARD = CARDfree_obj@Proportion_CARD[,c(8,10,14,2,1,6,12,18,7,13,20,19,16,17,11,15,4,9,3,5)]
colnames(CARDfree_obj@Proportion_CARD) = paste0("CT",1:20)
p8 <- CARD.visualize.pie(CARDfree_obj@Proportion_CARD,CARDfree_obj@spatial_location,colors = colors)
print(p8)
繪制單細胞分辨率的空間圖譜
??此外疑枯,CARD還可以在非單細胞分辨率的空間轉(zhuǎn)錄組上構(gòu)建單細胞分辨率空間轉(zhuǎn)錄組。根據(jù)用于去卷積的參考單細胞轉(zhuǎn)錄組數(shù)據(jù)蛔六,從非單細胞分辨率空間轉(zhuǎn)錄組數(shù)據(jù)推斷出每個測量空間位置的單細胞分辨率基因表達神汹。
scMapping = CARD_SCMapping(CARD_obj,shapeSpot="Square",numCell=20,ncore=10)###shapeSpot:一個字符庆捺,指示單個細胞的采樣空間坐標是位于類方形區(qū)域還是類圓形區(qū)域。該區(qū)域的中心是在非單細胞分辨率空間轉(zhuǎn)錄組數(shù)據(jù)中測量的空間位置屁魏。默認為“Square”滔以,另一個選項為“Circle”
library(SingleCellExperiment)
MapCellCords = as.data.frame(colData(scMapping))
count_SC = assays(scMapping)$counts
library(ggplot2)
df = MapCellCords
colors = c("#8DD3C7","#CFECBB","#F4F4B9","#CFCCCF","#D1A7B9","#E9D3DE","#F4867C","#C0979F",
"#D5CFD6","#86B1CD","#CEB28B","#EDBC63","#C59CC5","#C09CBF","#C2D567","#C9DAC3","#E1EBA0",
"#FFED6F","#CDD796","#F8CDDE")
p9 = ggplot(df, aes(x = x, y = y, colour = CT)) +
geom_point(size = 3.0) +
scale_colour_manual(values = colors) +
#facet_wrap(~Method,ncol = 2,nrow = 3) +
theme(plot.margin = margin(0.1, 0.1, 0.1, 0.1, "cm"),
panel.background = element_rect(colour = "white", fill="white"),
plot.background = element_rect(colour = "white", fill="white"),
legend.position="bottom",
panel.border = element_rect(colour = "grey89", fill=NA, size=0.5),
axis.text =element_blank(),
axis.ticks =element_blank(),
axis.title =element_blank(),
legend.title=element_text(size = 13,face="bold"),
legend.text=element_text(size = 12),
legend.key = element_rect(colour = "transparent", fill = "white"),
legend.key.size = unit(0.45, 'cm'),
strip.text = element_text(size = 15,face="bold"))+
guides(color=guide_legend(title="Cell Type"))###可視化每個細胞的細胞類型及其空間位置信息
print(p9)
總結(jié)
??與其他去卷積分析軟件相比,CARD優(yōu)勢在于除了傳統(tǒng)基于單細胞轉(zhuǎn)錄組數(shù)據(jù)去卷積分析外氓拼,還可以基于細胞類型的marker基因進行分析你画,最后還可以繪制單細胞層次的空間圖譜。