新的一周揖庄,新的開(kāi)始阿逃,今天我們來(lái)分享一個(gè)cellphoneDB的升級(jí)版骡男,CellphoneDB的V3.0版本,升級(jí)的最大改進(jìn)在于服務(wù)于空間轉(zhuǎn)錄組的通訊分析夭谤,會(huì)結(jié)合空間位置和生態(tài)位進(jìn)行有效通訊的識(shí)別和判斷棺牧,我們來(lái)看一下。
關(guān)于CellphoneDB朗儒,大家應(yīng)該都不陌生颊乘,目前做通訊分析引用率最高的軟件,之前的版本升級(jí)到2之前都是服務(wù)于單細(xì)胞的數(shù)據(jù)分析的醉锄,但是現(xiàn)在作者將軟件升級(jí)到3乏悄,就是將空間轉(zhuǎn)錄組納入分析,非常棒的想法和運(yùn)用恳不,參考文章在Mapping the temporal and spatial dynamics of the human endometrium in vivo and in vitro,2021年12月份發(fā)表于Nature Genetics檩小,IF38.330,文獻(xiàn)的內(nèi)容我們下一篇再分享烟勋,這一片我們專注于CellphoneDB(V3.0)规求。
CellphoneDB(V3.0)的創(chuàng)新點(diǎn)
- 合并空間信息 CellPhoneDB 現(xiàn)在允許通過(guò)微環(huán)境文件合并細(xì)胞的空間信息筐付。 這是一個(gè)兩列文件,指示哪種細(xì)胞類型在哪個(gè)空間微環(huán)境中(參見(jiàn)示例見(jiàn)下圖)阻肿。 CellphoneDB 將使用此信息來(lái)定義可能的交互細(xì)胞對(duì)(即在微環(huán)境中共享/共存的集群對(duì))瓦戚。 可以使用 cell2location(關(guān)于cell2location分享了多次,大家可以參考文章 10X單細(xì)胞和空間聯(lián)合分析的方法---cell2location,10X單細(xì)胞空間聯(lián)合分析之再次解讀cell2location) 定義具有先驗(yàn)知識(shí)丛塌、成像或 Visium 分析的微環(huán)境较解。
cell_type | microenviroment |
---|---|
epi_Ciliated | Proliferative |
epi_Pre-ciliated | Proliferative |
epi_SOX9_LGR5 | Proliferative |
epi_SOX9_prolif | Proliferative |
epi_SOX9 | Proliferative |
Fibroblast eS | Proliferative |
Lymphoid | Proliferative |
Myeloid | Proliferative |
Fibroblast C7 | Proliferative |
epi_Ciliated | Secretory |
epi_Lumenal 1 | Secretory |
epi_Lumenal 2 | Secretory |
epi_Glandular | Secretory |
epi_Glandular_secretory | Secretory |
Fibroblast dS | Secretory |
Lymphoid | Secretory |
epi_SOX9 | Secretory |
Fibroblast C7 | Secretory |
Myeloid | Secretory |
- 添加了新的分析方法,使用差異表達(dá)基因 (DEG) 而不是 random shuffling(cellphonedb 方法 degs_analysis)姨伤。 這種方法將選擇所有基因都由高于--閾值的一小部分細(xì)胞表達(dá)并且至少一個(gè)基因是 DEG 的相互作用哨坪。 可以使用喜歡的工具識(shí)別 DEG,并通過(guò)文本文件將信息提供給 CellphoneDB乍楚。 第一列應(yīng)該是細(xì)胞類型/cluster当编,第二列應(yīng)該是相關(guān)的基因 id。 其余列將被忽略(參見(jiàn)示例見(jiàn)下表)徒溪。 這里提供了為 Seurat 和 Scanpy 的參考示例忿偷。
圖片.png
- Database update WNT pathway has been further curated.
圖片.png
先來(lái)看看R版本的分析(聯(lián)合Seurat)
library(Seurat)
library(SeuratObject)
library(Matrix)
so = readRDS('endometrium_example_counts_seurat.rds')
writeMM(so@assays$RNA@counts, file = 'endometrium_example_counts_mtx/matrix.mtx')
# save gene and cell names
write(x = rownames(so@assays$RNA@counts), file = "endometrium_example_counts_mtx/genes.tsv")
write(x = colnames(so@assays$RNA@counts), file = "endometrium_example_counts_mtx/barcodes.tsv")
so@meta.data$Cell = rownames(so@meta.data)
df = so@meta.data[, c('Cell', 'cell_type')]
write.table(df, file ='endometrium_example_meta.tsv', sep = '\t', quote = F, row.names = F)
## OPTION 1 - compute DEGs for all cell types
## Extract DEGs for each cell_type
# DEGs <- FindAllMarkers(so,
# test.use = 'LR',
# verbose = F,
# only.pos = T,
# random.seed = 1,
# logfc.threshold = 0.2,
# min.pct = 0.1,
# return.thresh = 0.05)
# OPTION 2 - optional - Re-compute hierarchical (per lineage) DEGs for Epithelial and Stromal lineages
DEGs = c()
for( lin in c('Epithelial', 'Stromal') ){
message('Computing DEGs within linage ', lin)
so_in_lineage = subset(so, cells = Cells(so)[ so$lineage == lin ] )
celltye_in_lineage = unique(so$cell_type[ so$lineage == lin ])
DEGs_lin = FindAllMarkers(so_in_lineage,
verbose = F,
only.pos = T,
random.seed = 1,
logfc.threshold = 0,
min.pct = 0.1,
return.thresh = 0.05)
DEGs = rbind(DEGs_lin, DEGs)
}
fDEGs = subset(DEGs, p_val_adj < 0.05 & avg_logFC > 0.1)
# 1st column = cluster; 2nd column = gene
fDEGs = fDEGs[, c('cluster', 'gene', 'p_val_adj', 'p_val', 'avg_logFC', 'pct.1', 'pct.2')]
write.table(fDEGs, file ='endometrium_example_DEGs.tsv', sep = '\t', quote = F, row.names = F)
Run cellphoneDB
cellphonedb method degs_analysis \
endometrium_example_meta.tsv \
endometrium_example_counts_mtx \
endometrium_example_DEGs.tsv \
--microenvs endometrium_example_microenviroments.tsv \ #optional
--counts-data hgnc_symbol \
--database database/database/cellphonedb_user_2021-06-29-11_41.db \
--threshold 0.1
python版本的分析
import numpy as np
import pandas as pd
import scanpy as sc
import anndata
import os
import sys
from scipy import sparse
sc.settings.verbosity = 1 # verbosity: errors (0), warnings (1), info (2), hints (3)
sys.executable
adata = sc.read('endometrium_example_counts.h5ad')
df_meta = pd.DataFrame(data={'Cell':list(adata.obs.index),
'cell_type':[ i for i in adata.obs['cell_type']]
})
df_meta.set_index('Cell', inplace=True)
df_meta.to_csv('endometrium_example_meta.tsv', sep = '\t')
# Conver to dense matrix for Seurat
adata.X = adata.X.toarray()
import rpy2.rinterface_lib.callbacks
import logging
# Ignore R warning messages
#Note: this can be commented out to get more verbose R output
rpy2.rinterface_lib.callbacks.logger.setLevel(logging.ERROR)
import anndata2ri
anndata2ri.activate()
%load_ext rpy2.ipython
%%R -o DEGs
library(Seurat)
so = as.Seurat(adata, counts = "X", data = "X")
Idents(so) = so$cell_type
## OPTION 1 - compute DEGs for all cell types
## Extract DEGs for each cell_type
# DEGs <- FindAllMarkers(so,
# test.use = 'LR',
# verbose = F,
# only.pos = T,
# random.seed = 1,
# logfc.threshold = 0.2,
# min.pct = 0.1,
# return.thresh = 0.05)
# OPTION 2 - optional - Re-compute hierarchical (per lineage) DEGs for Epithelial and Stromal lineages
DEGs = c()
for( lin in c('Epithelial', 'Stromal') ){
message('Computing DEGs within linage ', lin)
so_in_lineage = subset(so, cells = Cells(so)[ so$lineage == lin ] )
celltye_in_lineage = unique(so$cell_type[ so$lineage == lin ])
DEGs_lin = FindAllMarkers(so_in_lineage,
test.use = 'LR',
verbose = F,
only.pos = T,
random.seed = 1,
logfc.threshold = 0.2,
min.pct = 0.1,
return.thresh = 0.05)
DEGs = rbind(DEGs_lin, DEGs)
}
cond1 = DEGs['p_val_adj'] < 0.05
cond2 = DEGs['avg_log2FC'] > 0.1
mask = [all(tup) for tup in zip(cond1, cond2)]
fDEGs = DEGs[mask]
# 1st column = cluster; 2nd column = gene
fDEGs = fDEGs[['cluster', 'gene', 'p_val_adj', 'p_val', 'avg_log2FC', 'pct.1', 'pct.2']]
fDEGs.to_csv('endometrium_example_DEGs.tsv', index=False, sep='\t')
Run cellphoneDB
cellphonedb method degs_analysis \
endometrium_example_meta.tsv \
endometrium_example_counts.h5ad \
endometrium_example_DEGs.tsv \
--microenvs endometrium_example_microenviroments.tsv \
--counts-data hgnc_symbol \
--database database/database/cellphonedb_user_2021-06-29-11_41.db \
--threshold 0.1
圖片.png
至于cellphoneDB的繪圖操作,大家可以參考文章空間通訊分析章節(jié)2臊泌,軟件鏈接在CellphoneDB.
生活很好鲤桥,有你更好