作者舵盈,追風(fēng)少年i
周三了论矾,好好學(xué)習(xí),天天搬磚沛简,人生沒(méi)有倒帶齐鲤,前進(jìn)才是真理
接上一篇,10X單細(xì)胞(10X空間轉(zhuǎn)錄組)通訊分析之總結(jié)Nichenet生態(tài)位通訊比較分析椒楣,不同條件下樣本之間的差異通訊给郊,其實(shí)關(guān)于通訊最開(kāi)始的分析是細(xì)胞類(lèi)型之間,后來(lái)慢慢轉(zhuǎn)移到相對(duì)于normal捧灰,疾病之間通訊的變化淆九。
Differential NicheNet 的目標(biāo)是預(yù)測(cè)配體-受體對(duì),它們?cè)诓煌母信d趣的生態(tài)位之間都有差異表達(dá)和活躍。關(guān)于生態(tài)位炭庙,請(qǐng)參考10X單細(xì)胞(10X空間轉(zhuǎn)錄組)通訊分析之總結(jié)Nichenet生態(tài)位通訊比較分析跪另。
0. Read in the expression data of interest, and the NicheNet ligand-receptor network and ligand-target matrix
Load in packages
library(nichenetr)
library(RColorBrewer)
library(tidyverse)
library(Seurat)
讀取表達(dá)數(shù)據(jù),大家自己讀取分析的rds即可
seurat_obj = readRDS(url("https://zenodo.org/record/4675430/files/seurat_obj_hnscc.rds"))
DimPlot(seurat_obj, group.by = "celltype") # user adaptation required on own dataset
DimPlot(seurat_obj, group.by = "pEMT") # user adaptation required on own dataset
檢查細(xì)胞類(lèi)型的分布
table(seurat_obj@meta.data$celltype, seurat_obj@meta.data$pEMT) # cell types vs conditions # user adaptation required on own dataset
##
## High Low
## CAF 396 104
## Endothelial 105 53
## Malignant 1093 549
## Myeloid 92 7
## myofibroblast 382 61
## T.cell 689 3
對(duì)于Differential NicheNet煤搜,需要相互比較至少2 個(gè)生態(tài)位或條件免绿。 在這種情況下,兩個(gè)生態(tài)位是 pEMT-high-niche 和 pEMT-low-niche擦盾。 fenxi 將根據(jù)細(xì)胞類(lèi)型的樣本分布調(diào)整它們的名稱(chēng)嘲驾。
seurat_obj@meta.data$celltype_aggregate = paste(seurat_obj@meta.data$celltype, seurat_obj@meta.data$pEMT,sep = "_") # user adaptation required on own dataset
DimPlot(seurat_obj, group.by = "celltype_aggregate")
seurat_obj@meta.data$celltype_aggregate %>% table() %>% sort(decreasing = TRUE)
## .
## Malignant_High T.cell_High Malignant_Low CAF_High myofibroblast_High Endothelial_High CAF_Low
## 1093 689 549 396 382 105 104
## Myeloid_High myofibroblast_Low Endothelial_Low Myeloid_Low T.cell_Low
## 92 61 53 7 3
celltype_id = "celltype_aggregate" # metadata column name of the cell type of interest
seurat_obj = SetIdent(seurat_obj, value = seurat_obj[[celltype_id]])
Read in the NicheNet ligand-receptor network and ligand-target matrix(讀取數(shù)據(jù)庫(kù))
ligand_target_matrix = readRDS(url("https://zenodo.org/record/3260758/files/ligand_target_matrix.rds"))
ligand_target_matrix[1:5,1:5] # target genes in rows, ligands in columns
## CXCL1 CXCL2 CXCL3 CXCL5 PPBP
## A1BG 3.534343e-04 4.041324e-04 3.729920e-04 3.080640e-04 2.628388e-04
## A1BG-AS1 1.650894e-04 1.509213e-04 1.583594e-04 1.317253e-04 1.231819e-04
## A1CF 5.787175e-04 4.596295e-04 3.895907e-04 3.293275e-04 3.211944e-04
## A2M 6.027058e-04 5.996617e-04 5.164365e-04 4.517236e-04 4.590521e-04
## A2M-AS1 8.898724e-05 8.243341e-05 7.484018e-05 4.912514e-05 5.120439e-05
lr_network = readRDS(url("https://zenodo.org/record/3260758/files/lr_network.rds"))
lr_network = lr_network %>% mutate(bonafide = ! database %in% c("ppi_prediction","ppi_prediction_go"))
lr_network = lr_network %>% dplyr::rename(ligand = from, receptor = to) %>% distinct(ligand, receptor, bonafide)
head(lr_network)
## # A tibble: 6 x 3
## ligand receptor bonafide
## <chr> <chr> <lgl>
## 1 CXCL1 CXCR2 TRUE
## 2 CXCL2 CXCR2 TRUE
## 3 CXCL3 CXCR2 TRUE
## 4 CXCL5 CXCR2 TRUE
## 5 PPBP CXCR2 TRUE
## 6 CXCL6 CXCR2 TRUE
人鼠基因轉(zhuǎn)換
organism = "human" # user adaptation required on own dataset
if(organism == "mouse"){
lr_network = lr_network %>% mutate(ligand = convert_human_to_mouse_symbols(ligand), receptor = convert_human_to_mouse_symbols(receptor)) %>% drop_na()
colnames(ligand_target_matrix) = ligand_target_matrix %>% colnames() %>% convert_human_to_mouse_symbols()
rownames(ligand_target_matrix) = ligand_target_matrix %>% rownames() %>% convert_human_to_mouse_symbols()
ligand_target_matrix = ligand_target_matrix %>% .[!is.na(rownames(ligand_target_matrix)), !is.na(colnames(ligand_target_matrix))]
}
1、Define the niches/microenvironments of interest
每個(gè)生態(tài)位應(yīng)至少有一個(gè)“發(fā)送者/生態(tài)位”細(xì)胞群和一個(gè)“接收者/目標(biāo)”細(xì)胞群
在本案例研究中迹卢,希望發(fā)現(xiàn) pEMT 高和 pEMT 低腫瘤之間細(xì)胞間相互作用與惡性細(xì)胞的差異辽故。 因此,pEMT-High 生態(tài)位中的接收細(xì)胞群是“Malignant_High”細(xì)胞類(lèi)型腐碱,而在 pEMT-Low 生態(tài)位中誊垢,這是“Malignant_Low”。 感興趣的發(fā)送細(xì)胞群是肌成纖維細(xì)胞症见、內(nèi)皮細(xì)胞喂走、CAF、T 細(xì)胞和骨髓細(xì)胞谋作。 重要的是芋肠,只在 pEMT-High 生態(tài)位中包括 T.Cell 和 Myeloid,因?yàn)檫@些群體的細(xì)胞太少存在于 pEMT-low 生態(tài)位中遵蚜。 因此帖池,證明了在分析中包括特定條件細(xì)胞類(lèi)型的可能性——這是可能的,因?yàn)橛?jì)算了與其他生態(tài)位的所有發(fā)送細(xì)胞相比的 DE吭净,而不僅僅是同一細(xì)胞的 pEMT-low 細(xì)胞類(lèi)型睡汹。
niches = list(
"pEMT_High_niche" = list(
"sender" = c("myofibroblast_High", "Endothelial_High", "CAF_High", "T.cell_High", "Myeloid_High"),
"receiver" = c("Malignant_High")),
"pEMT_Low_niche" = list(
"sender" = c("myofibroblast_Low", "Endothelial_Low", "CAF_Low"),
"receiver" = c("Malignant_Low"))
) # user adaptation required on own dataset
2. Calculate differential expression between the niches
在這一步中,將為發(fā)送者和接收者確定不同生態(tài)位之間的 DE寂殉,以定義 L-R 對(duì)的 DE囚巴。
Calculate DE
計(jì)算 differential expression的方法在這里是標(biāo)準(zhǔn)的 Seurat Wilcoxon 檢驗(yàn),但如果需要不撑,可以將其替換(唯一要求:輸出表 DE_sender_processed 和 DE_receiver_processed 應(yīng)采用與此處所示相同的格式)文兢。
將為每個(gè)成對(duì)的發(fā)送者(或接收者)細(xì)胞類(lèi)型在生態(tài)位之間的比較計(jì)算 DE(因此跨生態(tài)位,而不是在生態(tài)位內(nèi))焕檬。 在分析的案例研究中姆坚,這意味著 myofibroblast_High 配體的 DE 將通過(guò) myofibroblast_High 與 myofibroblast_Low 的 DE 分析來(lái)計(jì)算; 肌成纖維細(xì)胞高 vs 內(nèi)皮低实愚; 和肌成纖維細(xì)胞高 vs CAF低兼呵。 根據(jù)細(xì)胞類(lèi)型拆分細(xì)胞兔辅,而不是合并來(lái)自其他生態(tài)位的所有細(xì)胞,以避免 DE 分析由最豐富的細(xì)胞類(lèi)型驅(qū)動(dòng)(和上一篇差異巨大的地方)击喂。
assay_oi = "SCT" # other possibilities: RNA,...
DE_sender = calculate_niche_de(seurat_obj = seurat_obj %>% subset(features = lr_network$ligand %>% unique()), niches = niches, type = "sender", assay_oi = assay_oi) # only ligands important for sender cell types
## [1] "Calculate Sender DE between: myofibroblast_High and myofibroblast_Low"
## [2] "Calculate Sender DE between: myofibroblast_High and Endothelial_Low"
## [3] "Calculate Sender DE between: myofibroblast_High and CAF_Low"
## [1] "Calculate Sender DE between: Endothelial_High and myofibroblast_Low"
## [2] "Calculate Sender DE between: Endothelial_High and Endothelial_Low"
## [3] "Calculate Sender DE between: Endothelial_High and CAF_Low"
## [1] "Calculate Sender DE between: CAF_High and myofibroblast_Low" "Calculate Sender DE between: CAF_High and Endothelial_Low"
## [3] "Calculate Sender DE between: CAF_High and CAF_Low"
## [1] "Calculate Sender DE between: T.cell_High and myofibroblast_Low" "Calculate Sender DE between: T.cell_High and Endothelial_Low"
## [3] "Calculate Sender DE between: T.cell_High and CAF_Low"
## [1] "Calculate Sender DE between: Myeloid_High and myofibroblast_Low" "Calculate Sender DE between: Myeloid_High and Endothelial_Low"
## [3] "Calculate Sender DE between: Myeloid_High and CAF_Low"
## [1] "Calculate Sender DE between: myofibroblast_Low and myofibroblast_High"
## [2] "Calculate Sender DE between: myofibroblast_Low and Endothelial_High"
## [3] "Calculate Sender DE between: myofibroblast_Low and CAF_High"
## [4] "Calculate Sender DE between: myofibroblast_Low and T.cell_High"
## [5] "Calculate Sender DE between: myofibroblast_Low and Myeloid_High"
## [1] "Calculate Sender DE between: Endothelial_Low and myofibroblast_High"
## [2] "Calculate Sender DE between: Endothelial_Low and Endothelial_High"
## [3] "Calculate Sender DE between: Endothelial_Low and CAF_High"
## [4] "Calculate Sender DE between: Endothelial_Low and T.cell_High"
## [5] "Calculate Sender DE between: Endothelial_Low and Myeloid_High"
## [1] "Calculate Sender DE between: CAF_Low and myofibroblast_High" "Calculate Sender DE between: CAF_Low and Endothelial_High"
## [3] "Calculate Sender DE between: CAF_Low and CAF_High" "Calculate Sender DE between: CAF_Low and T.cell_High"
## [5] "Calculate Sender DE between: CAF_Low and Myeloid_High"
DE_receiver = calculate_niche_de(seurat_obj = seurat_obj %>% subset(features = lr_network$receptor %>% unique()), niches = niches, type = "receiver", assay_oi = assay_oi) # only receptors now, later on: DE analysis to find targets
## # A tibble: 1 x 2
## receiver receiver_other_niche
## <chr> <chr>
## 1 Malignant_High Malignant_Low
## [1] "Calculate receiver DE between: Malignant_High and Malignant_Low"
## [1] "Calculate receiver DE between: Malignant_Low and Malignant_High"
DE_sender = DE_sender %>% mutate(avg_log2FC = ifelse(avg_log2FC == Inf, max(avg_log2FC[is.finite(avg_log2FC)]), ifelse(avg_log2FC == -Inf, min(avg_log2FC[is.finite(avg_log2FC)]), avg_log2FC)))
DE_receiver = DE_receiver %>% mutate(avg_log2FC = ifelse(avg_log2FC == Inf, max(avg_log2FC[is.finite(avg_log2FC)]), ifelse(avg_log2FC == -Inf, min(avg_log2FC[is.finite(avg_log2FC)]), avg_log2FC)))
Process DE results:
expression_pct = 0.10
DE_sender_processed = process_niche_de(DE_table = DE_sender, niches = niches, expression_pct = expression_pct, type = "sender")
DE_receiver_processed = process_niche_de(DE_table = DE_receiver, niches = niches, expression_pct = expression_pct, type = "receiver")
Combine sender-receiver DE based on L-R pairs:
如上所述维苔,來(lái)自一種配體細(xì)胞類(lèi)型的配體的 DE 確定為計(jì)算該細(xì)胞類(lèi)型與其他生態(tài)位的所有配體細(xì)胞類(lèi)型之間的 DE。 為了總結(jié)該細(xì)胞類(lèi)型配體的 DE懂昂,有幾個(gè)選擇:可以采用平均 LFC介时,但也可以采用與其他生態(tài)位相比的最小 LFC。 建議使用最小 LFC凌彬,因?yàn)檫@是對(duì)配體表達(dá)的最強(qiáng)特異性測(cè)量沸柔,因?yàn)楦?min LFC 意味著與所有生態(tài)位 2 的細(xì)胞類(lèi)型相比,配體在生態(tài)位 1 的細(xì)胞類(lèi)型中的表達(dá)更強(qiáng)(與 高平均 LFC铲敛,不排除生態(tài)位 2 中的一種或多種細(xì)胞類(lèi)型也強(qiáng)烈表達(dá)該配體)褐澎。
specificity_score_LR_pairs = "min_lfc"
DE_sender_receiver = combine_sender_receiver_de(DE_sender_processed, DE_receiver_processed, lr_network, specificity_score = specificity_score_LR_pairs)
3. Optional: Calculate differential expression between the different spatial regions(針對(duì)有空間的數(shù)據(jù))
為了改進(jìn)細(xì)胞-細(xì)胞相互作用預(yù)測(cè),如果可能且適用伐蒋,可以考慮空間信息工三。空間信息可以來(lái)自顯微鏡數(shù)據(jù)先鱼,也可以來(lái)自空間轉(zhuǎn)錄組學(xué)數(shù)據(jù)俭正,例如 Visium。
有幾種方法可以將空間信息合并到差分 NicheNet pipeline中型型。首先段审,如果細(xì)胞類(lèi)型位于相同的空間位置,則只能將它們視為屬于同一生態(tài)位闹蒜。另一種方法是在優(yōu)先框架中包括一種細(xì)胞類(lèi)型內(nèi)配體-受體對(duì)的空間差異表達(dá)。
例如:有一個(gè)細(xì)胞類(lèi)型 X抑淫,位于區(qū)域 A 和 B绷落,想研究區(qū)域 A 的細(xì)胞間通信。首先在生態(tài)位定義中只添加區(qū)域 A 的 celltypeX始苇,然后計(jì)算 celltypeX-regionA 之間的 DE和 celltypeX-regionB 為 regionA 特異性配體提供更高的優(yōu)先權(quán)砌烁。
在這個(gè)案例研究中,感興趣的區(qū)域是腫瘤前沿催式,因?yàn)閷⒃搮^(qū)域定義為對(duì) pEMT 過(guò)程很重要函喉。 還將 CAF 定義為靠近前緣的成纖維細(xì)胞,而其他成纖維細(xì)胞(肌成纖維細(xì)胞)不優(yōu)先位于腫瘤前緣荣月。因此管呵,現(xiàn)在可以通過(guò)查看前沿成纖維細(xì)胞 (=CAF) 和非前沿成纖維細(xì)胞 (肌成纖維細(xì)胞) 之間的 DE 配體來(lái)進(jìn)一步優(yōu)先考慮成纖維細(xì)胞配體。
We do this as follows, by first defining a ‘spatial info’ dataframe. If no spatial information in your data: set the following two parameters to FALSE, and make a mock ‘spatial_info’ data frame.
include_spatial_info_sender = TRUE # if not spatial info to include: put this to false # user adaptation required on own dataset
include_spatial_info_receiver = FALSE # if spatial info to include: put this to true # user adaptation required on own dataset
spatial_info = tibble(celltype_region_oi = "CAF_High", celltype_other_region = "myofibroblast_High", niche = "pEMT_High_niche", celltype_type = "sender") # user adaptation required on own dataset
specificity_score_spatial = "lfc"
# this is how this should be defined if you don't have spatial info
# mock spatial info
if(include_spatial_info_sender == FALSE & include_spatial_info_receiver == FALSE){
spatial_info = tibble(celltype_region_oi = NA, celltype_other_region = NA) %>% mutate(niche = niches %>% names() %>% head(1), celltype_type = "sender")
}
if(include_spatial_info_sender == TRUE){
sender_spatial_DE = calculate_spatial_DE(seurat_obj = seurat_obj %>% subset(features = lr_network$ligand %>% unique()), spatial_info = spatial_info %>% filter(celltype_type == "sender"))
sender_spatial_DE_processed = process_spatial_de(DE_table = sender_spatial_DE, type = "sender", lr_network = lr_network, expression_pct = expression_pct, specificity_score = specificity_score_spatial)
# add a neutral spatial score for sender celltypes in which the spatial is not known / not of importance
sender_spatial_DE_others = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "sender", lr_network = lr_network)
sender_spatial_DE_processed = sender_spatial_DE_processed %>% bind_rows(sender_spatial_DE_others)
sender_spatial_DE_processed = sender_spatial_DE_processed %>% mutate(scaled_ligand_score_spatial = scale_quantile_adapted(ligand_score_spatial))
} else {
# # add a neutral spatial score for all sender celltypes (for none of them, spatial is relevant in this case)
sender_spatial_DE_processed = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "sender", lr_network = lr_network)
sender_spatial_DE_processed = sender_spatial_DE_processed %>% mutate(scaled_ligand_score_spatial = scale_quantile_adapted(ligand_score_spatial))
}
## [1] "Calculate Spatial DE between: CAF_High and myofibroblast_High"
if(include_spatial_info_receiver == TRUE){
receiver_spatial_DE = calculate_spatial_DE(seurat_obj = seurat_obj %>% subset(features = lr_network$receptor %>% unique()), spatial_info = spatial_info %>% filter(celltype_type == "receiver"))
receiver_spatial_DE_processed = process_spatial_de(DE_table = receiver_spatial_DE, type = "receiver", lr_network = lr_network, expression_pct = expression_pct, specificity_score = specificity_score_spatial)
# add a neutral spatial score for receiver celltypes in which the spatial is not known / not of importance
receiver_spatial_DE_others = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "receiver", lr_network = lr_network)
receiver_spatial_DE_processed = receiver_spatial_DE_processed %>% bind_rows(receiver_spatial_DE_others)
receiver_spatial_DE_processed = receiver_spatial_DE_processed %>% mutate(scaled_receptor_score_spatial = scale_quantile_adapted(receptor_score_spatial))
} else {
# # add a neutral spatial score for all receiver celltypes (for none of them, spatial is relevant in this case)
receiver_spatial_DE_processed = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "receiver", lr_network = lr_network)
receiver_spatial_DE_processed = receiver_spatial_DE_processed %>% mutate(scaled_receptor_score_spatial = scale_quantile_adapted(receptor_score_spatial))
}
4. Calculate ligand activities and infer active ligand-target links
在這一步中哺窄,將預(yù)測(cè)每個(gè)配體在不同生態(tài)位中的每種受體細(xì)胞類(lèi)型的配體活性捐下。 這類(lèi)似于在正常 NicheNet pipeline中進(jìn)行的配體活性分析账锹。
為了計(jì)算配體活性,首先需要為每個(gè)生態(tài)位定義一個(gè)感興趣的geneset坷襟。 在這個(gè)案例研究中奸柬,與 pEMT 低腫瘤相比,pEMT 高生態(tài)位的geneset是 pEMT 高腫瘤中上調(diào)的基因婴程,反之亦然廓奕。
還可以以不同的方式定義這些感興趣的基因集!
lfc_cutoff = 0.15 # recommended for 10x as min_lfc cutoff.
specificity_score_targets = "min_lfc"
DE_receiver_targets = calculate_niche_de_targets(seurat_obj = seurat_obj, niches = niches, lfc_cutoff = lfc_cutoff, expression_pct = expression_pct, assay_oi = assay_oi)
## [1] "Calculate receiver DE between: Malignant_High and Malignant_Low"
## [1] "Calculate receiver DE between: Malignant_Low and Malignant_High"
DE_receiver_processed_targets = process_receiver_target_de(DE_receiver_targets = DE_receiver_targets, niches = niches, expression_pct = expression_pct, specificity_score = specificity_score_targets)
background = DE_receiver_processed_targets %>% pull(target) %>% unique()
geneset_niche1 = DE_receiver_processed_targets %>% filter(receiver == niches[[1]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
geneset_niche2 = DE_receiver_processed_targets %>% filter(receiver == niches[[2]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
# Good idea to check which genes will be left out of the ligand activity analysis (=when not present in the rownames of the ligand-target matrix).
# If many genes are left out, this might point to some issue in the gene naming (eg gene aliases and old gene symbols, bad human-mouse mapping)
geneset_niche1 %>% setdiff(rownames(ligand_target_matrix))
## [1] "ANXA8L2" "PRKCDBP" "IL8" "PTRF" "SEPP1" "C1orf186" "CCDC109B" "C10orf54"
## [9] "LEPREL1" "ZNF812" "LOC645638" "LOC401397" "LINC00162" "DFNA5" "PLK1S1" "ZMYM6NB"
## [17] "C19orf10" "CTSL1" "SQRDL" "LOC375295" "WBP5" "LOC100505633" "AIM1" "C1orf63"
## [25] "LOC100507463" "GPR115" "VIMP" "SEP15" "C1orf172" "NAPRT1" "LHFP" "KRT16P1"
## [33] "C7orf10" "PTPLA" "GRAMD3" "CPSF3L" "MESDC2" "C10orf10" "KIAA1609" "CCDC53"
## [41] "TXLNG2P" "NGFRAP1" "ERO1L" "FAM134A" "LSMD1" "TCEB2" "B3GALTL" "HN1L"
## [49] "LOC550643" "KIAA0922" "GLT25D1" "FAM127A" "C1orf151-NBL1" "SEPW1" "GPR126" "LOC100505806"
## [57] "LINC00478" "TCEB1" "GRAMD2" "GNB2L1" "KIRREL"
geneset_niche2 %>% setdiff(rownames(ligand_target_matrix))
## [1] "LOC344887" "AGPAT9" "C1orf110" "KIAA1467" "LOC100292680" "EPT1" "CT45A4" "LOC654433"
## [9] "UPK3BL" "LINC00340" "LOC100128338" "FAM60A" "CCDC144C" "LOC401109" "LOC286467" "LEPREL4"
## [17] "LOC731275" "LOC642236" "LINC00516" "LOC101101776" "SC5DL" "PVRL4" "LOC100130093" "LINC00338"
## [25] "LOC100132891" "PPAP2C" "C6orf1" "C2orf47" "WHSC1L1" "LOC100289019" "SETD8" "KDM5B-AS1"
## [33] "SPG20" "CXCR7" "LOC100216479" "LOC100505761" "MGC57346" "LPHN3" "CENPC1" "C11orf93"
## [41] "C14orf169" "LOC100506060" "FLJ31485" "LOC440905" "MLF1IP" "TMEM194A" "RRP7B" "REXO1L1"
## [49] "LOC100129269" "KIAA1715" "CTAGE5" "LOC202781" "LOC100506714" "LOC401164" "UTS2D" "LOC146880"
## [57] "KIAA1804" "C5orf55" "C21orf119" "PRUNE" "LRRC16A" "LOC339240" "FLJ35024" "C5orf28"
## [65] "LOC100505876" "MGC21881" "LOC100133985" "PPAPDC2" "FRG1B" "CECR5" "LOC100129361" "CCBL1"
## [73] "PTPLAD1" "MST4" "LOC550112" "LOC389791" "CCDC90A" "KIAA0195" "LOC100506469" "LOC100133161"
## [81] "LOC646719" "LOC728819" "BRE" "LOC284581" "LOC441081" "LOC728377" "LOC100134229" "C3orf65"
## [89] "SMEK2" "KIAA1737" "C17orf70" "PLEKHM1P" "LOC338758" "PCNXL2" "LOC91948" "C17orf89"
## [97] "LOC100505783" "SMCR7L" "C8orf4" "GPR56" "ATHL1" "LOC339535" "PPAPDC1B" "DAK"
## [105] "LOC100507173" "CRHR1-IT1" "PPAP2B" "ADCK4" "KIAA0146" "GYLTL1B" "LOC100272216" "LOC400027"
## [113] "WHSC1" "LOC100130855" "C7orf55" "C19orf40" "ADCK3" "C9orf142" "SGOL1" "LOC90834"
## [121] "PTPLAD2" "KIAA1967" "LOC100132352" "LOC100630918" "ADRBK2" "LINC00263" "FAM64A" "LOC401074"
## [129] "FAM179B" "RP1-177G6.2" "METTL21D" "ERO1LB" "FLJ45445" "NADKD1" "LOC100506233" "LOC100652772"
## [137] "FAM175A" "LINC00630" "C11orf82" "SETD5-AS1" "SGK196" "FLJ14186" "CCDC104" "FAM63A"
## [145] "NARG2" "MTERFD1" "CCDC74B-AS1" "LOC286186" "WDR67" "C12orf52" "FLJ30403" "KIAA2018"
## [153] "GCN1L1" "FLJ43681" "LOC152217" "FONG" "C18orf8" "ALG1L9P" "GTDC2" "LOC100507217"
## [161] "NBPF24" "WBSCR27" "C14orf1" "LOC284889" "KIAA0317" "FAM65A" "PMS2L2" "LUST"
## [169] "C15orf52" "FAM195A" "LOC399744" "PYCRL" "LOC338799" "LOC100506190" "C9orf91" "FLJ45340"
## [177] "LOC349196" "LOC100128881" "TOMM70A" "ALS2CR8" "LDOC1L" "HDGFRP3" "ZNF767" "LOC728558"
## [185] "LOC283693" "LEPREL2" "QTRTD1" "SELM" "C6orf25" "C1orf86" "HNRPLL" "LOC145820"
## [193] "LOC100289341" "C17orf85" "C3orf72" "C14orf64" "C9orf9" "LOC100506394"
length(geneset_niche1)
## [1] 1668
length(geneset_niche2)
## [1] 2889
在進(jìn)行配體活性分析之前檢查geneset中的基因數(shù)量档叔。 建議在感興趣的基因集中包含 20 到 1000 個(gè)基因桌粉,以及至少 5000 個(gè)基因的背景,以便進(jìn)行適當(dāng)?shù)呐潴w活性分析蹲蒲。 如果檢索到的 DE 基因過(guò)多番甩,建議使用更高的 lfc_cutoff 閾值。 如果有 > 2 個(gè)接收細(xì)胞/niche來(lái)比較并使用 min_lfc 作為特異性得分届搁,建議使用 0.15 的截止值缘薛。 如果只有 2 個(gè)receiver/niche,建議使用更高的閾值(例如使用 0.25)卡睦。 如果有 Smart-seq2 等具有高測(cè)序深度的單細(xì)胞數(shù)據(jù)宴胧,建議也使用更高的閾值。
As we see here, we have Smart-seq2 data and only 2 niches to compare, so we will use a stronger LFC threshold to keep less DE genes, but more trustworthy ones.
lfc_cutoff = 0.75
specificity_score_targets = "min_lfc"
DE_receiver_processed_targets = process_receiver_target_de(DE_receiver_targets = DE_receiver_targets, niches = niches, expression_pct = expression_pct, specificity_score = specificity_score_targets)
background = DE_receiver_processed_targets %>% pull(target) %>% unique()
geneset_niche1 = DE_receiver_processed_targets %>% filter(receiver == niches[[1]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
geneset_niche2 = DE_receiver_processed_targets %>% filter(receiver == niches[[2]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
# Good idea to check which genes will be left out of the ligand activity analysis (=when not present in the rownames of the ligand-target matrix).
# If many genes are left out, this might point to some issue in the gene naming (eg gene aliases and old gene symbols, bad human-mouse mapping)
geneset_niche1 %>% setdiff(rownames(ligand_target_matrix))
## [1] "ANXA8L2" "PRKCDBP" "IL8" "PTRF" "SEPP1" "C1orf186"
geneset_niche2 %>% setdiff(rownames(ligand_target_matrix))
## [1] "LOC344887" "AGPAT9" "C1orf110" "KIAA1467" "LOC100292680" "EPT1" "CT45A4"
length(geneset_niche1)
## [1] 169
length(geneset_niche2)
## [1] 136
top_n_target = 250
niche_geneset_list = list(
"pEMT_High_niche" = list(
"receiver" = niches[[1]]$receiver,
"geneset" = geneset_niche1,
"background" = background),
"pEMT_Low_niche" = list(
"receiver" = niches[[2]]$receiver,
"geneset" = geneset_niche2 ,
"background" = background)
)
ligand_activities_targets = get_ligand_activities_targets(niche_geneset_list = niche_geneset_list, ligand_target_matrix = ligand_target_matrix, top_n_target = top_n_target)
## [1] "Calculate Ligand activities for: Malignant_High"
## [1] "Calculate Ligand activities for: Malignant_Low"
5. Calculate (scaled) expression of ligands, receptors and targets across cell types of interest (log expression values and expression fractions)
在這一步中表锻,將計(jì)算所有感興趣的細(xì)胞類(lèi)型的配體恕齐、受體和靶基因的平均(縮放)表達(dá)和表達(dá)分?jǐn)?shù)。 現(xiàn)在瞬逊,這通過(guò) Seurat 的 DotPlot 函數(shù)進(jìn)行了演示显歧,但當(dāng)然也可以通過(guò)其他方式完成。
features_oi = union(lr_network$ligand, lr_network$receptor) %>% union(ligand_activities_targets$target) %>% setdiff(NA)
dotplot = suppressWarnings(Seurat::DotPlot(seurat_obj %>% subset(idents = niches %>% unlist() %>% unique()), features = features_oi, assay = assay_oi))
exprs_tbl = dotplot$data %>% as_tibble()
exprs_tbl = exprs_tbl %>% rename(celltype = id, gene = features.plot, expression = avg.exp, expression_scaled = avg.exp.scaled, fraction = pct.exp) %>%
mutate(fraction = fraction/100) %>% as_tibble() %>% select(celltype, gene, expression, expression_scaled, fraction) %>% distinct() %>% arrange(gene) %>% mutate(gene = as.character(gene))
exprs_tbl_ligand = exprs_tbl %>% filter(gene %in% lr_network$ligand) %>% rename(sender = celltype, ligand = gene, ligand_expression = expression, ligand_expression_scaled = expression_scaled, ligand_fraction = fraction)
exprs_tbl_receptor = exprs_tbl %>% filter(gene %in% lr_network$receptor) %>% rename(receiver = celltype, receptor = gene, receptor_expression = expression, receptor_expression_scaled = expression_scaled, receptor_fraction = fraction)
exprs_tbl_target = exprs_tbl %>% filter(gene %in% ligand_activities_targets$target) %>% rename(receiver = celltype, target = gene, target_expression = expression, target_expression_scaled = expression_scaled, target_fraction = fraction)
exprs_tbl_ligand = exprs_tbl_ligand %>% mutate(scaled_ligand_expression_scaled = scale_quantile_adapted(ligand_expression_scaled)) %>% mutate(ligand_fraction_adapted = ligand_fraction) %>% mutate_cond(ligand_fraction >= expression_pct, ligand_fraction_adapted = expression_pct) %>% mutate(scaled_ligand_fraction_adapted = scale_quantile_adapted(ligand_fraction_adapted))
exprs_tbl_receptor = exprs_tbl_receptor %>% mutate(scaled_receptor_expression_scaled = scale_quantile_adapted(receptor_expression_scaled)) %>% mutate(receptor_fraction_adapted = receptor_fraction) %>% mutate_cond(receptor_fraction >= expression_pct, receptor_fraction_adapted = expression_pct) %>% mutate(scaled_receptor_fraction_adapted = scale_quantile_adapted(receptor_fraction_adapted))
6. Expression fraction and receptor
在這一步中确镊,將根據(jù)受體的表達(dá)強(qiáng)度對(duì)配體-受體相互作用進(jìn)行評(píng)分士骤,這樣就可以對(duì)特定細(xì)胞類(lèi)型中特定配體的最強(qiáng)表達(dá)受體給予更高的分?jǐn)?shù)。 這不會(huì)影響以后單個(gè)配體的排名蕾域,但將有助于優(yōu)先考慮每個(gè)配體最重要的受體拷肌。
exprs_sender_receiver = lr_network %>%
inner_join(exprs_tbl_ligand, by = c("ligand")) %>%
inner_join(exprs_tbl_receptor, by = c("receptor")) %>% inner_join(DE_sender_receiver %>% distinct(niche, sender, receiver))
ligand_scaled_receptor_expression_fraction_df = exprs_sender_receiver %>% group_by(ligand, receiver) %>% mutate(rank_receptor_expression = dense_rank(receptor_expression), rank_receptor_fraction = dense_rank(receptor_fraction)) %>% mutate(ligand_scaled_receptor_expression_fraction = 0.5*( (rank_receptor_fraction / max(rank_receptor_fraction)) + ((rank_receptor_expression / max(rank_receptor_expression))) ) ) %>% distinct(ligand, receptor, receiver, ligand_scaled_receptor_expression_fraction, bonafide) %>% distinct() %>% ungroup()
7. Prioritization of ligand-receptor and ligand-target links
In this step, we will combine all the above calculated information to prioritize ligand-receptor-target links. We scale every property of interest between 0 and 1, and the final prioritization score is a weighted sum of the scaled scores of all the properties of interest.
We provide the user the option to consider the following properties for prioritization (of which the weights are defined in prioritizing_weights) :
Ligand DE score: niche-specific expression of the ligand: by default, this the minimum logFC between the sender of interest and all the senders of the other niche(s). The higher the min logFC, the higher the niche-specificity of the ligand. Therefore we recommend to give this factor a very high weight. prioritizing_weights argument: "scaled_ligand_score". Recommended weight: 5 (at least 1, max 5).
Scaled ligand expression: scaled expression of a ligand in one sender compared to the other cell types in the dataset. This might be useful to rescue potentially interesting ligands that have a high scaled expression value, but a relatively small min logFC compared to the other niche. One reason why this logFC might be small occurs when (some) genes are not picked up efficiently by the used sequencing technology (or other reasons for low RNA expression of ligands). For example, we have observed that many ligands from the Tgf-beta/BMP family are not picked up efficiently with single-nuclei RNA sequencing compared to single-cell sequencing. prioritizing_weights argument: "scaled_ligand_expression_scaled". Recommended weight: 1 (unless technical reason for lower gene detection such as while using Nuc-seq: then recommended to use a higher weight: 2).
Ligand expression fraction: Ligands that are expressed in a smaller fraction of cells of a cell type than defined by exprs_cutoff(default: 0.10) will get a lower ranking, proportional to their fraction (eg ligand expressed in 9% of cells will be ranked higher than ligand expressed in 0.5% of cells). We opted for this weighting based on fraction, instead of removing ligands that are not expressed in more cells than this cutoff, because some interesting ligands could be removed that way. Fraction of expression is not taken into account for the prioritization if it is already higher than the cutoff. prioritizing_weights argument: "ligand_fraction". Recommended weight: 1.
Ligand spatial DE score: spatial expression specificity of the ligand. If the niche of interest is at a specific tissue location, but some of the sender cell types of that niche are also present in other locations, it can be very informative to further prioritize ligands of that sender by looking how they are DE between the spatial location of interest compared to the other locations. prioritizing_weights argument: "scaled_ligand_score_spatial". Recommended weight: 2 (or 0 if not applicable).
Receptor DE score: niche-specific expression of the receptor: by default, this the minimum logFC between the receiver of interest and all the receiver of the other niche(s). The higher the min logFC, the higher the niche-specificity of the receptor. Based on our experience, we don’t suggest to give this as high importance as the ligand DE, but this might depend on the specific case study. prioritizing_weights argument: "scaled_receptor_score". Recommended weight: 0.5 (at least 0.5, and lower than "scaled_ligand_score").
Scaled receptor expression: scaled expression of a receptor in one receiver compared to the other cell types in the dataset. This might be useful to rescue potentially interesting receptors that have a high scaled expression value, but a relatively small min logFC compared to the other niche. One reason why this logFC might be small occurs when (some) genes are not picked up efficiently by the used sequencing technology. prioritizing_weights argument: "scaled_receptor_expression_scaled". Recommended weight: 0.5.
Receptor expression fraction: Receptors that are expressed in a smaller fraction of cells of a cell type than defined by exprs_cutoff(default: 0.10) will get a lower ranking, proportional to their fraction (eg receptor expressed in 9% of cells will be ranked higher than receptor expressed in 0.5% of cells). We opted for this weighting based on fraction, instead of removing receptors that are not expressed in more cells than this cutoff, because some interesting receptors could be removed that way. Fraction of expression is not taken into account for the prioritization if it is already higher than the cutoff. prioritizing_weights argument: "receptor_fraction". Recommended weight: 1.
Receptor expression strength: this factor let us give higher weights to the most highly expressed receptor of a ligand in the receiver. This let us rank higher one member of a receptor family if it higher expressed than the other members. prioritizing_weights argument: "ligand_scaled_receptor_expression_fraction". Recommended value: 1 (minimum: 0.5).
Receptor spatial DE score: spatial expression specificity of the receptor. If the niche of interest is at a specific tissue location, but the receiver cell type of that niche is also present in other locations, it can be very informative to further prioritize receptors of that receiver by looking how they are DE between the spatial location of interest compared to the other locations. prioritizing_weights argument: "scaled_receptor_score_spatial". Recommended weight: 1 (or 0 if not applicable).
Absolute ligand activity: to further prioritize ligand-receptor pairs based on their predicted effect of the ligand-receptor interaction on the gene expression in the receiver cell type - absolute ligand activity accords to ‘a(chǎn)bsolute’ enrichment of target genes of a ligand within the affected receiver genes. prioritizing_weights argument: "scaled_activity". Recommended weight: 0, unless absolute enrichment of target genes is of specific interest.
Normalized ligand activity: to further prioritize ligand-receptor pairs based on their predicted effect of the ligand-receptor interaction on the gene expression in the receiver cell type - normalization of activity is done because we found that some datasets/conditions/niches have higher baseline activity values than others - normalized ligand activity accords to ‘relative’ enrichment of target genes of a ligand within the affected receiver genes. prioritizing_weights argument: "scaled_activity_normalized". Recommended weight: at least 1.
Prior knowledge quality of the L-R interaction: the NicheNet LR network consists of two types of interactions: L-R pairs documented in curated databases, and L-R pairs predicted based on gene annotation and PPIs. The former are categorized as ‘bona fide’ interactions. To rank bona fide interactions higher, but not exlude potentially interesting non-bona-fide ones, we give bona fide interactions a score of 1, and non-bona-fide interactions a score fof 0.5. prioritizing_weights argument: "bona_fide" Recommend weight: at least 1.
prioritizing_weights = c("scaled_ligand_score" = 5,
"scaled_ligand_expression_scaled" = 1,
"ligand_fraction" = 1,
"scaled_ligand_score_spatial" = 2,
"scaled_receptor_score" = 0.5,
"scaled_receptor_expression_scaled" = 0.5,
"receptor_fraction" = 1,
"ligand_scaled_receptor_expression_fraction" = 1,
"scaled_receptor_score_spatial" = 0,
"scaled_activity" = 0,
"scaled_activity_normalized" = 1,
"bona_fide" = 1)
Note: these settings will give substantially more weight to DE ligand-receptor pairs compared to activity. Users can change this if wanted, just like other settings can be changed if that would be better to tackle the specific biological question you want to address.
output = list(DE_sender_receiver = DE_sender_receiver, ligand_scaled_receptor_expression_fraction_df = ligand_scaled_receptor_expression_fraction_df, sender_spatial_DE_processed = sender_spatial_DE_processed, receiver_spatial_DE_processed = receiver_spatial_DE_processed,
ligand_activities_targets = ligand_activities_targets, DE_receiver_processed_targets = DE_receiver_processed_targets, exprs_tbl_ligand = exprs_tbl_ligand, exprs_tbl_receptor = exprs_tbl_receptor, exprs_tbl_target = exprs_tbl_target)
prioritization_tables = get_prioritization_tables(output, prioritizing_weights)
prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(receiver == niches[[1]]$receiver) %>% head(10)
## # A tibble: 10 x 37
## niche receiver sender ligand_receptor ligand receptor bonafide ligand_score ligand_signific~ ligand_present ligand_expressi~
## <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <dbl> <dbl> <dbl> <dbl>
## 1 pEMT_High_niche Malignan~ T.cel~ PTPRC--MET PTPRC MET FALSE 3.22 1 1 9.32
## 2 pEMT_High_niche Malignan~ T.cel~ PTPRC--EGFR PTPRC EGFR FALSE 3.22 1 1 9.32
## 3 pEMT_High_niche Malignan~ T.cel~ PTPRC--CD44 PTPRC CD44 FALSE 3.22 1 1 9.32
## 4 pEMT_High_niche Malignan~ T.cel~ PTPRC--ERBB2 PTPRC ERBB2 FALSE 3.22 1 1 9.32
## 5 pEMT_High_niche Malignan~ T.cel~ PTPRC--IFNAR1 PTPRC IFNAR1 FALSE 3.22 1 1 9.32
## 6 pEMT_High_niche Malignan~ T.cel~ TNF--TNFRSF21 TNF TNFRSF21 TRUE 1.74 1 1 2.34
## 7 pEMT_High_niche Malignan~ Myelo~ SERPINA1--LRP1 SERPI~ LRP1 TRUE 2.52 1 1 4.83
## 8 pEMT_High_niche Malignan~ Myelo~ IL1B--IL1RAP IL1B IL1RAP TRUE 1.50 1 1 1.93
## 9 pEMT_High_niche Malignan~ Myelo~ IL1RN--IL1R2 IL1RN IL1R2 TRUE 1.62 1 1 2.07
## 10 pEMT_High_niche Malignan~ T.cel~ PTPRC--INSR PTPRC INSR FALSE 3.22 1 1 9.32
## # ... with 26 more variables: ligand_expression_scaled <dbl>, ligand_fraction <dbl>, ligand_score_spatial <dbl>, receptor_score <dbl>,
## # receptor_significant <dbl>, receptor_present <dbl>, receptor_expression <dbl>, receptor_expression_scaled <dbl>,
## # receptor_fraction <dbl>, receptor_score_spatial <dbl>, ligand_scaled_receptor_expression_fraction <dbl>,
## # avg_score_ligand_receptor <dbl>, activity <dbl>, activity_normalized <dbl>, scaled_ligand_score <dbl>,
## # scaled_ligand_expression_scaled <dbl>, scaled_receptor_score <dbl>, scaled_receptor_expression_scaled <dbl>,
## # scaled_avg_score_ligand_receptor <dbl>, scaled_ligand_score_spatial <dbl>, scaled_receptor_score_spatial <dbl>,
## # scaled_ligand_fraction_adapted <dbl>, scaled_receptor_fraction_adapted <dbl>, scaled_activity <dbl>, ...
prioritization_tables$prioritization_tbl_ligand_target %>% filter(receiver == niches[[1]]$receiver) %>% head(10)
## # A tibble: 10 x 20
## niche receiver sender ligand_receptor ligand receptor bonafide target target_score target_signific~ target_present target_expressi~
## <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 pEMT_H~ Malignan~ T.cell~ PTPRC--MET PTPRC MET FALSE EHF 1.04 1 1 1.88
## 2 pEMT_H~ Malignan~ T.cell~ PTPRC--MET PTPRC MET FALSE GADD4~ 0.836 1 1 2.42
## 3 pEMT_H~ Malignan~ T.cell~ PTPRC--MET PTPRC MET FALSE SERPI~ 0.889 1 1 1.79
## 4 pEMT_H~ Malignan~ T.cell~ PTPRC--EGFR PTPRC EGFR FALSE EHF 1.04 1 1 1.88
## 5 pEMT_H~ Malignan~ T.cell~ PTPRC--EGFR PTPRC EGFR FALSE GADD4~ 0.836 1 1 2.42
## 6 pEMT_H~ Malignan~ T.cell~ PTPRC--EGFR PTPRC EGFR FALSE SERPI~ 0.889 1 1 1.79
## 7 pEMT_H~ Malignan~ T.cell~ PTPRC--CD44 PTPRC CD44 FALSE EHF 1.04 1 1 1.88
## 8 pEMT_H~ Malignan~ T.cell~ PTPRC--CD44 PTPRC CD44 FALSE GADD4~ 0.836 1 1 2.42
## 9 pEMT_H~ Malignan~ T.cell~ PTPRC--CD44 PTPRC CD44 FALSE SERPI~ 0.889 1 1 1.79
## 10 pEMT_H~ Malignan~ T.cell~ PTPRC--ERBB2 PTPRC ERBB2 FALSE EHF 1.04 1 1 1.88
## # ... with 8 more variables: target_expression_scaled <dbl>, target_fraction <dbl>, ligand_target_weight <dbl>, activity <dbl>,
## # activity_normalized <dbl>, scaled_activity <dbl>, scaled_activity_normalized <dbl>, prioritization_score <dbl>
prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(receiver == niches[[2]]$receiver) %>% head(10)
## # A tibble: 10 x 37
## niche receiver sender ligand_receptor ligand receptor bonafide ligand_score ligand_signific~ ligand_present ligand_expressi~
## <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <dbl> <dbl> <dbl> <dbl>
## 1 pEMT_Low_niche Malignan~ Endoth~ F8--LRP1 F8 LRP1 TRUE 0.952 1 1 2.17
## 2 pEMT_Low_niche Malignan~ Endoth~ PLAT--LRP1 PLAT LRP1 TRUE 0.913 1 1 2.70
## 3 pEMT_Low_niche Malignan~ CAF_Low FGF10--FGFR2 FGF10 FGFR2 TRUE 0.385 0.8 1 1.07
## 4 pEMT_Low_niche Malignan~ CAF_Low NLGN2--NRXN3 NLGN2 NRXN3 TRUE 0.140 0.2 1 0.269
## 5 pEMT_Low_niche Malignan~ CAF_Low RSPO3--LGR6 RSPO3 LGR6 TRUE 0.557 0.8 1 1.27
## 6 pEMT_Low_niche Malignan~ CAF_Low COMP--SDC1 COMP SDC1 TRUE 0.290 0.8 1 1.27
## 7 pEMT_Low_niche Malignan~ CAF_Low SEMA3C--NRP2 SEMA3C NRP2 TRUE 0.652 1 1 1.73
## 8 pEMT_Low_niche Malignan~ CAF_Low SLIT2--SDC1 SLIT2 SDC1 TRUE 0.494 1 1 0.846
## 9 pEMT_Low_niche Malignan~ Endoth~ IL33--IL1RAP IL33 IL1RAP FALSE 1.34 1 1 2.75
## 10 pEMT_Low_niche Malignan~ CAF_Low C3--LRP1 C3 LRP1 TRUE 0.480 1 1 4.79
## # ... with 26 more variables: ligand_expression_scaled <dbl>, ligand_fraction <dbl>, ligand_score_spatial <dbl>, receptor_score <dbl>,
## # receptor_significant <dbl>, receptor_present <dbl>, receptor_expression <dbl>, receptor_expression_scaled <dbl>,
## # receptor_fraction <dbl>, receptor_score_spatial <dbl>, ligand_scaled_receptor_expression_fraction <dbl>,
## # avg_score_ligand_receptor <dbl>, activity <dbl>, activity_normalized <dbl>, scaled_ligand_score <dbl>,
## # scaled_ligand_expression_scaled <dbl>, scaled_receptor_score <dbl>, scaled_receptor_expression_scaled <dbl>,
## # scaled_avg_score_ligand_receptor <dbl>, scaled_ligand_score_spatial <dbl>, scaled_receptor_score_spatial <dbl>,
## # scaled_ligand_fraction_adapted <dbl>, scaled_receptor_fraction_adapted <dbl>, scaled_activity <dbl>, ...
prioritization_tables$prioritization_tbl_ligand_target %>% filter(receiver == niches[[2]]$receiver) %>% head(10)
## # A tibble: 10 x 20
## niche receiver sender ligand_receptor ligand receptor bonafide target target_score target_signific~ target_present target_expressi~
## <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 pEMT_L~ Malignan~ Endoth~ F8--LRP1 F8 LRP1 TRUE ETV4 0.771 1 1 1.00
## 2 pEMT_L~ Malignan~ Endoth~ PLAT--LRP1 PLAT LRP1 TRUE CLDN7 0.835 1 1 2.30
## 3 pEMT_L~ Malignan~ Endoth~ PLAT--LRP1 PLAT LRP1 TRUE ETV4 0.771 1 1 1.00
## 4 pEMT_L~ Malignan~ CAF_Low FGF10--FGFR2 FGF10 FGFR2 TRUE ETV4 0.771 1 1 1.00
## 5 pEMT_L~ Malignan~ CAF_Low FGF10--FGFR2 FGF10 FGFR2 TRUE WNT5A 1.40 1 1 2.01
## 6 pEMT_L~ Malignan~ CAF_Low NLGN2--NRXN3 NLGN2 NRXN3 TRUE CLDN5 0.979 1 1 0.991
## 7 pEMT_L~ Malignan~ CAF_Low NLGN2--NRXN3 NLGN2 NRXN3 TRUE ETV4 0.771 1 1 1.00
## 8 pEMT_L~ Malignan~ CAF_Low RSPO3--LGR6 RSPO3 LGR6 TRUE DDC 0.832 1 1 0.785
## 9 pEMT_L~ Malignan~ CAF_Low RSPO3--LGR6 RSPO3 LGR6 TRUE EGFL7 0.763 1 1 1.09
## 10 pEMT_L~ Malignan~ CAF_Low COMP--SDC1 COMP SDC1 TRUE CLDN7 0.835 1 1 2.30
## # ... with 8 more variables: target_expression_scaled <dbl>, target_fraction <dbl>, ligand_target_weight <dbl>, activity <dbl>,
## # activity_normalized <dbl>, scaled_activity <dbl>, scaled_activity_normalized <dbl>, prioritization_score <dbl>
8. Visualization of the Differential NicheNet output
Differential expression of ligand and expression
在可視化之前,需要定義每個(gè)生態(tài)位中最重要的配體-受體對(duì)旨巷。 將通過(guò)首先確定每個(gè)配體/配體-受體對(duì)在哪個(gè)生態(tài)位上找到最高分來(lái)做到這一點(diǎn)巨缘。 然后獲得每個(gè)niche的前 50 個(gè)配體。
top_ligand_niche_df = prioritization_tables$prioritization_tbl_ligand_receptor %>% select(niche, sender, receiver, ligand, receptor, prioritization_score) %>% group_by(ligand) %>% top_n(1, prioritization_score) %>% ungroup() %>% select(ligand, receptor, niche) %>% rename(top_niche = niche)
top_ligand_receptor_niche_df = prioritization_tables$prioritization_tbl_ligand_receptor %>% select(niche, sender, receiver, ligand, receptor, prioritization_score) %>% group_by(ligand, receptor) %>% top_n(1, prioritization_score) %>% ungroup() %>% select(ligand, receptor, niche) %>% rename(top_niche = niche)
ligand_prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% select(niche, sender, receiver, ligand, prioritization_score) %>% group_by(ligand, niche) %>% top_n(1, prioritization_score) %>% ungroup() %>% distinct() %>% inner_join(top_ligand_niche_df) %>% filter(niche == top_niche) %>% group_by(niche) %>% top_n(50, prioritization_score) %>% ungroup() # get the top50 ligands per niche
Now we will look first at the top ligand-receptor pairs for KCs (here, we will take the top 2 scoring receptors per prioritized ligand)
receiver_oi = "Malignant_High"
filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% pull(ligand) %>% unique()
prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand, receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup()
lfc_plot = make_ligand_receptor_lfc_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, plot_legend = FALSE, heights = NULL, widths = NULL)
lfc_plot
Show the spatialDE as additional information
lfc_plot_spatial = make_ligand_receptor_lfc_spatial_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, ligand_spatial = include_spatial_info_sender, receptor_spatial = include_spatial_info_receiver, plot_legend = FALSE, heights = NULL, widths = NULL)
lfc_plot_spatial
Ligand expression, activity and target genes
Active target gene inference - cf Default NicheNet
Now: visualization of ligand activity and ligand-target links
exprs_activity_target_plot = make_ligand_activity_target_exprs_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, prioritization_tables$prioritization_tbl_ligand_target, output$exprs_tbl_ligand, output$exprs_tbl_target, lfc_cutoff, ligand_target_matrix, plot_legend = FALSE, heights = NULL, widths = NULL)
exprs_activity_target_plot$combined_plot
基于此圖采呐,可以推斷出許多假設(shè)若锁,例如:“有趣的是,IL1 家族配體似乎具有誘導(dǎo)高 pEMT 和低 pEMT 惡性細(xì)胞之間的 DE 基因的活性懈万; 它們主要由骨髓細(xì)胞表達(dá)拴清,這是一種 pEMT 高腫瘤特有的細(xì)胞類(lèi)型”靶病。
重要:具有高差異表達(dá)(或條件特異性)和配體活性(=靶基因富集)的配體-受體對(duì)是非常有趣的預(yù)測(cè),作為您感興趣的細(xì)胞間通訊過(guò)程的關(guān)鍵調(diào)節(jié)因子口予!
如果該圖包含太多信息娄周,因?yàn)椴榭戳嗽S多匹配項(xiàng)(前 50 個(gè)配體),您當(dāng)然也可以為較少的配體制作此圖沪停,例如前 20 個(gè)配體煤辨。
filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% top_n(20, prioritization_score) %>% pull(ligand) %>% unique()
prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand, receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup()
exprs_activity_target_plot = make_ligand_activity_target_exprs_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, prioritization_tables$prioritization_tbl_ligand_target, output$exprs_tbl_ligand, output$exprs_tbl_target, lfc_cutoff, ligand_target_matrix, plot_legend = FALSE, heights = NULL, widths = NULL)
exprs_activity_target_plot$combined_plot
Circos plot of prioritized ligand-receptor pairs
Because a top50 is too much to visualize in a circos plot, we will only visualize the top 15.
filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% top_n(15, prioritization_score) %>% pull(ligand) %>% unique()
prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand, receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup()
colors_sender = brewer.pal(n = prioritized_tbl_oi$sender %>% unique() %>% sort() %>% length(), name = 'Spectral') %>% magrittr::set_names(prioritized_tbl_oi$sender %>% unique() %>% sort())
colors_receiver = c("lavender") %>% magrittr::set_names(prioritized_tbl_oi$receiver %>% unique() %>% sort())
circos_output = make_circos_lr(prioritized_tbl_oi, colors_sender, colors_receiver)
Interpretation of these results
大多數(shù)排名靠前的差異 L-R 對(duì)似乎來(lái)自?xún)H存在于 pEMT 高腫瘤中的細(xì)胞類(lèi)型。這可能部分是由于生物學(xué)(一種情況下的獨(dú)特細(xì)胞類(lèi)型可能非常重要)木张,但也可能是由于優(yōu)先排序的方式以及這些獨(dú)特的細(xì)胞類(lèi)型在其中沒(méi)有“對(duì)應(yīng)物”的事實(shí)其他niche众辨。
因?yàn)楣撬杓?xì)胞和T細(xì)胞與腫瘤微環(huán)境中的其他細(xì)胞有很大不同,它們的配體會(huì)表現(xiàn)出強(qiáng)烈的差異表達(dá)舷礼。與來(lái)自相同細(xì)胞類(lèi)型但不同生態(tài)位/條件的細(xì)胞之間的差異表達(dá)(pEMT-high 中的 CAF 與 pEMT 中的 CAF pEMT-低)鹃彻。所以結(jié)論:Differential NicheNet 的一個(gè)優(yōu)勢(shì)在于它可以處理特定條件的細(xì)胞類(lèi)型,但用戶(hù)應(yīng)該知道最終的一般分?jǐn)?shù)可能會(huì)偏向于特定條件的發(fā)送者細(xì)胞類(lèi)型妻献。因此蛛株,如果有一個(gè)案例研究,其中某些發(fā)送細(xì)胞類(lèi)型是特定于條件的育拨,建議還查看每個(gè)發(fā)送細(xì)胞類(lèi)型的top LR 對(duì)谨履。
Visualization for the other condition: pEMT-low
receiver_oi = "Malignant_Low"
filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% top_n(50, prioritization_score) %>% pull(ligand) %>% unique()
prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand, receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup()
lfc_plot = make_ligand_receptor_lfc_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, plot_legend = FALSE, heights = NULL, widths = NULL)
lfc_plot
好了,已經(jīng)分享給大家了熬丧,生活很好笋粟,有你更好,百度文庫(kù)出現(xiàn)了大量抄襲我的文章析蝴,對(duì)此我深表無(wú)奈害捕,我寫(xiě)的文章,別人掛上去賺錢(qián)闷畸,抄襲可恥吨艇,掛到百度文庫(kù)的人更可恥