10X單細(xì)胞（10X空間轉(zhuǎn)錄組）通訊分析之總結(jié)Nichenet多條件差異通訊

作者舵盈，追風(fēng)少年i

周三了论矾，好好學(xué)習(xí)，天天搬磚沛简，人生沒(méi)有倒帶齐鲤，前進(jìn)才是真理

接上一篇，10X單細(xì)胞（10X空間轉(zhuǎn)錄組）通訊分析之總結(jié)Nichenet生態(tài)位通訊比較分析椒楣，不同條件下樣本之間的差異通訊给郊，其實(shí)關(guān)于通訊最開(kāi)始的分析是細(xì)胞類(lèi)型之間，后來(lái)慢慢轉(zhuǎn)移到相對(duì)于normal捧灰，疾病之間通訊的變化淆九。

Differential NicheNet 的目標(biāo)是預(yù)測(cè)配體-受體對(duì)，它們?cè)诓煌母信d趣的生態(tài)位之間都有差異表達(dá)和活躍。關(guān)于生態(tài)位炭庙，請(qǐng)參考10X單細(xì)胞（10X空間轉(zhuǎn)錄組）通訊分析之總結(jié)Nichenet生態(tài)位通訊比較分析跪另。

0. Read in the expression data of interest, and the NicheNet ligand-receptor network and ligand-target matrix

Load in packages

library(nichenetr)
library(RColorBrewer)
library(tidyverse)
library(Seurat)

讀取表達(dá)數(shù)據(jù)，大家自己讀取分析的rds即可

seurat_obj = readRDS(url("https://zenodo.org/record/4675430/files/seurat_obj_hnscc.rds"))
DimPlot(seurat_obj, group.by = "celltype") # user adaptation required on own dataset

圖片.png

DimPlot(seurat_obj, group.by = "pEMT") # user adaptation required on own dataset

圖片.png

檢查細(xì)胞類(lèi)型的分布

table(seurat_obj@meta.data$celltype, seurat_obj@meta.data$pEMT) # cell types vs conditions # user adaptation required on own dataset
##                
##                 High  Low
##   CAF            396  104
##   Endothelial    105   53
##   Malignant     1093  549
##   Myeloid         92    7
##   myofibroblast  382   61
##   T.cell         689    3

對(duì)于Differential NicheNet煤搜，需要相互比較至少2 個(gè)生態(tài)位或條件免绿。在這種情況下，兩個(gè)生態(tài)位是 pEMT-high-niche 和 pEMT-low-niche擦盾。 fenxi 將根據(jù)細(xì)胞類(lèi)型的樣本分布調(diào)整它們的名稱(chēng)嘲驾。

seurat_obj@meta.data$celltype_aggregate = paste(seurat_obj@meta.data$celltype, seurat_obj@meta.data$pEMT,sep = "_") # user adaptation required on own dataset
DimPlot(seurat_obj, group.by = "celltype_aggregate")

圖片.png

seurat_obj@meta.data$celltype_aggregate %>% table() %>% sort(decreasing = TRUE)
## .
##     Malignant_High        T.cell_High      Malignant_Low           CAF_High myofibroblast_High   Endothelial_High            CAF_Low 
##               1093                689                549                396                382                105                104 
##       Myeloid_High  myofibroblast_Low    Endothelial_Low        Myeloid_Low         T.cell_Low 
##                 92                 61                 53                  7                  3

celltype_id = "celltype_aggregate" # metadata column name of the cell type of interest
seurat_obj = SetIdent(seurat_obj, value = seurat_obj[[celltype_id]])

Read in the NicheNet ligand-receptor network and ligand-target matrix(讀取數(shù)據(jù)庫(kù))

ligand_target_matrix = readRDS(url("https://zenodo.org/record/3260758/files/ligand_target_matrix.rds"))
ligand_target_matrix[1:5,1:5] # target genes in rows, ligands in columns
##                 CXCL1        CXCL2        CXCL3        CXCL5         PPBP
## A1BG     3.534343e-04 4.041324e-04 3.729920e-04 3.080640e-04 2.628388e-04
## A1BG-AS1 1.650894e-04 1.509213e-04 1.583594e-04 1.317253e-04 1.231819e-04
## A1CF     5.787175e-04 4.596295e-04 3.895907e-04 3.293275e-04 3.211944e-04
## A2M      6.027058e-04 5.996617e-04 5.164365e-04 4.517236e-04 4.590521e-04
## A2M-AS1  8.898724e-05 8.243341e-05 7.484018e-05 4.912514e-05 5.120439e-05

lr_network = readRDS(url("https://zenodo.org/record/3260758/files/lr_network.rds"))
lr_network = lr_network %>% mutate(bonafide = ! database %in% c("ppi_prediction","ppi_prediction_go"))
lr_network = lr_network %>% dplyr::rename(ligand = from, receptor = to) %>% distinct(ligand, receptor, bonafide)

head(lr_network)
## # A tibble: 6 x 3
##   ligand receptor bonafide
##   <chr>  <chr>    <lgl>   
## 1 CXCL1  CXCR2    TRUE    
## 2 CXCL2  CXCR2    TRUE    
## 3 CXCL3  CXCR2    TRUE    
## 4 CXCL5  CXCR2    TRUE    
## 5 PPBP   CXCR2    TRUE    
## 6 CXCL6  CXCR2    TRUE

人鼠基因轉(zhuǎn)換

organism = "human" # user adaptation required on own dataset

if(organism == "mouse"){
  lr_network = lr_network %>% mutate(ligand = convert_human_to_mouse_symbols(ligand), receptor = convert_human_to_mouse_symbols(receptor)) %>% drop_na()

  colnames(ligand_target_matrix) = ligand_target_matrix %>% colnames() %>% convert_human_to_mouse_symbols()
  rownames(ligand_target_matrix) = ligand_target_matrix %>% rownames() %>% convert_human_to_mouse_symbols()
  ligand_target_matrix = ligand_target_matrix %>% .[!is.na(rownames(ligand_target_matrix)), !is.na(colnames(ligand_target_matrix))]
}

1、Define the niches/microenvironments of interest

每個(gè)生態(tài)位應(yīng)至少有一個(gè)“發(fā)送者/生態(tài)位”細(xì)胞群和一個(gè)“接收者/目標(biāo)”細(xì)胞群

在本案例研究中迹卢，希望發(fā)現(xiàn) pEMT 高和 pEMT 低腫瘤之間細(xì)胞間相互作用與惡性細(xì)胞的差異辽故。因此，pEMT-High 生態(tài)位中的接收細(xì)胞群是“Malignant_High”細(xì)胞類(lèi)型腐碱，而在 pEMT-Low 生態(tài)位中誊垢，這是“Malignant_Low”。感興趣的發(fā)送細(xì)胞群是肌成纖維細(xì)胞症见、內(nèi)皮細(xì)胞喂走、CAF、T 細(xì)胞和骨髓細(xì)胞谋作。重要的是芋肠，只在 pEMT-High 生態(tài)位中包括 T.Cell 和 Myeloid，因?yàn)檫@些群體的細(xì)胞太少存在于 pEMT-low 生態(tài)位中遵蚜。因此帖池，證明了在分析中包括特定條件細(xì)胞類(lèi)型的可能性——這是可能的，因?yàn)橛?jì)算了與其他生態(tài)位的所有發(fā)送細(xì)胞相比的 DE吭净，而不僅僅是同一細(xì)胞的 pEMT-low 細(xì)胞類(lèi)型睡汹。

niches = list(
  "pEMT_High_niche" = list(
    "sender" = c("myofibroblast_High", "Endothelial_High", "CAF_High", "T.cell_High", "Myeloid_High"),
    "receiver" = c("Malignant_High")),
  "pEMT_Low_niche" = list(
    "sender" = c("myofibroblast_Low",  "Endothelial_Low", "CAF_Low"),
    "receiver" = c("Malignant_Low"))
  ) # user adaptation required on own dataset

2. Calculate differential expression between the niches

在這一步中，將為發(fā)送者和接收者確定不同生態(tài)位之間的 DE寂殉，以定義 L-R 對(duì)的 DE囚巴。

Calculate DE

計(jì)算 differential expression的方法在這里是標(biāo)準(zhǔn)的 Seurat Wilcoxon 檢驗(yàn)，但如果需要不撑，可以將其替換（唯一要求：輸出表 DE_sender_processed 和 DE_receiver_processed 應(yīng)采用與此處所示相同的格式）文兢。

將為每個(gè)成對(duì)的發(fā)送者（或接收者）細(xì)胞類(lèi)型在生態(tài)位之間的比較計(jì)算 DE（因此跨生態(tài)位，而不是在生態(tài)位內(nèi)）焕檬。在分析的案例研究中姆坚，這意味著 myofibroblast_High 配體的 DE 將通過(guò) myofibroblast_High 與 myofibroblast_Low 的 DE 分析來(lái)計(jì)算；肌成纖維細(xì)胞高 vs 內(nèi)皮低实愚；和肌成纖維細(xì)胞高 vs CAF低兼呵。根據(jù)細(xì)胞類(lèi)型拆分細(xì)胞兔辅，而不是合并來(lái)自其他生態(tài)位的所有細(xì)胞，以避免 DE 分析由最豐富的細(xì)胞類(lèi)型驅(qū)動(dòng)（和上一篇差異巨大的地方）击喂。

assay_oi = "SCT" # other possibilities: RNA,...
DE_sender = calculate_niche_de(seurat_obj = seurat_obj %>% subset(features = lr_network$ligand %>% unique()), niches = niches, type = "sender", assay_oi = assay_oi) # only ligands important for sender cell types
## [1] "Calculate Sender DE between: myofibroblast_High and myofibroblast_Low"
## [2] "Calculate Sender DE between: myofibroblast_High and Endothelial_Low"  
## [3] "Calculate Sender DE between: myofibroblast_High and CAF_Low"          
## [1] "Calculate Sender DE between: Endothelial_High and myofibroblast_Low"
## [2] "Calculate Sender DE between: Endothelial_High and Endothelial_Low"  
## [3] "Calculate Sender DE between: Endothelial_High and CAF_Low"          
## [1] "Calculate Sender DE between: CAF_High and myofibroblast_Low" "Calculate Sender DE between: CAF_High and Endothelial_Low"  
## [3] "Calculate Sender DE between: CAF_High and CAF_Low"          
## [1] "Calculate Sender DE between: T.cell_High and myofibroblast_Low" "Calculate Sender DE between: T.cell_High and Endothelial_Low"  
## [3] "Calculate Sender DE between: T.cell_High and CAF_Low"          
## [1] "Calculate Sender DE between: Myeloid_High and myofibroblast_Low" "Calculate Sender DE between: Myeloid_High and Endothelial_Low"  
## [3] "Calculate Sender DE between: Myeloid_High and CAF_Low"          
## [1] "Calculate Sender DE between: myofibroblast_Low and myofibroblast_High"
## [2] "Calculate Sender DE between: myofibroblast_Low and Endothelial_High"  
## [3] "Calculate Sender DE between: myofibroblast_Low and CAF_High"          
## [4] "Calculate Sender DE between: myofibroblast_Low and T.cell_High"       
## [5] "Calculate Sender DE between: myofibroblast_Low and Myeloid_High"      
## [1] "Calculate Sender DE between: Endothelial_Low and myofibroblast_High"
## [2] "Calculate Sender DE between: Endothelial_Low and Endothelial_High"  
## [3] "Calculate Sender DE between: Endothelial_Low and CAF_High"          
## [4] "Calculate Sender DE between: Endothelial_Low and T.cell_High"       
## [5] "Calculate Sender DE between: Endothelial_Low and Myeloid_High"      
## [1] "Calculate Sender DE between: CAF_Low and myofibroblast_High" "Calculate Sender DE between: CAF_Low and Endothelial_High"  
## [3] "Calculate Sender DE between: CAF_Low and CAF_High"           "Calculate Sender DE between: CAF_Low and T.cell_High"       
## [5] "Calculate Sender DE between: CAF_Low and Myeloid_High"
DE_receiver = calculate_niche_de(seurat_obj = seurat_obj %>% subset(features = lr_network$receptor %>% unique()), niches = niches, type = "receiver", assay_oi = assay_oi) # only receptors now, later on: DE analysis to find targets
## # A tibble: 1 x 2
##   receiver       receiver_other_niche
##   <chr>          <chr>               
## 1 Malignant_High Malignant_Low       
## [1] "Calculate receiver DE between: Malignant_High and Malignant_Low"
## [1] "Calculate receiver DE between: Malignant_Low and Malignant_High"
DE_sender = DE_sender %>% mutate(avg_log2FC = ifelse(avg_log2FC == Inf, max(avg_log2FC[is.finite(avg_log2FC)]), ifelse(avg_log2FC == -Inf, min(avg_log2FC[is.finite(avg_log2FC)]), avg_log2FC)))
DE_receiver = DE_receiver %>% mutate(avg_log2FC = ifelse(avg_log2FC == Inf, max(avg_log2FC[is.finite(avg_log2FC)]), ifelse(avg_log2FC == -Inf, min(avg_log2FC[is.finite(avg_log2FC)]), avg_log2FC)))

Process DE results:

expression_pct = 0.10
DE_sender_processed = process_niche_de(DE_table = DE_sender, niches = niches, expression_pct = expression_pct, type = "sender")
DE_receiver_processed = process_niche_de(DE_table = DE_receiver, niches = niches, expression_pct = expression_pct, type = "receiver")

Combine sender-receiver DE based on L-R pairs:

如上所述维苔，來(lái)自一種配體細(xì)胞類(lèi)型的配體的 DE 確定為計(jì)算該細(xì)胞類(lèi)型與其他生態(tài)位的所有配體細(xì)胞類(lèi)型之間的 DE。為了總結(jié)該細(xì)胞類(lèi)型配體的 DE懂昂，有幾個(gè)選擇：可以采用平均 LFC介时，但也可以采用與其他生態(tài)位相比的最小 LFC。建議使用最小 LFC凌彬，因?yàn)檫@是對(duì)配體表達(dá)的最強(qiáng)特異性測(cè)量沸柔，因?yàn)楦?min LFC 意味著與所有生態(tài)位 2 的細(xì)胞類(lèi)型相比，配體在生態(tài)位 1 的細(xì)胞類(lèi)型中的表達(dá)更強(qiáng)（與高平均 LFC铲敛，不排除生態(tài)位 2 中的一種或多種細(xì)胞類(lèi)型也強(qiáng)烈表達(dá)該配體）褐澎。

specificity_score_LR_pairs = "min_lfc"
DE_sender_receiver = combine_sender_receiver_de(DE_sender_processed, DE_receiver_processed, lr_network, specificity_score = specificity_score_LR_pairs)

3. Optional: Calculate differential expression between the different spatial regions(針對(duì)有空間的數(shù)據(jù))

為了改進(jìn)細(xì)胞-細(xì)胞相互作用預(yù)測(cè)，如果可能且適用伐蒋，可以考慮空間信息工三。空間信息可以來(lái)自顯微鏡數(shù)據(jù)先鱼，也可以來(lái)自空間轉(zhuǎn)錄組學(xué)數(shù)據(jù)俭正，例如 Visium。

有幾種方法可以將空間信息合并到差分 NicheNet pipeline中型型。首先段审，如果細(xì)胞類(lèi)型位于相同的空間位置，則只能將它們視為屬于同一生態(tài)位闹蒜。另一種方法是在優(yōu)先框架中包括一種細(xì)胞類(lèi)型內(nèi)配體-受體對(duì)的空間差異表達(dá)。

例如：有一個(gè)細(xì)胞類(lèi)型 X抑淫，位于區(qū)域 A 和 B绷落，想研究區(qū)域 A 的細(xì)胞間通信。首先在生態(tài)位定義中只添加區(qū)域 A 的 celltypeX始苇，然后計(jì)算 celltypeX-regionA 之間的 DE和 celltypeX-regionB 為 regionA 特異性配體提供更高的優(yōu)先權(quán)砌烁。

在這個(gè)案例研究中，感興趣的區(qū)域是腫瘤前沿催式，因?yàn)閷⒃搮^(qū)域定義為對(duì) pEMT 過(guò)程很重要函喉。還將 CAF 定義為靠近前緣的成纖維細(xì)胞，而其他成纖維細(xì)胞（肌成纖維細(xì)胞）不優(yōu)先位于腫瘤前緣荣月。因此管呵，現(xiàn)在可以通過(guò)查看前沿成纖維細(xì)胞 (=CAF) 和非前沿成纖維細(xì)胞 (肌成纖維細(xì)胞) 之間的 DE 配體來(lái)進(jìn)一步優(yōu)先考慮成纖維細(xì)胞配體。

We do this as follows, by first defining a ‘spatial info’ dataframe. If no spatial information in your data: set the following two parameters to FALSE, and make a mock ‘spatial_info’ data frame.

include_spatial_info_sender = TRUE # if not spatial info to include: put this to false # user adaptation required on own dataset
include_spatial_info_receiver = FALSE # if spatial info to include: put this to true # user adaptation required on own dataset

spatial_info = tibble(celltype_region_oi = "CAF_High", celltype_other_region = "myofibroblast_High", niche =  "pEMT_High_niche", celltype_type = "sender") # user adaptation required on own dataset
specificity_score_spatial = "lfc"

# this is how this should be defined if you don't have spatial info
# mock spatial info
if(include_spatial_info_sender == FALSE & include_spatial_info_receiver == FALSE){
    spatial_info = tibble(celltype_region_oi = NA, celltype_other_region = NA) %>% mutate(niche =  niches %>% names() %>% head(1), celltype_type = "sender")
}

if(include_spatial_info_sender == TRUE){
  sender_spatial_DE = calculate_spatial_DE(seurat_obj = seurat_obj %>% subset(features = lr_network$ligand %>% unique()), spatial_info = spatial_info %>% filter(celltype_type == "sender"))
  sender_spatial_DE_processed = process_spatial_de(DE_table = sender_spatial_DE, type = "sender", lr_network = lr_network, expression_pct = expression_pct, specificity_score = specificity_score_spatial)

  # add a neutral spatial score for sender celltypes in which the spatial is not known / not of importance
  sender_spatial_DE_others = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "sender", lr_network = lr_network)
  sender_spatial_DE_processed = sender_spatial_DE_processed %>% bind_rows(sender_spatial_DE_others)

  sender_spatial_DE_processed = sender_spatial_DE_processed %>% mutate(scaled_ligand_score_spatial = scale_quantile_adapted(ligand_score_spatial))

} else {
  # # add a neutral spatial score for all sender celltypes (for none of them, spatial is relevant in this case)
  sender_spatial_DE_processed = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "sender", lr_network = lr_network)
  sender_spatial_DE_processed = sender_spatial_DE_processed %>% mutate(scaled_ligand_score_spatial = scale_quantile_adapted(ligand_score_spatial))  

}
## [1] "Calculate Spatial DE between: CAF_High and myofibroblast_High"

if(include_spatial_info_receiver == TRUE){
  receiver_spatial_DE = calculate_spatial_DE(seurat_obj = seurat_obj %>% subset(features = lr_network$receptor %>% unique()), spatial_info = spatial_info %>% filter(celltype_type == "receiver"))
  receiver_spatial_DE_processed = process_spatial_de(DE_table = receiver_spatial_DE, type = "receiver", lr_network = lr_network, expression_pct = expression_pct, specificity_score = specificity_score_spatial)

  # add a neutral spatial score for receiver celltypes in which the spatial is not known / not of importance
  receiver_spatial_DE_others = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "receiver", lr_network = lr_network)
  receiver_spatial_DE_processed = receiver_spatial_DE_processed %>% bind_rows(receiver_spatial_DE_others)

  receiver_spatial_DE_processed = receiver_spatial_DE_processed %>% mutate(scaled_receptor_score_spatial = scale_quantile_adapted(receptor_score_spatial))

} else {
    # # add a neutral spatial score for all receiver celltypes (for none of them, spatial is relevant in this case)
  receiver_spatial_DE_processed = get_non_spatial_de(niches = niches, spatial_info = spatial_info, type = "receiver", lr_network = lr_network)
  receiver_spatial_DE_processed = receiver_spatial_DE_processed %>% mutate(scaled_receptor_score_spatial = scale_quantile_adapted(receptor_score_spatial))
}

4. Calculate ligand activities and infer active ligand-target links

在這一步中哺窄，將預(yù)測(cè)每個(gè)配體在不同生態(tài)位中的每種受體細(xì)胞類(lèi)型的配體活性捐下。這類(lèi)似于在正常 NicheNet pipeline中進(jìn)行的配體活性分析账锹。

為了計(jì)算配體活性，首先需要為每個(gè)生態(tài)位定義一個(gè)感興趣的geneset坷襟。在這個(gè)案例研究中奸柬，與 pEMT 低腫瘤相比，pEMT 高生態(tài)位的geneset是 pEMT 高腫瘤中上調(diào)的基因婴程，反之亦然廓奕。

還可以以不同的方式定義這些感興趣的基因集！

lfc_cutoff = 0.15 # recommended for 10x as min_lfc cutoff. 
specificity_score_targets = "min_lfc"

DE_receiver_targets = calculate_niche_de_targets(seurat_obj = seurat_obj, niches = niches, lfc_cutoff = lfc_cutoff, expression_pct = expression_pct, assay_oi = assay_oi) 
## [1] "Calculate receiver DE between: Malignant_High and Malignant_Low"
## [1] "Calculate receiver DE between: Malignant_Low and Malignant_High"
DE_receiver_processed_targets = process_receiver_target_de(DE_receiver_targets = DE_receiver_targets, niches = niches, expression_pct = expression_pct, specificity_score = specificity_score_targets)

background = DE_receiver_processed_targets  %>% pull(target) %>% unique()
geneset_niche1 = DE_receiver_processed_targets %>% filter(receiver == niches[[1]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
geneset_niche2 = DE_receiver_processed_targets %>% filter(receiver == niches[[2]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
  
# Good idea to check which genes will be left out of the ligand activity analysis (=when not present in the rownames of the ligand-target matrix).
# If many genes are left out, this might point to some issue in the gene naming (eg gene aliases and old gene symbols, bad human-mouse mapping)
geneset_niche1 %>% setdiff(rownames(ligand_target_matrix))
##  [1] "ANXA8L2"       "PRKCDBP"       "IL8"           "PTRF"          "SEPP1"         "C1orf186"      "CCDC109B"      "C10orf54"     
##  [9] "LEPREL1"       "ZNF812"        "LOC645638"     "LOC401397"     "LINC00162"     "DFNA5"         "PLK1S1"        "ZMYM6NB"      
## [17] "C19orf10"      "CTSL1"         "SQRDL"         "LOC375295"     "WBP5"          "LOC100505633"  "AIM1"          "C1orf63"      
## [25] "LOC100507463"  "GPR115"        "VIMP"          "SEP15"         "C1orf172"      "NAPRT1"        "LHFP"          "KRT16P1"      
## [33] "C7orf10"       "PTPLA"         "GRAMD3"        "CPSF3L"        "MESDC2"        "C10orf10"      "KIAA1609"      "CCDC53"       
## [41] "TXLNG2P"       "NGFRAP1"       "ERO1L"         "FAM134A"       "LSMD1"         "TCEB2"         "B3GALTL"       "HN1L"         
## [49] "LOC550643"     "KIAA0922"      "GLT25D1"       "FAM127A"       "C1orf151-NBL1" "SEPW1"         "GPR126"        "LOC100505806" 
## [57] "LINC00478"     "TCEB1"         "GRAMD2"        "GNB2L1"        "KIRREL"
geneset_niche2 %>% setdiff(rownames(ligand_target_matrix))
##   [1] "LOC344887"    "AGPAT9"       "C1orf110"     "KIAA1467"     "LOC100292680" "EPT1"         "CT45A4"       "LOC654433"   
##   [9] "UPK3BL"       "LINC00340"    "LOC100128338" "FAM60A"       "CCDC144C"     "LOC401109"    "LOC286467"    "LEPREL4"     
##  [17] "LOC731275"    "LOC642236"    "LINC00516"    "LOC101101776" "SC5DL"        "PVRL4"        "LOC100130093" "LINC00338"   
##  [25] "LOC100132891" "PPAP2C"       "C6orf1"       "C2orf47"      "WHSC1L1"      "LOC100289019" "SETD8"        "KDM5B-AS1"   
##  [33] "SPG20"        "CXCR7"        "LOC100216479" "LOC100505761" "MGC57346"     "LPHN3"        "CENPC1"       "C11orf93"    
##  [41] "C14orf169"    "LOC100506060" "FLJ31485"     "LOC440905"    "MLF1IP"       "TMEM194A"     "RRP7B"        "REXO1L1"     
##  [49] "LOC100129269" "KIAA1715"     "CTAGE5"       "LOC202781"    "LOC100506714" "LOC401164"    "UTS2D"        "LOC146880"   
##  [57] "KIAA1804"     "C5orf55"      "C21orf119"    "PRUNE"        "LRRC16A"      "LOC339240"    "FLJ35024"     "C5orf28"     
##  [65] "LOC100505876" "MGC21881"     "LOC100133985" "PPAPDC2"      "FRG1B"        "CECR5"        "LOC100129361" "CCBL1"       
##  [73] "PTPLAD1"      "MST4"         "LOC550112"    "LOC389791"    "CCDC90A"      "KIAA0195"     "LOC100506469" "LOC100133161"
##  [81] "LOC646719"    "LOC728819"    "BRE"          "LOC284581"    "LOC441081"    "LOC728377"    "LOC100134229" "C3orf65"     
##  [89] "SMEK2"        "KIAA1737"     "C17orf70"     "PLEKHM1P"     "LOC338758"    "PCNXL2"       "LOC91948"     "C17orf89"    
##  [97] "LOC100505783" "SMCR7L"       "C8orf4"       "GPR56"        "ATHL1"        "LOC339535"    "PPAPDC1B"     "DAK"         
## [105] "LOC100507173" "CRHR1-IT1"    "PPAP2B"       "ADCK4"        "KIAA0146"     "GYLTL1B"      "LOC100272216" "LOC400027"   
## [113] "WHSC1"        "LOC100130855" "C7orf55"      "C19orf40"     "ADCK3"        "C9orf142"     "SGOL1"        "LOC90834"    
## [121] "PTPLAD2"      "KIAA1967"     "LOC100132352" "LOC100630918" "ADRBK2"       "LINC00263"    "FAM64A"       "LOC401074"   
## [129] "FAM179B"      "RP1-177G6.2"  "METTL21D"     "ERO1LB"       "FLJ45445"     "NADKD1"       "LOC100506233" "LOC100652772"
## [137] "FAM175A"      "LINC00630"    "C11orf82"     "SETD5-AS1"    "SGK196"       "FLJ14186"     "CCDC104"      "FAM63A"      
## [145] "NARG2"        "MTERFD1"      "CCDC74B-AS1"  "LOC286186"    "WDR67"        "C12orf52"     "FLJ30403"     "KIAA2018"    
## [153] "GCN1L1"       "FLJ43681"     "LOC152217"    "FONG"         "C18orf8"      "ALG1L9P"      "GTDC2"        "LOC100507217"
## [161] "NBPF24"       "WBSCR27"      "C14orf1"      "LOC284889"    "KIAA0317"     "FAM65A"       "PMS2L2"       "LUST"        
## [169] "C15orf52"     "FAM195A"      "LOC399744"    "PYCRL"        "LOC338799"    "LOC100506190" "C9orf91"      "FLJ45340"    
## [177] "LOC349196"    "LOC100128881" "TOMM70A"      "ALS2CR8"      "LDOC1L"       "HDGFRP3"      "ZNF767"       "LOC728558"   
## [185] "LOC283693"    "LEPREL2"      "QTRTD1"       "SELM"         "C6orf25"      "C1orf86"      "HNRPLL"       "LOC145820"   
## [193] "LOC100289341" "C17orf85"     "C3orf72"      "C14orf64"     "C9orf9"       "LOC100506394"

length(geneset_niche1)
## [1] 1668
length(geneset_niche2)
## [1] 2889

在進(jìn)行配體活性分析之前檢查geneset中的基因數(shù)量档叔。 建議在感興趣的基因集中包含 20 到 1000 個(gè)基因桌粉，以及至少 5000 個(gè)基因的背景，以便進(jìn)行適當(dāng)?shù)呐潴w活性分析蹲蒲。如果檢索到的 DE 基因過(guò)多番甩，建議使用更高的 lfc_cutoff 閾值。 如果有 > 2 個(gè)接收細(xì)胞/niche來(lái)比較并使用 min_lfc 作為特異性得分届搁，建議使用 0.15 的截止值缘薛。如果只有 2 個(gè)receiver/niche，建議使用更高的閾值（例如使用 0.25）卡睦。如果有 Smart-seq2 等具有高測(cè)序深度的單細(xì)胞數(shù)據(jù)宴胧，建議也使用更高的閾值。

As we see here, we have Smart-seq2 data and only 2 niches to compare, so we will use a stronger LFC threshold to keep less DE genes, but more trustworthy ones.

lfc_cutoff = 0.75 

specificity_score_targets = "min_lfc"

DE_receiver_processed_targets = process_receiver_target_de(DE_receiver_targets = DE_receiver_targets, niches = niches, expression_pct = expression_pct, specificity_score = specificity_score_targets)
  
background = DE_receiver_processed_targets  %>% pull(target) %>% unique()
geneset_niche1 = DE_receiver_processed_targets %>% filter(receiver == niches[[1]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
geneset_niche2 = DE_receiver_processed_targets %>% filter(receiver == niches[[2]]$receiver & target_score >= lfc_cutoff & target_significant == 1 & target_present == 1) %>% pull(target) %>% unique()
  
# Good idea to check which genes will be left out of the ligand activity analysis (=when not present in the rownames of the ligand-target matrix).
# If many genes are left out, this might point to some issue in the gene naming (eg gene aliases and old gene symbols, bad human-mouse mapping)
geneset_niche1 %>% setdiff(rownames(ligand_target_matrix))
## [1] "ANXA8L2"  "PRKCDBP"  "IL8"      "PTRF"     "SEPP1"    "C1orf186"
geneset_niche2 %>% setdiff(rownames(ligand_target_matrix))
## [1] "LOC344887"    "AGPAT9"       "C1orf110"     "KIAA1467"     "LOC100292680" "EPT1"         "CT45A4"

length(geneset_niche1)
## [1] 169
length(geneset_niche2)
## [1] 136

top_n_target = 250

niche_geneset_list = list(
  "pEMT_High_niche" = list(
    "receiver" = niches[[1]]$receiver,
    "geneset" = geneset_niche1,
    "background" = background),
  "pEMT_Low_niche" = list(
    "receiver" = niches[[2]]$receiver,
    "geneset" = geneset_niche2 ,
    "background" = background)
  )
  
ligand_activities_targets = get_ligand_activities_targets(niche_geneset_list = niche_geneset_list, ligand_target_matrix = ligand_target_matrix, top_n_target = top_n_target)
## [1] "Calculate Ligand activities for: Malignant_High"
## [1] "Calculate Ligand activities for: Malignant_Low"

5. Calculate (scaled) expression of ligands, receptors and targets across cell types of interest (log expression values and expression fractions)

在這一步中表锻，將計(jì)算所有感興趣的細(xì)胞類(lèi)型的配體恕齐、受體和靶基因的平均（縮放）表達(dá)和表達(dá)分?jǐn)?shù)。現(xiàn)在瞬逊，這通過(guò) Seurat 的 DotPlot 函數(shù)進(jìn)行了演示显歧，但當(dāng)然也可以通過(guò)其他方式完成。

features_oi = union(lr_network$ligand, lr_network$receptor) %>% union(ligand_activities_targets$target) %>% setdiff(NA)
  
dotplot = suppressWarnings(Seurat::DotPlot(seurat_obj %>% subset(idents = niches %>% unlist() %>% unique()), features = features_oi, assay = assay_oi))
exprs_tbl = dotplot$data %>% as_tibble()
exprs_tbl = exprs_tbl %>% rename(celltype = id, gene = features.plot, expression = avg.exp, expression_scaled = avg.exp.scaled, fraction = pct.exp) %>%
    mutate(fraction = fraction/100) %>% as_tibble() %>% select(celltype, gene, expression, expression_scaled, fraction) %>% distinct() %>% arrange(gene) %>% mutate(gene = as.character(gene))
  
exprs_tbl_ligand = exprs_tbl %>% filter(gene %in% lr_network$ligand) %>% rename(sender = celltype, ligand = gene, ligand_expression = expression, ligand_expression_scaled = expression_scaled, ligand_fraction = fraction) 
exprs_tbl_receptor = exprs_tbl %>% filter(gene %in% lr_network$receptor) %>% rename(receiver = celltype, receptor = gene, receptor_expression = expression, receptor_expression_scaled = expression_scaled, receptor_fraction = fraction)
exprs_tbl_target = exprs_tbl %>% filter(gene %in% ligand_activities_targets$target) %>% rename(receiver = celltype, target = gene, target_expression = expression, target_expression_scaled = expression_scaled, target_fraction = fraction)

exprs_tbl_ligand = exprs_tbl_ligand %>%  mutate(scaled_ligand_expression_scaled = scale_quantile_adapted(ligand_expression_scaled)) %>% mutate(ligand_fraction_adapted = ligand_fraction) %>% mutate_cond(ligand_fraction >= expression_pct, ligand_fraction_adapted = expression_pct)  %>% mutate(scaled_ligand_fraction_adapted = scale_quantile_adapted(ligand_fraction_adapted))

exprs_tbl_receptor = exprs_tbl_receptor %>% mutate(scaled_receptor_expression_scaled = scale_quantile_adapted(receptor_expression_scaled))  %>% mutate(receptor_fraction_adapted = receptor_fraction) %>% mutate_cond(receptor_fraction >= expression_pct, receptor_fraction_adapted = expression_pct)  %>% mutate(scaled_receptor_fraction_adapted = scale_quantile_adapted(receptor_fraction_adapted))

6. Expression fraction and receptor

在這一步中确镊，將根據(jù)受體的表達(dá)強(qiáng)度對(duì)配體-受體相互作用進(jìn)行評(píng)分士骤，這樣就可以對(duì)特定細(xì)胞類(lèi)型中特定配體的最強(qiáng)表達(dá)受體給予更高的分?jǐn)?shù)。這不會(huì)影響以后單個(gè)配體的排名蕾域，但將有助于優(yōu)先考慮每個(gè)配體最重要的受體拷肌。

exprs_sender_receiver = lr_network %>% 
  inner_join(exprs_tbl_ligand, by = c("ligand")) %>% 
  inner_join(exprs_tbl_receptor, by = c("receptor")) %>% inner_join(DE_sender_receiver %>% distinct(niche, sender, receiver))
  
ligand_scaled_receptor_expression_fraction_df = exprs_sender_receiver %>% group_by(ligand, receiver) %>% mutate(rank_receptor_expression = dense_rank(receptor_expression), rank_receptor_fraction  = dense_rank(receptor_fraction)) %>% mutate(ligand_scaled_receptor_expression_fraction = 0.5*( (rank_receptor_fraction / max(rank_receptor_fraction)) + ((rank_receptor_expression / max(rank_receptor_expression))) ) )  %>% distinct(ligand, receptor, receiver, ligand_scaled_receptor_expression_fraction, bonafide) %>% distinct() %>% ungroup()

7. Prioritization of ligand-receptor and ligand-target links

In this step, we will combine all the above calculated information to prioritize ligand-receptor-target links. We scale every property of interest between 0 and 1, and the final prioritization score is a weighted sum of the scaled scores of all the properties of interest.

We provide the user the option to consider the following properties for prioritization (of which the weights are defined in prioritizing_weights) :

Ligand DE score: niche-specific expression of the ligand: by default, this the minimum logFC between the sender of interest and all the senders of the other niche(s). The higher the min logFC, the higher the niche-specificity of the ligand. Therefore we recommend to give this factor a very high weight. prioritizing_weights argument: "scaled_ligand_score". Recommended weight: 5 (at least 1, max 5).
Scaled ligand expression: scaled expression of a ligand in one sender compared to the other cell types in the dataset. This might be useful to rescue potentially interesting ligands that have a high scaled expression value, but a relatively small min logFC compared to the other niche. One reason why this logFC might be small occurs when (some) genes are not picked up efficiently by the used sequencing technology (or other reasons for low RNA expression of ligands). For example, we have observed that many ligands from the Tgf-beta/BMP family are not picked up efficiently with single-nuclei RNA sequencing compared to single-cell sequencing. prioritizing_weights argument: "scaled_ligand_expression_scaled". Recommended weight: 1 (unless technical reason for lower gene detection such as while using Nuc-seq: then recommended to use a higher weight: 2).
Ligand expression fraction: Ligands that are expressed in a smaller fraction of cells of a cell type than defined by exprs_cutoff(default: 0.10) will get a lower ranking, proportional to their fraction (eg ligand expressed in 9% of cells will be ranked higher than ligand expressed in 0.5% of cells). We opted for this weighting based on fraction, instead of removing ligands that are not expressed in more cells than this cutoff, because some interesting ligands could be removed that way. Fraction of expression is not taken into account for the prioritization if it is already higher than the cutoff. prioritizing_weights argument: "ligand_fraction". Recommended weight: 1.
Ligand spatial DE score: spatial expression specificity of the ligand. If the niche of interest is at a specific tissue location, but some of the sender cell types of that niche are also present in other locations, it can be very informative to further prioritize ligands of that sender by looking how they are DE between the spatial location of interest compared to the other locations. prioritizing_weights argument: "scaled_ligand_score_spatial". Recommended weight: 2 (or 0 if not applicable).
Receptor DE score: niche-specific expression of the receptor: by default, this the minimum logFC between the receiver of interest and all the receiver of the other niche(s). The higher the min logFC, the higher the niche-specificity of the receptor. Based on our experience, we don’t suggest to give this as high importance as the ligand DE, but this might depend on the specific case study. prioritizing_weights argument: "scaled_receptor_score". Recommended weight: 0.5 (at least 0.5, and lower than "scaled_ligand_score").
Scaled receptor expression: scaled expression of a receptor in one receiver compared to the other cell types in the dataset. This might be useful to rescue potentially interesting receptors that have a high scaled expression value, but a relatively small min logFC compared to the other niche. One reason why this logFC might be small occurs when (some) genes are not picked up efficiently by the used sequencing technology. prioritizing_weights argument: "scaled_receptor_expression_scaled". Recommended weight: 0.5.
Receptor expression fraction: Receptors that are expressed in a smaller fraction of cells of a cell type than defined by exprs_cutoff(default: 0.10) will get a lower ranking, proportional to their fraction (eg receptor expressed in 9% of cells will be ranked higher than receptor expressed in 0.5% of cells). We opted for this weighting based on fraction, instead of removing receptors that are not expressed in more cells than this cutoff, because some interesting receptors could be removed that way. Fraction of expression is not taken into account for the prioritization if it is already higher than the cutoff. prioritizing_weights argument: "receptor_fraction". Recommended weight: 1.
Receptor expression strength: this factor let us give higher weights to the most highly expressed receptor of a ligand in the receiver. This let us rank higher one member of a receptor family if it higher expressed than the other members. prioritizing_weights argument: "ligand_scaled_receptor_expression_fraction". Recommended value: 1 (minimum: 0.5).
Receptor spatial DE score: spatial expression specificity of the receptor. If the niche of interest is at a specific tissue location, but the receiver cell type of that niche is also present in other locations, it can be very informative to further prioritize receptors of that receiver by looking how they are DE between the spatial location of interest compared to the other locations. prioritizing_weights argument: "scaled_receptor_score_spatial". Recommended weight: 1 (or 0 if not applicable).
Absolute ligand activity: to further prioritize ligand-receptor pairs based on their predicted effect of the ligand-receptor interaction on the gene expression in the receiver cell type - absolute ligand activity accords to ‘a(chǎn)bsolute’ enrichment of target genes of a ligand within the affected receiver genes. prioritizing_weights argument: "scaled_activity". Recommended weight: 0, unless absolute enrichment of target genes is of specific interest.
Normalized ligand activity: to further prioritize ligand-receptor pairs based on their predicted effect of the ligand-receptor interaction on the gene expression in the receiver cell type - normalization of activity is done because we found that some datasets/conditions/niches have higher baseline activity values than others - normalized ligand activity accords to ‘relative’ enrichment of target genes of a ligand within the affected receiver genes. prioritizing_weights argument: "scaled_activity_normalized". Recommended weight: at least 1.
Prior knowledge quality of the L-R interaction: the NicheNet LR network consists of two types of interactions: L-R pairs documented in curated databases, and L-R pairs predicted based on gene annotation and PPIs. The former are categorized as ‘bona fide’ interactions. To rank bona fide interactions higher, but not exlude potentially interesting non-bona-fide ones, we give bona fide interactions a score of 1, and non-bona-fide interactions a score fof 0.5. prioritizing_weights argument: "bona_fide" Recommend weight: at least 1.

prioritizing_weights = c("scaled_ligand_score" = 5,
                         "scaled_ligand_expression_scaled" = 1,
                         "ligand_fraction" = 1,
                         "scaled_ligand_score_spatial" = 2, 
                         "scaled_receptor_score" = 0.5,
                         "scaled_receptor_expression_scaled" = 0.5,
                          "receptor_fraction" = 1, 
                         "ligand_scaled_receptor_expression_fraction" = 1,
                         "scaled_receptor_score_spatial" = 0,
                         "scaled_activity" = 0,
                         "scaled_activity_normalized" = 1,
                         "bona_fide" = 1)

Note: these settings will give substantially more weight to DE ligand-receptor pairs compared to activity. Users can change this if wanted, just like other settings can be changed if that would be better to tackle the specific biological question you want to address.

output = list(DE_sender_receiver = DE_sender_receiver, ligand_scaled_receptor_expression_fraction_df = ligand_scaled_receptor_expression_fraction_df, sender_spatial_DE_processed = sender_spatial_DE_processed, receiver_spatial_DE_processed = receiver_spatial_DE_processed,
         ligand_activities_targets = ligand_activities_targets, DE_receiver_processed_targets = DE_receiver_processed_targets, exprs_tbl_ligand = exprs_tbl_ligand,  exprs_tbl_receptor = exprs_tbl_receptor, exprs_tbl_target = exprs_tbl_target)
prioritization_tables = get_prioritization_tables(output, prioritizing_weights)

prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(receiver == niches[[1]]$receiver) %>% head(10)
## # A tibble: 10 x 37
##    niche           receiver  sender ligand_receptor ligand receptor bonafide ligand_score ligand_signific~ ligand_present ligand_expressi~
##    <chr>           <chr>     <chr>  <chr>           <chr>  <chr>    <lgl>           <dbl>            <dbl>          <dbl>            <dbl>
##  1 pEMT_High_niche Malignan~ T.cel~ PTPRC--MET      PTPRC  MET      FALSE            3.22                1              1             9.32
##  2 pEMT_High_niche Malignan~ T.cel~ PTPRC--EGFR     PTPRC  EGFR     FALSE            3.22                1              1             9.32
##  3 pEMT_High_niche Malignan~ T.cel~ PTPRC--CD44     PTPRC  CD44     FALSE            3.22                1              1             9.32
##  4 pEMT_High_niche Malignan~ T.cel~ PTPRC--ERBB2    PTPRC  ERBB2    FALSE            3.22                1              1             9.32
##  5 pEMT_High_niche Malignan~ T.cel~ PTPRC--IFNAR1   PTPRC  IFNAR1   FALSE            3.22                1              1             9.32
##  6 pEMT_High_niche Malignan~ T.cel~ TNF--TNFRSF21   TNF    TNFRSF21 TRUE             1.74                1              1             2.34
##  7 pEMT_High_niche Malignan~ Myelo~ SERPINA1--LRP1  SERPI~ LRP1     TRUE             2.52                1              1             4.83
##  8 pEMT_High_niche Malignan~ Myelo~ IL1B--IL1RAP    IL1B   IL1RAP   TRUE             1.50                1              1             1.93
##  9 pEMT_High_niche Malignan~ Myelo~ IL1RN--IL1R2    IL1RN  IL1R2    TRUE             1.62                1              1             2.07
## 10 pEMT_High_niche Malignan~ T.cel~ PTPRC--INSR     PTPRC  INSR     FALSE            3.22                1              1             9.32
## # ... with 26 more variables: ligand_expression_scaled <dbl>, ligand_fraction <dbl>, ligand_score_spatial <dbl>, receptor_score <dbl>,
## #   receptor_significant <dbl>, receptor_present <dbl>, receptor_expression <dbl>, receptor_expression_scaled <dbl>,
## #   receptor_fraction <dbl>, receptor_score_spatial <dbl>, ligand_scaled_receptor_expression_fraction <dbl>,
## #   avg_score_ligand_receptor <dbl>, activity <dbl>, activity_normalized <dbl>, scaled_ligand_score <dbl>,
## #   scaled_ligand_expression_scaled <dbl>, scaled_receptor_score <dbl>, scaled_receptor_expression_scaled <dbl>,
## #   scaled_avg_score_ligand_receptor <dbl>, scaled_ligand_score_spatial <dbl>, scaled_receptor_score_spatial <dbl>,
## #   scaled_ligand_fraction_adapted <dbl>, scaled_receptor_fraction_adapted <dbl>, scaled_activity <dbl>, ...
prioritization_tables$prioritization_tbl_ligand_target %>% filter(receiver == niches[[1]]$receiver) %>% head(10)
## # A tibble: 10 x 20
##    niche   receiver  sender  ligand_receptor ligand receptor bonafide target target_score target_signific~ target_present target_expressi~
##    <chr>   <chr>     <chr>   <chr>           <chr>  <chr>    <lgl>    <chr>         <dbl>            <dbl>          <dbl>            <dbl>
##  1 pEMT_H~ Malignan~ T.cell~ PTPRC--MET      PTPRC  MET      FALSE    EHF           1.04                 1              1             1.88
##  2 pEMT_H~ Malignan~ T.cell~ PTPRC--MET      PTPRC  MET      FALSE    GADD4~        0.836                1              1             2.42
##  3 pEMT_H~ Malignan~ T.cell~ PTPRC--MET      PTPRC  MET      FALSE    SERPI~        0.889                1              1             1.79
##  4 pEMT_H~ Malignan~ T.cell~ PTPRC--EGFR     PTPRC  EGFR     FALSE    EHF           1.04                 1              1             1.88
##  5 pEMT_H~ Malignan~ T.cell~ PTPRC--EGFR     PTPRC  EGFR     FALSE    GADD4~        0.836                1              1             2.42
##  6 pEMT_H~ Malignan~ T.cell~ PTPRC--EGFR     PTPRC  EGFR     FALSE    SERPI~        0.889                1              1             1.79
##  7 pEMT_H~ Malignan~ T.cell~ PTPRC--CD44     PTPRC  CD44     FALSE    EHF           1.04                 1              1             1.88
##  8 pEMT_H~ Malignan~ T.cell~ PTPRC--CD44     PTPRC  CD44     FALSE    GADD4~        0.836                1              1             2.42
##  9 pEMT_H~ Malignan~ T.cell~ PTPRC--CD44     PTPRC  CD44     FALSE    SERPI~        0.889                1              1             1.79
## 10 pEMT_H~ Malignan~ T.cell~ PTPRC--ERBB2    PTPRC  ERBB2    FALSE    EHF           1.04                 1              1             1.88
## # ... with 8 more variables: target_expression_scaled <dbl>, target_fraction <dbl>, ligand_target_weight <dbl>, activity <dbl>,
## #   activity_normalized <dbl>, scaled_activity <dbl>, scaled_activity_normalized <dbl>, prioritization_score <dbl>

prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(receiver == niches[[2]]$receiver) %>% head(10)
## # A tibble: 10 x 37
##    niche          receiver  sender  ligand_receptor ligand receptor bonafide ligand_score ligand_signific~ ligand_present ligand_expressi~
##    <chr>          <chr>     <chr>   <chr>           <chr>  <chr>    <lgl>           <dbl>            <dbl>          <dbl>            <dbl>
##  1 pEMT_Low_niche Malignan~ Endoth~ F8--LRP1        F8     LRP1     TRUE            0.952              1                1            2.17 
##  2 pEMT_Low_niche Malignan~ Endoth~ PLAT--LRP1      PLAT   LRP1     TRUE            0.913              1                1            2.70 
##  3 pEMT_Low_niche Malignan~ CAF_Low FGF10--FGFR2    FGF10  FGFR2    TRUE            0.385              0.8              1            1.07 
##  4 pEMT_Low_niche Malignan~ CAF_Low NLGN2--NRXN3    NLGN2  NRXN3    TRUE            0.140              0.2              1            0.269
##  5 pEMT_Low_niche Malignan~ CAF_Low RSPO3--LGR6     RSPO3  LGR6     TRUE            0.557              0.8              1            1.27 
##  6 pEMT_Low_niche Malignan~ CAF_Low COMP--SDC1      COMP   SDC1     TRUE            0.290              0.8              1            1.27 
##  7 pEMT_Low_niche Malignan~ CAF_Low SEMA3C--NRP2    SEMA3C NRP2     TRUE            0.652              1                1            1.73 
##  8 pEMT_Low_niche Malignan~ CAF_Low SLIT2--SDC1     SLIT2  SDC1     TRUE            0.494              1                1            0.846
##  9 pEMT_Low_niche Malignan~ Endoth~ IL33--IL1RAP    IL33   IL1RAP   FALSE           1.34               1                1            2.75 
## 10 pEMT_Low_niche Malignan~ CAF_Low C3--LRP1        C3     LRP1     TRUE            0.480              1                1            4.79 
## # ... with 26 more variables: ligand_expression_scaled <dbl>, ligand_fraction <dbl>, ligand_score_spatial <dbl>, receptor_score <dbl>,
## #   receptor_significant <dbl>, receptor_present <dbl>, receptor_expression <dbl>, receptor_expression_scaled <dbl>,
## #   receptor_fraction <dbl>, receptor_score_spatial <dbl>, ligand_scaled_receptor_expression_fraction <dbl>,
## #   avg_score_ligand_receptor <dbl>, activity <dbl>, activity_normalized <dbl>, scaled_ligand_score <dbl>,
## #   scaled_ligand_expression_scaled <dbl>, scaled_receptor_score <dbl>, scaled_receptor_expression_scaled <dbl>,
## #   scaled_avg_score_ligand_receptor <dbl>, scaled_ligand_score_spatial <dbl>, scaled_receptor_score_spatial <dbl>,
## #   scaled_ligand_fraction_adapted <dbl>, scaled_receptor_fraction_adapted <dbl>, scaled_activity <dbl>, ...
prioritization_tables$prioritization_tbl_ligand_target %>% filter(receiver == niches[[2]]$receiver) %>% head(10)
## # A tibble: 10 x 20
##    niche   receiver  sender  ligand_receptor ligand receptor bonafide target target_score target_signific~ target_present target_expressi~
##    <chr>   <chr>     <chr>   <chr>           <chr>  <chr>    <lgl>    <chr>         <dbl>            <dbl>          <dbl>            <dbl>
##  1 pEMT_L~ Malignan~ Endoth~ F8--LRP1        F8     LRP1     TRUE     ETV4          0.771                1              1            1.00 
##  2 pEMT_L~ Malignan~ Endoth~ PLAT--LRP1      PLAT   LRP1     TRUE     CLDN7         0.835                1              1            2.30 
##  3 pEMT_L~ Malignan~ Endoth~ PLAT--LRP1      PLAT   LRP1     TRUE     ETV4          0.771                1              1            1.00 
##  4 pEMT_L~ Malignan~ CAF_Low FGF10--FGFR2    FGF10  FGFR2    TRUE     ETV4          0.771                1              1            1.00 
##  5 pEMT_L~ Malignan~ CAF_Low FGF10--FGFR2    FGF10  FGFR2    TRUE     WNT5A         1.40                 1              1            2.01 
##  6 pEMT_L~ Malignan~ CAF_Low NLGN2--NRXN3    NLGN2  NRXN3    TRUE     CLDN5         0.979                1              1            0.991
##  7 pEMT_L~ Malignan~ CAF_Low NLGN2--NRXN3    NLGN2  NRXN3    TRUE     ETV4          0.771                1              1            1.00 
##  8 pEMT_L~ Malignan~ CAF_Low RSPO3--LGR6     RSPO3  LGR6     TRUE     DDC           0.832                1              1            0.785
##  9 pEMT_L~ Malignan~ CAF_Low RSPO3--LGR6     RSPO3  LGR6     TRUE     EGFL7         0.763                1              1            1.09 
## 10 pEMT_L~ Malignan~ CAF_Low COMP--SDC1      COMP   SDC1     TRUE     CLDN7         0.835                1              1            2.30 
## # ... with 8 more variables: target_expression_scaled <dbl>, target_fraction <dbl>, ligand_target_weight <dbl>, activity <dbl>,
## #   activity_normalized <dbl>, scaled_activity <dbl>, scaled_activity_normalized <dbl>, prioritization_score <dbl>

8. Visualization of the Differential NicheNet output

Differential expression of ligand and expression
在可視化之前，需要定義每個(gè)生態(tài)位中最重要的配體-受體對(duì)旨巷。將通過(guò)首先確定每個(gè)配體/配體-受體對(duì)在哪個(gè)生態(tài)位上找到最高分來(lái)做到這一點(diǎn)巨缘。然后獲得每個(gè)niche的前 50 個(gè)配體。

top_ligand_niche_df = prioritization_tables$prioritization_tbl_ligand_receptor %>% select(niche, sender, receiver, ligand, receptor, prioritization_score) %>% group_by(ligand) %>% top_n(1, prioritization_score) %>% ungroup() %>% select(ligand, receptor, niche) %>% rename(top_niche = niche)
top_ligand_receptor_niche_df = prioritization_tables$prioritization_tbl_ligand_receptor %>% select(niche, sender, receiver, ligand, receptor, prioritization_score) %>% group_by(ligand, receptor) %>% top_n(1, prioritization_score) %>% ungroup() %>% select(ligand, receptor, niche) %>% rename(top_niche = niche)

ligand_prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% select(niche, sender, receiver, ligand, prioritization_score) %>% group_by(ligand, niche) %>% top_n(1, prioritization_score) %>% ungroup() %>% distinct() %>% inner_join(top_ligand_niche_df) %>% filter(niche == top_niche) %>% group_by(niche) %>% top_n(50, prioritization_score) %>% ungroup() # get the top50 ligands per niche

Now we will look first at the top ligand-receptor pairs for KCs (here, we will take the top 2 scoring receptors per prioritized ligand)

receiver_oi = "Malignant_High" 

filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% pull(ligand) %>% unique()

prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand,  receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup()

lfc_plot = make_ligand_receptor_lfc_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, plot_legend = FALSE, heights = NULL, widths = NULL)
lfc_plot

圖片.png

Show the spatialDE as additional information

lfc_plot_spatial = make_ligand_receptor_lfc_spatial_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, ligand_spatial = include_spatial_info_sender, receptor_spatial = include_spatial_info_receiver, plot_legend = FALSE, heights = NULL, widths = NULL)
lfc_plot_spatial

圖片.png

Ligand expression, activity and target genes
Active target gene inference - cf Default NicheNet

Now: visualization of ligand activity and ligand-target links

exprs_activity_target_plot = make_ligand_activity_target_exprs_plot(receiver_oi, prioritized_tbl_oi,  prioritization_tables$prioritization_tbl_ligand_receptor,  prioritization_tables$prioritization_tbl_ligand_target, output$exprs_tbl_ligand,  output$exprs_tbl_target, lfc_cutoff, ligand_target_matrix, plot_legend = FALSE, heights = NULL, widths = NULL)
exprs_activity_target_plot$combined_plot

圖片.png

基于此圖采呐，可以推斷出許多假設(shè)若锁，例如：“有趣的是，IL1 家族配體似乎具有誘導(dǎo)高 pEMT 和低 pEMT 惡性細(xì)胞之間的 DE 基因的活性懈万；它們主要由骨髓細(xì)胞表達(dá)拴清，這是一種 pEMT 高腫瘤特有的細(xì)胞類(lèi)型”靶病。

重要：具有高差異表達(dá)（或條件特異性）和配體活性（=靶基因富集）的配體-受體對(duì)是非常有趣的預(yù)測(cè)，作為您感興趣的細(xì)胞間通訊過(guò)程的關(guān)鍵調(diào)節(jié)因子口予！

如果該圖包含太多信息娄周，因?yàn)椴榭戳嗽S多匹配項(xiàng)（前 50 個(gè)配體），您當(dāng)然也可以為較少的配體制作此圖沪停，例如前 20 個(gè)配體煤辨。

filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% top_n(20, prioritization_score) %>% pull(ligand) %>% unique()

prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand,  receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup() 

exprs_activity_target_plot = make_ligand_activity_target_exprs_plot(receiver_oi, prioritized_tbl_oi,  prioritization_tables$prioritization_tbl_ligand_receptor,  prioritization_tables$prioritization_tbl_ligand_target, output$exprs_tbl_ligand,  output$exprs_tbl_target, lfc_cutoff, ligand_target_matrix, plot_legend = FALSE, heights = NULL, widths = NULL)
exprs_activity_target_plot$combined_plot

圖片.png

Circos plot of prioritized ligand-receptor pairs

Because a top50 is too much to visualize in a circos plot, we will only visualize the top 15.

filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% top_n(15, prioritization_score) %>% pull(ligand) %>% unique()

prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand,  receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup() 

colors_sender = brewer.pal(n = prioritized_tbl_oi$sender %>% unique() %>% sort() %>% length(), name = 'Spectral') %>% magrittr::set_names(prioritized_tbl_oi$sender %>% unique() %>% sort())
colors_receiver = c("lavender")  %>% magrittr::set_names(prioritized_tbl_oi$receiver %>% unique() %>% sort())

circos_output = make_circos_lr(prioritized_tbl_oi, colors_sender, colors_receiver)

圖片.png

Interpretation of these results

大多數(shù)排名靠前的差異 L-R 對(duì)似乎來(lái)自?xún)H存在于 pEMT 高腫瘤中的細(xì)胞類(lèi)型。這可能部分是由于生物學(xué)（一種情況下的獨(dú)特細(xì)胞類(lèi)型可能非常重要）木张，但也可能是由于優(yōu)先排序的方式以及這些獨(dú)特的細(xì)胞類(lèi)型在其中沒(méi)有“對(duì)應(yīng)物”的事實(shí)其他niche众辨。

因?yàn)楣撬杓?xì)胞和T細(xì)胞與腫瘤微環(huán)境中的其他細(xì)胞有很大不同，它們的配體會(huì)表現(xiàn)出強(qiáng)烈的差異表達(dá)舷礼。與來(lái)自相同細(xì)胞類(lèi)型但不同生態(tài)位/條件的細(xì)胞之間的差異表達(dá)（pEMT-high 中的 CAF 與 pEMT 中的 CAF pEMT-低）鹃彻。所以結(jié)論：Differential NicheNet 的一個(gè)優(yōu)勢(shì)在于它可以處理特定條件的細(xì)胞類(lèi)型，但用戶(hù)應(yīng)該知道最終的一般分?jǐn)?shù)可能會(huì)偏向于特定條件的發(fā)送者細(xì)胞類(lèi)型妻献。因此蛛株，如果有一個(gè)案例研究，其中某些發(fā)送細(xì)胞類(lèi)型是特定于條件的育拨，建議還查看每個(gè)發(fā)送細(xì)胞類(lèi)型的top LR 對(duì)谨履。

Visualization for the other condition: pEMT-low

receiver_oi = "Malignant_Low"  
filtered_ligands = ligand_prioritized_tbl_oi %>% filter(receiver == receiver_oi) %>% top_n(50, prioritization_score) %>% pull(ligand) %>% unique()

prioritized_tbl_oi = prioritization_tables$prioritization_tbl_ligand_receptor %>% filter(ligand %in% filtered_ligands) %>% select(niche, sender, receiver, ligand,  receptor, ligand_receptor, prioritization_score) %>% distinct() %>% inner_join(top_ligand_receptor_niche_df) %>% group_by(ligand) %>% filter(receiver == receiver_oi) %>% top_n(2, prioritization_score) %>% ungroup() 

lfc_plot = make_ligand_receptor_lfc_plot(receiver_oi, prioritized_tbl_oi, prioritization_tables$prioritization_tbl_ligand_receptor, plot_legend = FALSE, heights = NULL, widths = NULL)
lfc_plot

圖片.png

好了，已經(jīng)分享給大家了熬丧，生活很好笋粟，有你更好，百度文庫(kù)出現(xiàn)了大量抄襲我的文章析蝴，對(duì)此我深表無(wú)奈害捕，我寫(xiě)的文章，別人掛上去賺錢(qián)闷畸，抄襲可恥吨艇，掛到百度文庫(kù)的人更可恥

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者

人面猴
序言：七十年代末，一起剝皮案震驚了整個(gè)濱河市腾啥，隨后出現(xiàn)的幾起案子，更是在濱河造成了極大的恐慌冯吓，老刑警劉巖倘待，帶你破解...
沈念sama閱讀 206,013評(píng)論 6贊 481
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件，死亡現(xiàn)場(chǎng)離奇詭異组贺，居然都是意外死亡凸舵，警方通過(guò)查閱死者的電腦和手機(jī)，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 88,205評(píng)論 2贊 382
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進(jìn)店門(mén)失尖，熙熙樓的掌柜王于貴愁眉苦臉地迎上來(lái)啊奄，“玉大人渐苏，你說(shuō)我怎么就攤上這事」娇洌” “怎么了琼富？”我有些...
開(kāi)封第一講書(shū)人閱讀 152,370評(píng)論 0贊 342
道士緝兇錄：失蹤的賣(mài)姜人
文/不壞的土叔我叫張陵，是天一觀的道長(zhǎng)庄新。經(jīng)常有香客問(wèn)我鞠眉，道長(zhǎng)，這世上最難降的妖魔是什么择诈？我笑而不...
開(kāi)封第一講書(shū)人閱讀 55,168評(píng)論 1贊 278
?港島之戀（遺憾婚禮）
正文為了忘掉前任械蹋，我火速辦了婚禮，結(jié)果婚禮上羞芍，老公的妹妹穿的比我還像新娘哗戈。我一直安慰自己，他們只是感情好荷科，可當(dāng)我...
茶點(diǎn)故事閱讀 64,153評(píng)論 5贊 371
惡毒庶女頂嫁案：這布局不是一般人想出來(lái)的
文/花漫我一把揭開(kāi)白布唯咬。她就那樣靜靜地躺著，像睡著了一般步做。火紅的嫁衣襯著肌膚如雪副渴。梳的紋絲不亂的頭發(fā)上，一...
開(kāi)封第一講書(shū)人閱讀 48,954評(píng)論 1贊 283
城市分裂傳說(shuō)
那天全度，我揣著相機(jī)與錄音煮剧，去河邊找鬼。笑死将鸵，一個(gè)胖子當(dāng)著我的面吹牛勉盅，可吹牛的內(nèi)容都是我干的。我是一名探鬼主播顶掉，決...
沈念sama閱讀 38,271評(píng)論 3贊 399
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開(kāi)眼草娜，長(zhǎng)吁一口氣：“原來(lái)是場(chǎng)噩夢(mèng)啊……” “哼！你這毒婦竟也來(lái)了痒筒？” 一聲冷哼從身側(cè)響起宰闰，我...
開(kāi)封第一講書(shū)人閱讀 36,916評(píng)論 0贊 259
萬(wàn)榮殺人案實(shí)錄
序言：老撾萬(wàn)榮一對(duì)情侶失蹤，失蹤者是張志新（化名）和其女友劉穎簿透，沒(méi)想到半個(gè)月后移袍，有當(dāng)?shù)厝嗽跇?shù)林里發(fā)現(xiàn)了一具尸體，經(jīng)...
沈念sama閱讀 43,382評(píng)論 1贊 300
?護(hù)林員之死
正文獨(dú)居荒郊野嶺守林人離奇死亡老充，尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點(diǎn)故事閱讀 35,877評(píng)論 2贊 323
?白月光啟示錄
正文我和宋清朗相戀三年葡盗，在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了。大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片啡浊。...
茶點(diǎn)故事閱讀 37,989評(píng)論 1贊 333
活死人
序言：一個(gè)原本活蹦亂跳的男人離奇死亡觅够，死狀恐怖胶背，靈堂內(nèi)的尸體忽然破棺而出，到底是詐尸還是另有隱情喘先，我是刑警寧澤钳吟，帶...
沈念sama閱讀 33,624評(píng)論 4贊 322
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布，位于F島的核電站苹祟，受9級(jí)特大地震影響砸抛，放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜树枫，卻給世界環(huán)境...
茶點(diǎn)故事閱讀 39,209評(píng)論 3贊 307
男人毒藥：我在死后第九天來(lái)索命
文/蒙蒙一直焙、第九天我趴在偏房一處隱蔽的房頂上張望。院中可真熱鬧砂轻，春花似錦奔誓、人聲如沸。這莊子的主人今日做“春日...
開(kāi)封第一講書(shū)人閱讀 30,199評(píng)論 0贊 19
一樁弒父案厨喂，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽(yáng)。三九已至庄呈，卻和暖如春蜕煌，著一層夾襖步出監(jiān)牢的瞬間，已是汗流浹背诬留。一陣腳步聲響...
開(kāi)封第一講書(shū)人閱讀 31,418評(píng)論 1贊 260
情欲美人皮
我被黑心中介騙來(lái)泰國(guó)打工斜纪，沒(méi)想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留，地道東北人文兑。一個(gè)月前我還...
沈念sama閱讀 45,401評(píng)論 2贊 352
代替公主和親
正文我出身青樓盒刚，卻偏偏與公主長(zhǎng)得像，于是被迫代替她去往敵國(guó)和親绿贞。傳聞我的和親對(duì)象是個(gè)殘疾皇子因块，可洞房花燭夜當(dāng)晚...
茶點(diǎn)故事閱讀 42,700評(píng)論 2贊 345

10X單細(xì)胞（10X空間轉(zhuǎn)錄組）通訊分析之總結(jié)Nichenet多條件差異通訊

作者舵盈，追風(fēng)少年i

周三了论矾，好好學(xué)習(xí)，天天搬磚沛简，人生沒(méi)有倒帶齐鲤，前進(jìn)才是真理

0. Read in the expression data of interest, and the NicheNet ligand-receptor network and ligand-target matrix

Read in the NicheNet ligand-receptor network and ligand-target matrix(讀取數(shù)據(jù)庫(kù))

人鼠基因轉(zhuǎn)換

1、Define the niches/microenvironments of interest

2. Calculate differential expression between the niches

3. Optional: Calculate differential expression between the different spatial regions(針對(duì)有空間的數(shù)據(jù))

4. Calculate ligand activities and infer active ligand-target links

5. Calculate (scaled) expression of ligands, receptors and targets across cell types of interest (log expression values and expression fractions)

6. Expression fraction and receptor

7. Prioritization of ligand-receptor and ligand-target links

8. Visualization of the Differential NicheNet output

推薦閱讀更多精彩內(nèi)容