本次學(xué)習(xí)資料來源,結(jié)合視頻觀看效果更佳,視頻相關(guān)代碼如下:
- 視頻如下:https://www.youtube.com/watch?v=uvyG9yLuNSE
- 代碼:https://github.com/mousepixels/sanbomics_scripts
- 主代碼:https://github.com/mousepixels/sanbomics_scripts/blob/main/single_cell_analysis_complete_class.ipynb
文獻(xiàn)中的Fig1如下:
如果要繪制Fig1悦即,需要對上次筆記中的注釋再進(jìn)行進(jìn)一步的詳細(xì)注釋彼宠,上次注釋結(jié)果:
數(shù)據(jù)讀取
import scanpy as sc
import scvi
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# 注意dir改成自己的路徑
dir = '/Pub/Users/zhangjuan/data/GSE171524/'
adata = sc.read_h5ad(dir+'integrated_anno.h5ad')
adata
詳細(xì)注釋
marker采用來自數(shù)據(jù)庫:PanglaoDB https://panglaodb.se/
首先是B浴捆,T/NK細(xì)胞睛驳,有:
- CD4+ T cells:CD4
- CD8+ T cells:CD8A, CD8B
- NK/T cells:NCAM1
- Plasma cells:MZB1
上皮群:
- AT1:AGER
- AT2:SFTPC
- Airway epithelial:SEC14L3, CDH1
髓系:
- Macrophages:MRC1
- Dendritic cells:ZBTB46
- Monocytes:APOBEC3A
基質(zhì)細(xì)胞:
- smooth muscle cells:RGS5烙心,ACTA2
markers = adata.uns['markers']
# 看某一個基因的表達(dá)情況
markers[markers.names=="RGS5"]
#, layer = 'scvi_normalized'
# , vmax = 5
sc.pl.umap(adata, color = ["RGS5"], frameon = False, layer = 'scvi_normalized', vmax =2)
plt.savefig(dir+"03-intergration_umap_gene_RGS5.png")
先給出每個cluster編號以及需要填充的空格,在后續(xù)的操作中這個地方會填上每個注釋結(jié)果:
for x in range(0,35):
print(f'"{x}":"", ')
cell_type_sub = {"0":"CD4+ T cells",
"1":"AT1",
"2":"Macrophages",
"3":"Fibroblast",
"4":"Macrophages",
"5":"CD8+ T cells",
"6":"Macrophages",
"7":"AT2",
"8":"Endothelial",
"9":"Fibroblast",
"10":"Macrophages",
"11":"Plasma cells",
"12":"AT2",
"13":"Fibroblast",
"14":"Fibroblast",
"15":"AT2",
"16":"Airway epithelial",
"17":"Macrophages",
"18":"Airway epithelial",
"19":"AT2",
"20":"Neuronal Cell",
"21":"Airway epithelial",
"22":"B Cell",
"23":"Mast Cell",
"24":"Cycling T/NK",
"25":"Plasma cells",
"26":"DCs",
"27":"Endothelial",
"28":"smooth muscle cells",
"29":"Monocytes",
"30":"Erythroid",
"31":"B Cell",
"32":"Neuronal Cell",
"33":"Macrophages"
}
注釋后:
adata.obs['cell type_sub'] = adata.obs.leiden.map(cell_type_sub)
sc.pl.umap(adata, color = ['cell type_sub'], frameon = False, legend_loc = "on data")
plt.tight_layout()
plt.savefig(dir+"04-intergration_umap_anno_sub.png", dpi=300)
# 保存
adata.write_h5ad(dir + 'integrated_anno.h5ad')
詳細(xì)注釋結(jié)果如下:
繪制C乏沸、D圖
首先對細(xì)胞進(jìn)行計數(shù):
# adata = sc.read_h5ad('integrated.h5ad')
adata.obs.Sample.unique().tolist()
# 添加一列細(xì)胞的樣本來源
def map_condition(x):
if 'cov' in x:
return 'COVID19'
else:
return 'Control'
adata.obs['condition'] = adata.obs.Sample.map(map_condition)
adata.obs
結(jié)果圖:
# 每個樣本中的細(xì)胞數(shù)統(tǒng)計
num_tot_cells = adata.obs.groupby(['Sample']).count()
num_tot_cells = dict(zip(num_tot_cells.index, num_tot_cells.doublet))
num_tot_cells
# {'C51ctr': 10934, 'C52ctr': 3967, 'C53ctr': 6076, 'C54ctr': 3860, 'C55ctr': 5110, 'C56ctr': 3609, 'C57ctr': 4307, 'L01cov': 2737,'L03cov': 3604, 'L04cov': 3056, 'L04covaddon': 4073, 'L05cov': 2435, 'L06cov': 5657, 'L07cov': 4217, 'L08cov': 3473, 'L09cov': 3075, 'L10cov': 2713, 'L11cov': 2531, 'L12cov': 3259, 'L13cov': 4244, 'L15cov': 3516, 'L16cov': 1600, 'L17cov': 3937, 'L18cov': 2392, 'L19cov': 2135, 'L21cov': 2961, 'L22cov': 5786}
cell_type_counts = adata.obs.groupby(['Sample', 'condition', 'cell type']).count()
cell_type_counts = cell_type_counts[cell_type_counts.sum(axis = 1) > 0].reset_index()
cell_type_counts = cell_type_counts[cell_type_counts.columns[0:4]]
cell_type_counts
# Sample condition cell type doublet
# 0 C51ctr Control B Cell 70
# 1 C51ctr Control Endothelial 1153
# 2 C51ctr Control Epithelial Cell 3711
# 3 C51ctr Control Fibroblast 1497
# 4 C51ctr Control Mast Cell 290
# .. ... ... ... ...
# 246 L22cov COVID19 Mast Cell 7
# 247 L22cov COVID19 Myeloid Cell 1663
# 248 L22cov COVID19 Neuronal Cell 83
# 249 L22cov COVID19 T Cell 1379
# 250 L22cov COVID19 pDC 510
# [251 rows x 4 columns]
# 計算頻率
cell_type_counts['total_cells'] = cell_type_counts.Sample.map(num_tot_cells).astype(int)
cell_type_counts['frequency'] = cell_type_counts.doublet / cell_type_counts.total_cells
cell_type_counts
每種細(xì)胞類型在COVID19與control兩種條件下的頻率差異:
繪圖:
plt.figure(figsize = (10,4))
ax = sns.boxplot(data = cell_type_counts, x = 'cell type', y = 'frequency', hue = 'condition')
plt.xticks(rotation = 35, rotation_mode = 'anchor', ha = 'right')
plt.tight_layout()
plt.savefig(dir+"05-intergration_Fig1D.png", dpi=300)
結(jié)果圖如下:
C圖如下:
sc.pl.umap(adata, color = ['cell type_sub','condition'], frameon = False, legend_loc = "on data")
plt.savefig(dir+"05-intergration_Fig1B_C.png", dpi=300)
結(jié)果圖如下:
下期見~