作者楣颠,Evil Genius
計算原理仍然是inferCNV
計算步驟
1况木、Subtract the reference gene expression from all cells. Since the data is in log space, this effectively computes the log fold change. If references for multiple categories are available (i.e. multiple values are specified to reference_cat), the log fold change is “bounded”:
- Compute the mean gene expression for each category separately.
- Values that are within the minimum and the maximum of the mean of all references, receive a log fold change of 0, since they are not considered different from the background.
- From values smaller than the minimum of the mean of all references, subtract that minimum.
- From values larger than the maximum of the mean of all references, subtract that maximum.
This procedure avoids calling false positive CNV regions due to cell-type specific expression of clustered gene regions (e.g. Immunoglobulin- or HLA genes in different immune cell types).
2、Clip the fold changes at -lfc_cap and +lfc_cap.
3镐牺、Smooth the gene expression by genomic position. Computes the average over a running window of length window_size. Compute only every nth window to save time & space, where n = step.
4炫掐、Center the smoothed gene expression by cell, by subtracting the median of each cell from each cell.
5、Perform noise filtering. Values < dynamic_theshold * STDDEV are set to 0, where STDDEV is the standard deviation of the smoothed gene expression
6睬涧、Smooth the final result using a median filter.
實現(xiàn)目標(biāo)
CNV聚類
腳本封裝版
#! /bin/python
### 20240618
### zhaoyunfei
### https://infercnvpy.readthedocs.io/en/latest/notebooks/tutorial_3k.html
import pandas as pd
import numpy as np
import sklearn
import scanpy as sc
import infercnvpy as cnv
import matplotlib.pyplot as plt
import warnings
import argparse
warnings.simplefilter("ignore")
sc.settings.set_figure_params(figsize=(5, 5))
parse=argparse.ArgumentParser(description='scvelo')
parse.add_argument('--input',help='the anndata',type=str,required = True)
parse.add_argument('--outdir',help='the analysis dir',type=str)
parse.add_argument('--sample',help='the sample name',type=str,required = True)
parse.add_argument('--ref',help='the ref celltype;eg T,B',type=str,required = True)
parse.add_argument('--windiow',help='the bin size',type=int,default = 250)
argv = parse.parse_args()
adata = argv.input
outdir = argv.outdir
sample = argv.sample
ref = argv.ref
windiow = argv.windiow
adata = sc.read(adata)
sc.pp.normalize_total(adata)
adata.var['symbol'] = adata.var.index
adata.var.index = adata.var['gene_ids']
gene_order = pd.read_csv('/home/samples/DB/scRNA/inferCNV/gene_order.txt',sep = '\t',index_col = 0)
gene_intersect = list(set(adata.var.index) & set(gene_order.index))
adata = adata[:,gene_intersect]
gene_order = gene_order.loc[gene_intersect,:]
adata.var.index.name = 'gene'
tmp = pd.merge(adata.var,gene_order,on = 'gene_ids')
tmp.index = tmp['symbol']
adata.var = tmp
adata.var['chromosome'] = 'chr' + adata.var['chr']
adata.var['ensg'] = adata.var['gene_ids']
adata.var['start'] = adata.var['Start']
adata.var['end'] = adata.var['End']
reference = ref.split(',')
cnv.tl.infercnv(adata,reference_key="celltype",reference_cat=reference,window_size=windiow)
cnv.pl.chromosome_heatmap(adata, groupby="celltype")
plt.savefig(outdir + '/' + sample + '.cnv.heatmap.png',bbox_inches = 'tight')
####cnv cluster
cnv.tl.pca(adata)
cnv.pp.neighbors(adata)
cnv.tl.leiden(adata)
cnv.pl.chromosome_heatmap(adata, groupby="cnv_leiden", dendrogram=True)
plt.savefig(outdir + '/' + sample + '.cnv.leiden.heatmap.png',bbox_inches = 'tight')
####UMAP plot of CNV profiles
cnv.tl.umap(adata)
cnv.tl.cnv_score(adata)
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(11, 11))
ax4.axis("off")
cnv.pl.umap(adata,color="cnv_leiden",legend_loc="on data",legend_fontoutline=2,ax=ax1,show=False,)
cnv.pl.umap(adata, color="cnv_score", ax=ax2, show=False)
cnv.pl.umap(adata, color="celltype", ax=ax3)
plt.savefig(outdir + '/' + sample + '.cnv.celltype.umap.CNV.png',bbox_inches = 'tight')
####visualize the CNV score and clusters on the transcriptomics-based UMAP plot
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(12, 11), gridspec_kw={"wspace": 0.5})
ax4.axis("off")
sc.pl.umap(adata, color="cnv_leiden", ax=ax1, show=False)
sc.pl.umap(adata, color="cnv_score", ax=ax2, show=False)
sc.pl.umap(adata, color="celltype", ax=ax3)
plt.savefig(outdir + '/' + sample + '.cnv.celltype.umap.RNA.png',bbox_inches = 'tight')
####Classifying tumor cells
tumor = []
for i in adata.obs['cnv_score']:
if i > 0.2 :
tumor.append('tumor')
else :
tumor.append('normal')
adata.obs["cnv_status"] = tumor
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5), gridspec_kw={"wspace": 0.5})
cnv.pl.umap(adata, color="cnv_status", ax=ax1, show=False)
sc.pl.umap(adata, color="cnv_status", ax=ax2)
plt.savefig(outdir + '/' + sample + '.cnv.status.png',bbox_inches = 'tight')
cnv.pl.chromosome_heatmap(adata[adata.obs["cnv_status"] == "tumor", :])
plt.savefig(outdir + '/' + sample + 'tumor.cnv.status.png',bbox_inches = 'tight')
cnv.pl.chromosome_heatmap(adata[adata.obs["cnv_status"] == "normal", :])
plt.savefig(outdir + '/' + sample + 'normal.cnv.status.png',bbox_inches = 'tight')
adata.write(outdir + '/' + sample + '.h5ad')
生活很好脊阴,有你更好
第一課:單細(xì)胞聯(lián)合突變信息分析
第二課:單細(xì)胞數(shù)據(jù)分析之新版 velocity
第三課:單細(xì)胞數(shù)據(jù)分析之 CNV 聚類
第四課:單細(xì)胞通訊分析大匯總(cellphoneDB常潮、NicheNet、CellChat)
第五課:單細(xì)胞聯(lián)合 VDJ分析
第六課:單細(xì)胞聯(lián)合 ATAC 分析
第七課:空間轉(zhuǎn)錄組整合分析與形態(tài)學(xué)判定(空間注釋)
第八課:單細(xì)胞空間聯(lián)合分析(cell2location)與區(qū)域細(xì)胞富集
第九課:空間細(xì)胞類型共定位與細(xì)胞通路同定位
第10課:單細(xì)胞空間聯(lián)合分析之 MIA
第 11 課:空間生態(tài)位通訊與 COMMOT
第 12 課:空間轉(zhuǎn)錄組之分子 niche 與細(xì)胞 niche
第 13 課:空間網(wǎng)絡(luò)(基因、細(xì)胞蝗碎、通路網(wǎng)絡(luò))
第14課:空間轉(zhuǎn)錄組微生物的檢測運用
第 15 課:空間 VDJ(暫定)
第 16 課:空間 CNV burden 與 CNV 聚類
第 17 課:空間鄰域分析(banksy、CellNeighborEx)颁湖,分子鄰域與細(xì)胞鄰域
第 18 課:空間轉(zhuǎn)錄組 hotspot
第 19 課:空間細(xì)胞“社區(qū)”(空間細(xì)胞“unit”)
第20課:空間基因梯度
第21課:空間轉(zhuǎn)錄組的活性空間域
第 22 課:空間轉(zhuǎn)錄組之 velocity
第 23 課:空間軌跡
第24課:細(xì)胞鄰域特征評分(異型細(xì)胞網(wǎng)絡(luò))
第25課:空間分子生態(tài)位差異分析
第 26 課:高精度 HD 平臺數(shù)據(jù)分析框架
第 27 課:高精度 HD 平臺分析內(nèi)容與 niche
第 28 課:Xenium 分析框架
第29 課:原位數(shù)據(jù)細(xì)胞分割導(dǎo)論
第 30 課:Xenium 生態(tài)位分析與微環(huán)境分析