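Author: Evil Genius
Don't push yourselves too hard in research. Relaxing a little helps you work better. Take the recently popular game Wukong: three of us chipped in to buy one copy to play, and we couldn't even get past the first level.
Today's reference:
Although the authors are Chinese, the research group is based in the United States.
I have been working for a long time and have shared many analysis methods, but I will basically never have the chance to take part in original research. Much of the software everyone commonly uses, such as Seurat and CellChat, was developed by Chinese researchers in the United States, which shows that, given a good environment, the talent of Chinese researchers is simply unmatched.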
Analysis goal: map cancer cells and TME components, stratify cell types and states, and analyze cell co-localization.
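Background
- Spatial transcriptomics (ST) platforms fall into two groups: next-generation sequencing (NGS)-based platforms, such as Visium, GeoMx, and Slide-Seq, and hybridization-based methods, such as MERFISH, seqFISH, and CosMx. NGS-based ST methods cover the whole transcriptome but do not reach single-cell resolution, whereas in situ hybridization-based methods offer superior spatial resolution but are limited to a small subset of genes, which limits their potential in discovery-driven studies (although panels of 5,000+ probes are still quite usable).
- METI systematically analyzes cancer cells and TME cells by combining spatial gene expression, tissue histology, and prior knowledge of cancer and TME cells.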
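METI analysis framework
- It focuses on the progression from normal cells to premalignant cells to malignant cells, while also examining the lymphocytes within each tissue section.
- The goal of METI is to precisely identify the various cell types and their respective states within the TME.
Module 1: METI identifies normal and premalignant cells.
Module 2: METI identifies tumor cell-enriched regions and characterizes the heterogeneity of their cell states.
Module 3: Spatial localization of T cells.
Module 4: METI identifies other immune cells, including neutrophils, B cells, plasma cells, and macrophages.
Module 5: Fibroblast analysis.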
Module 1: Mapping normal and premalignant cells
Analysis approach: pathologist annotation + morphological shape of normal cells + gene expression data
This kind of detection cannot be achieved by popular spatial clustering methods alone.
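Module 2: Identification of cancer cell domains and heterogeneity
Most solid tumors originate from epithelial cells and are called carcinomas, including gastric, lung, bladder, breast, prostate, and colon cancer, while some other solid tumors arise from other tissue types, including sarcomas and melanomas. Regardless of the cell of origin, understanding the molecular characteristics and cellular heterogeneity of malignant cells is essential for uncovering the mechanisms of tumor growth, invasion, metastasis, and treatment response. Quantifying the spatial distribution and density of cells within biological tissue matters for many applications, particularly in pathology and oncology. While gene expression provides a molecular lens, the associated H&E image can be used to measure the spatial distribution and density of cells. As in Module 1, METI next performs tumor cell nucleus segmentation and generates a three-dimensional tumor cell density map that visually depicts the spatial distribution and density of cancer cells. This feature conveys the spatial distribution, density, and pattern of the cell types of interest.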
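To make the idea of a cell density map concrete, here is a minimal sketch that counts nucleus centroids per tile of a coarse grid. It is not METI's implementation: the helper name `density_map`, the tile size, and the toy centroids are all assumptions made for illustration. The resulting grid is the kind of array one could then render as a heatmap or a 3D surface.

```python
import numpy as np

def density_map(centroids, height, width, bin_size):
    """Count nuclei per bin_size x bin_size tile (hypothetical helper)."""
    rows = int(np.ceil(height / bin_size))
    cols = int(np.ceil(width / bin_size))
    grid = np.zeros((rows, cols), dtype=int)
    for y, x in centroids:
        # Integer division maps a pixel coordinate to its tile index
        grid[int(y // bin_size), int(x // bin_size)] += 1
    return grid

# Toy centroids as (row, col) pixel coordinates in a 100 x 100 image
centroids = [(5, 5), (8, 12), (60, 70), (65, 72), (90, 10)]
grid = density_map(centroids, height=100, width=100, bin_size=50)
print(grid)
```

A smaller `bin_size` gives a finer map but noisier counts per tile, so the choice depends on how dense the nuclei are in the tissue.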
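Module 3: T cell localization and phenotyping
Module 4: In-depth analysis of other immune cells, including neutrophils, which are hard to capture with single-cell sequencing.
Module 5: Fibroblast analysis
The example code is at https://github.com/Flashiness/METI.
Let's take a quick look. Note that applying it to your own project still requires careful thought.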
##pip install METIforST
#Note: make sure pip points to Python 3; otherwise install METI with:
##python3 -m pip install METIforST==0.2
import torch
import csv,re, time
import pickle
import random
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import numpy as np
from scipy import stats
from scipy.sparse import issparse
import scanpy as sc
import matplotlib.colors as clr
import matplotlib.pyplot as plt
import cv2
import TESLA as tesla
from IPython.display import Image
import scipy.sparse
import seaborn as sns
from scanpy import read_10x_h5
import PIL
from PIL import Image as IMAGE
import os
import METI as meti
import tifffile
os.environ['KMP_DUPLICATE_LIB_OK']='True'
Read the data
adata=sc.read_visium("/tutorial/data/Spaceranger/")
spatial=pd.read_csv("/tutorial/data/Spaceranger/tissue_positions_list.csv",sep=",",header=None,na_filter=False,index_col=0)
adata.var_names_make_unique()
adata.var["mt"] = adata.var_names.str.startswith("MT-")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mt"], inplace=True)
plt.rcParams["figure.figsize"] = (8, 8)
sc.pl.spatial(adata, img_key="hires", color=["total_counts", "n_genes_by_counts"], size = 1.5, save = 'UMI_count.png')
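For readers who want to see what the QC columns mean, the sketch below recomputes the per-spot total counts, number of detected genes, and mitochondrial percentage on a toy matrix with plain NumPy. The gene names and counts are made up, and this only mirrors the definitions of the columns; it is not how scanpy implements `calculate_qc_metrics` internally.

```python
import numpy as np

# Toy count matrix: 3 spots x 4 genes (values are made up)
gene_names = np.array(["MT-CO1", "EPCAM", "MT-ND1", "CD3E"])
X = np.array([[10, 0, 10, 20],
              [0, 5, 0, 15],
              [2, 2, 2, 2]])

# Same rule as adata.var_names.str.startswith("MT-")
is_mt = np.char.startswith(gene_names, "MT-")

total_counts = X.sum(axis=1)             # UMIs per spot
n_genes_by_counts = (X > 0).sum(axis=1)  # genes with at least one count
pct_counts_mt = 100 * X[:, is_mt].sum(axis=1) / total_counts

print(total_counts, n_genes_by_counts, pct_counts_mt)
```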
Convert the data
#================== 3. Read in data ==================#
#Read original 10x_h5 data and save it to h5ad
from scanpy import read_10x_h5
adata = read_10x_h5("../tutorial/data/filtered_feature_bc_matrix.h5")
spatial=pd.read_csv("../tutorial/data/tissue_positions_list.csv",sep=",",header=None,na_filter=False,index_col=0)
adata.obs["x1"]=spatial[1]
adata.obs["x2"]=spatial[2]
adata.obs["x3"]=spatial[3]
adata.obs["x4"]=spatial[4]
adata.obs["x5"]=spatial[5]
adata.obs["array_x"]=adata.obs["x2"]
adata.obs["array_y"]=adata.obs["x3"]
adata.obs["pixel_x"]=adata.obs["x4"]
adata.obs["pixel_y"]=adata.obs["x5"]
#Select captured samples
adata=adata[adata.obs["x1"]==1]
adata.var_names=[i.upper() for i in list(adata.var_names)]
adata.var["genename"]=adata.var.index.astype("str")
adata.write_h5ad("../tutorial/data/1957495_data.h5ad")
#Read in gene expression and spatial location
counts=sc.read("../tutorial/data/1957495_data.h5ad")
#Read in histology image
PIL.Image.MAX_IMAGE_PIXELS = None
img = IMAGE.open(r"../tutorial/data/histology.tif")
img = np.array(img)
#if your image has 4 channels (e.g. RGBA), keep only the first 3
img=img[...,:3]
resize_factor=1000/np.min(img.shape[0:2])
resize_width=int(img.shape[1]*resize_factor)
resize_height=int(img.shape[0]*resize_factor)
counts.var.index=[i.upper() for i in counts.var.index]
counts.var_names_make_unique()
counts.raw=counts
sc.pp.log1p(counts) # impute on log scale
if issparse(counts.X):counts.X=counts.X.toarray()
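The resize factor used above scales the image so that its shorter side becomes 1000 pixels. Here is a small worked example of that arithmetic; the helper name `resize_shape` and the image dimensions are assumptions for illustration.

```python
def resize_shape(height, width, target_short_side=1000):
    """Return the scale factor and the resized (width, height)."""
    factor = target_short_side / min(height, width)
    return factor, int(width * factor), int(height * factor)

# A hypothetical 4000 x 6000 (height x width) histology image
factor, new_w, new_h = resize_shape(height=4000, width=6000)
print(factor, new_w, new_h)

# Spot coordinates must be scaled by the same factor to stay aligned
spot_pixel_x = 2000
print(int(spot_pixel_x * factor))
```

This is why the later spot-level step multiplies `pixel_x` and `pixel_y` by `resize_factor` before looking values up in the resized annotation image.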
###Contour detection
# Detect contour using cv2
cnt=tesla.cv2_detect_contour(img, apertureSize=5,L2gradient = True)
binary=np.zeros((img.shape[0:2]), dtype=np.uint8)
cv2.drawContours(binary, [cnt], -1, (1), thickness=-1)
#Enlarged filter
cnt_enlarged = tesla.scale_contour(cnt, 1.05)
binary_enlarged = np.zeros(img.shape[0:2])
cv2.drawContours(binary_enlarged, [cnt_enlarged], -1, (1), thickness=-1)
img_new = img.copy()
cv2.drawContours(img_new, [cnt], -1, (255), thickness=20)
img_new=cv2.resize(img_new, ((resize_width, resize_height)))
cv2.imwrite('../tutorial/data/cnt_1957495.jpg', img_new)
Image(filename='../tutorial/data/cnt_1957495.jpg')
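The enlarged contour above comes from scaling the tissue outline by a factor of 1.05. As a rough picture of what such a scaling does, the sketch below scales a polygon about the mean of its vertices. This is an assumption about the general technique, not a copy of `tesla.scale_contour`, which may compute the center differently (for example from image moments).

```python
import numpy as np

def scale_about_centroid(points, scale):
    """Scale polygon vertices about their mean position (illustrative helper)."""
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    return (points - center) * scale + center

# A 10 x 10 square stands in for the tissue contour
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
enlarged = scale_about_centroid(square, 1.05)
print(enlarged)
```

Each vertex moves 5% farther from the center, so the enlarged mask leaves a thin margin around the detected tissue edge.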
####Gene expression enhancement
#Set size of superpixel
res=40
# Note: if the number of superpixels is too large and imputation takes too long, you can increase res to 100
enhanced_exp_adata=tesla.imputation(img=img, raw=counts, cnt=cnt, genes=counts.var.index.tolist(), shape="None", res=res, s=1, k=2, num_nbs=10)
enhanced_exp_adata.write_h5ad("../tutorial/data/enhanced_exp.h5ad")
####Goblet marker gene expression
#================ determine if markers are in ===============#
enhanced_exp_adata=sc.read("../tutorial/data/enhanced_exp.h5ad")
markers = ["MS4A10", "MGAM", "CYP4F2", "XPNPEP2", "SLC5A9", "SLC13A2", "SLC28A1", "MEP1A", "ABCG2", "ACE2"]
#Print "yes" for each marker found in the data, otherwise print the missing marker's name
for marker in markers:
    if marker in enhanced_exp_adata.var.index: print("yes")
    else: print(marker)
save_dir="../tutorial/data/Goblet/"
os.makedirs(save_dir, exist_ok=True)
#================ Plot gene expression image ===============#
markers = ["MS4A10", "MGAM", "CYP4F2", "XPNPEP2", "SLC5A9", "SLC13A2", "SLC28A1", "MEP1A", "ABCG2", "ACE2"]
#Build the colormap once, then plot one image per marker
cnt_color = clr.LinearSegmentedColormap.from_list('magma', ["#000003", "#3b0f6f", "#8c2980", "#f66e5b", "#fd9f6c", "#fbfcbf"], N=256)
for g in markers:
    enhanced_exp_adata.obs[g]=enhanced_exp_adata.X[:,enhanced_exp_adata.var.index==g]
    fig=sc.pl.scatter(enhanced_exp_adata,alpha=1,x="y",y="x",color=g,color_map=cnt_color,show=False,size=10)
    fig.set_aspect('equal', 'box')
    fig.invert_yaxis()
    plt.gcf().set_dpi(600)
    fig.figure.show()
    plt.savefig(save_dir + g + ".png", dpi=600)
    plt.close()
#================ Plot meta gene expression image ===============#
enhanced_exp_adata=sc.read("/Users/jjiang6/Desktop/UTH/MDA GRA/Spatial transcriptome/Cell Segmentation/With Jian Hu/S1_54078/TESLA/enhanced_exp.h5ad")
genes = ["MS4A10", "MGAM", "CYP4F2", "XPNPEP2", "SLC5A9", "SLC13A2", "SLC28A1", "MEP1A", "ABCG2", "ACE2"]
sudo_adata = meti.meta_gene_plot(img=img,
binary=binary,
sudo_adata=enhanced_exp_adata,
genes=genes,
resize_factor=resize_factor,
target_size="small")
cnt_color = clr.LinearSegmentedColormap.from_list('magma', ["#000003", "#3b0f6f", "#8c2980", "#f66e5b", "#fd9f6c", "#fbfcbf"], N=256)
fig=sc.pl.scatter(sudo_adata,alpha=1,x="y",y="x",color='meta',color_map=cnt_color,show=False,size=5)
fig.set_aspect('equal', 'box')
fig.invert_yaxis()
plt.gcf().set_dpi(600)
fig.figure.show()
plt.savefig(save_dir + "Goblet_meta.png", dpi=600)
plt.close()
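A "meta gene" combines several marker genes into one score per superpixel. The sketch below shows one simple way to do that: min-max scale each marker, then average across markers. This is only an illustration of the general idea. It is not necessarily how `meti.meta_gene_plot` computes its `meta` column, and the helper name and toy values are assumptions.

```python
import numpy as np

def meta_gene(expr):
    """Average of per-gene min-max scaled expression (illustrative only)."""
    expr = np.asarray(expr, dtype=float)
    lo = expr.min(axis=0)
    span = expr.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant genes
    return ((expr - lo) / span).mean(axis=1)

# Rows are superpixels, columns are marker genes (toy values)
expr = [[0, 10],
        [5, 20],
        [10, 30]]
meta = meta_gene(expr)
print(meta)
```

Scaling each gene first keeps a single highly expressed marker from dominating the combined score.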
Region annotation
genes=["MS4A10", "MGAM", "CYP4F2", "XPNPEP2", "SLC5A9", "SLC13A2", "SLC28A1", "MEP1A", "ABCG2", "ACE2"]
genes=list(set([i for i in genes if i in enhanced_exp_adata.var.index ]))
#target_size can be set to "small" or "large".
pred_refined, target_clusters, c_m=meti.annotation(img=img,
binary=binary,
sudo_adata=enhanced_exp_adata,
genes=genes,
resize_factor=resize_factor,
num_required=1,
target_size="small")
#Plot
ret_img=tesla.visualize_annotation(img=img,
binary=binary,
resize_factor=resize_factor,
pred_refined=pred_refined,
target_clusters=target_clusters,
c_m=c_m)
cv2.imwrite(save_dir + 'IME.jpg', ret_img)
Image(filename=save_dir + 'IME.jpg')
#=====================================Convert to spot level============================================#
adata.obs["color"]=meti.extract_color(x_pixel=(np.array(adata.obs["pixel_x"])*resize_factor).astype(np.int64),
    y_pixel=(np.array(adata.obs["pixel_y"])*resize_factor).astype(np.int64), image=ret_img, beta=25)
#Label spots whose color value falls below the 20th percentile as "yes"
color_threshold = adata.obs["color"].quantile(0.2)
adata.obs['Goblet_GE'] = np.where(adata.obs["color"] < color_threshold, "yes", "no")
fig, ax = plt.subplots(figsize=(10, 10)) # Adjust the size as needed
ax.imshow(img)
ax.set_axis_off()
sc.pl.scatter(adata, x='pixel_y', y='pixel_x', color='Goblet_GE', ax=ax, size = 150, title='Goblet GE Spot Annotations')
# Save the figure
plt.savefig('./sample_results/Goblet_spot_GE.png', dpi=300, bbox_inches='tight')
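The spot-level conversion above labels a spot "yes" when its color value falls below the 20th percentile of all spots. The toy example below shows that thresholding step in isolation with NumPy; the values are made up, and in practice the 0.2 cutoff is a parameter you would tune for your own sample.

```python
import numpy as np

# Hypothetical per-spot color values read from the annotation image
color = np.array([10, 200, 30, 250, 240, 220, 230, 210, 245, 235], dtype=float)

# 20th percentile with linear interpolation (the NumPy and pandas default)
threshold = np.quantile(color, 0.2)
labels = np.where(color < threshold, "yes", "no")
print(threshold, labels)
```

Because the cutoff is relative, roughly the darkest fifth of spots is always labeled "yes", even on a section that contains few target cells, so the result is worth checking against the histology.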
Image segmentation
plot_dir="/rsrch4/home/genomic_med/jjiang6/Project1/S1_54078/Segmentation/NC_review_Goblet_seg/"
save_dir=plot_dir+"/seg_results"
adata= sc.read("/rsrch4/home/genomic_med/jjiang6/Project1/S1_54078/TESLA/54078_data.h5ad")
img_path = '/rsrch4/home/genomic_med/jjiang6/Project1/S1_54078/1415785-6 Bx2.tif'
img = tifffile.imread(img_path)
d0, d1= img.shape[0], img.shape[1]
#=====================================Split into patches=====================================================
patch_size=400
patches=meti.patch_split_for_ST(img=img, patch_size=patch_size, spot_info=adata.obs, x_name="pixel_x", y_name="pixel_y")
patch_info=adata.obs
# save results
pickle.dump(patches, open(plot_dir + 'patches.pkl', 'wb'))
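The patch-splitting step cuts one square image patch per spot. As a rough sketch of that idea, the helper below crops a fixed-size window centered on a pixel and shifts it inward near the image border. The name `crop_patch` and the border handling are assumptions, not the behavior of METI's own function, and the image is assumed to be at least `patch_size` pixels on each side.

```python
import numpy as np

def crop_patch(img, center_row, center_col, patch_size):
    """Crop a patch_size x patch_size window, clamped to the image bounds."""
    half = patch_size // 2
    r0 = min(max(center_row - half, 0), img.shape[0] - patch_size)
    c0 = min(max(center_col - half, 0), img.shape[1] - patch_size)
    return img[r0:r0 + patch_size, c0:c0 + patch_size]

# A 10 x 10 toy "image" whose pixel values encode their position
img_toy = np.arange(100).reshape(10, 10)
center_patch = crop_patch(img_toy, 5, 5, 4)
corner_patch = crop_patch(img_toy, 0, 0, 4)
print(center_patch.shape, corner_patch.shape)
```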
#=================================Image Segmentation===================================
meti.Segment_Patches(patches, save_dir=save_dir,n_clusters=10)
#=================================Get masks=================================#
pred_file_locs=[save_dir+"/patch"+str(j)+"_pred.npy" for j in range(patch_info.shape[0])]
dic_list=meti.get_color_dic(patches, seg_dir=save_dir)
masks_index=meti.Match_Masks(dic_list, num_mask_each=5, mapping_threshold1=30, mapping_threshold2=60)
masks=meti.Extract_Masks(masks_index, pred_file_locs, patch_size)
combined_masks=meti.Combine_Masks(masks, patch_info, d0, d1)
#=================================Plot masks=================================#
plot_dir = '../tutorial/data/seg_results/mask'
for i in range(masks.shape[0]): #Each mask
    print("Plotting mask ", str(i))
    ret=(combined_masks[i]*255)
    cv2.imwrite(plot_dir+'/mask'+str(i)+'.png', ret.astype(np.uint8))
#=================================Choose one mask to detect cells/nuclei=================================#
channel=1
converted_image = combined_masks[channel].astype(np.uint8)
ret, labels = cv2.connectedComponents(converted_image)
features=meti.Extract_CC_Features_each_CC(labels)
num_labels = labels.max()
height, width = labels.shape
colors = np.random.randint(0, 255, size=(num_labels + 1, 3), dtype=np.uint8)
colors[0] = [0, 0, 0]
colored_mask = np.zeros((height, width, 3), dtype=np.uint8)
colored_mask = colors[labels]
# save the colored nuclei channel
cv2.imwrite('/rsrch4/home/genomic_med/jjiang6/Project1/S1_54078/Segmentation/NC_review_Goblet_seg/seg_results/goblet.png', colored_mask)
# save nuclei labels
np.save('/rsrch4/home/genomic_med/jjiang6/Project1/S1_54078/Segmentation/NC_review_Goblet_seg/seg_results/labels.npy', labels)
# save nuclei features, including area, length, and width
features.to_csv('/rsrch4/home/genomic_med/jjiang6/Project1/S1_54078/Segmentation/NC_review_Goblet_seg/seg_results/features.csv', index=False)
#=================================filter out goblet cells=================================#
plot_dir="../tutorial/data/seg_results/mask"
os.makedirs(plot_dir, exist_ok=True)
labels=np.load(os.path.join(plot_dir, "labels.npy"))
#Filter - different cell types need different parameter values
features=pd.read_csv(os.path.join(plot_dir, "features.csv"), header=0, index_col=0)
features['mm_ratio'] = features['major_axis_length']/features['minor_axis_length']
features_sub=features[(features["area"]>120) &
(features["area"]<1500) &
(features["solidity"]>0.85) &
(features["mm_ratio"]<2)]
index=features_sub.index.tolist()
labels_filtered=labels*np.isin(labels, index)
np.save(os.path.join(plot_dir, "nuclei_filtered.npy"), labels_filtered)
num_labels = labels_filtered.max()
height, width = labels_filtered.shape
colors = np.random.randint(0, 255, size=(num_labels + 1, 3), dtype=np.uint8)
colors[0] = [0, 0, 0]
colored_mask = np.zeros((height, width, 3), dtype=np.uint8)
colored_mask = colors[labels_filtered]
cv2.imwrite(plot_dir+'/goblet_filtered.png', colored_mask)
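The filtering step keeps only the connected components whose shape features pass the thresholds, then zeroes out everything else with `np.isin`. The toy example below reproduces that pattern using area alone. The label image and the area cutoff are made up, and the real workflow filters on additional features such as solidity and axis ratio.

```python
import numpy as np

# Toy label image: 0 is background, 1-3 are connected components
labels_toy = np.array([[1, 1, 0, 2],
                       [1, 1, 0, 0],
                       [0, 0, 3, 3],
                       [0, 0, 3, 3]])

# Area of each component = number of pixels carrying its label
areas = np.bincount(labels_toy.ravel())

# Keep components with at least 4 pixels (index 0 is background, so skip it)
keep = [lab for lab in range(1, len(areas)) if areas[lab] >= 4]

# Multiplying by the boolean mask sets rejected components to 0
labels_kept = labels_toy * np.isin(labels_toy, keep)
print(keep)
print(labels_kept)
```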
#=====================================Convert to spot level============================================#
plot_dir="./tutorial/sample_results/"
img_seg = np.load(plot_dir+'nuclei_filtered_white.npy')
adata.obs["color"]=meti.extract_color(x_pixel=(np.array(adata.obs["pixel_x"])).astype(np.int64),
y_pixel=(np.array(adata.obs["pixel_y"])).astype(np.int64), image=img_seg, beta=49)
#Label a spot "yes" if its window overlaps any segmented nucleus
adata.obs['Goblet_seg'] = np.where(adata.obs["color"] > 0, "yes", "no")
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(img)
ax.set_axis_off()
sc.pl.scatter(adata, x='pixel_y', y='pixel_x', color='Goblet_seg', ax=ax, size = 150, title='Goblet Segmentation Spot Annotations')
# Save the figure
plt.savefig(plot_dir+'Goblet_spot_seg.png', format='png', dpi=300, bbox_inches='tight')
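Converting a pixel-level mask to spot level amounts to summarizing the mask inside a small window around each spot. The sketch below averages a binary mask in a square window and flags spots with any signal. It assumes `beta` is the window side length in pixels and is only an approximation of what `meti.extract_color` does.

```python
import numpy as np

def window_mean(image, rows, cols, beta):
    """Mean of image values in a beta x beta window around each (row, col)."""
    half = beta // 2
    out = []
    for r, c in zip(rows, cols):
        # Clamp the window to the image bounds
        r0, r1 = max(0, r - half), min(image.shape[0], r + half + 1)
        c0, c1 = max(0, c - half), min(image.shape[1], c + half + 1)
        out.append(float(image[r0:r1, c0:c1].mean()))
    return np.array(out)

mask = np.zeros((10, 10))
mask[2:5, 2:5] = 1  # a 3 x 3 block standing in for one segmented nucleus

values = window_mean(mask, rows=[3, 8], cols=[3, 8], beta=3)
has_nucleus = values > 0
print(values, has_nucleus)
```

A larger `beta` makes a spot more likely to pick up a nearby nucleus, so it should roughly match the spot diameter in pixels.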
Integration of the gene expression result with the segmentation result
adata.obs['Goblet_combined'] = np.where((adata.obs['Goblet_seg'] == 'yes') | (adata.obs['Goblet_GE'] == 'yes'), 'yes', 'no')
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(img)
ax.set_axis_off()
sc.pl.scatter(adata, x='pixel_y', y='pixel_x', color='Goblet_combined', ax=ax, size = 150,title='Goblet Combined Spot Annotations')
# Save the figure
plt.savefig(plot_dir+'Goblet_spot_combined.png', format='png', dpi=300, bbox_inches='tight')
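Since the combined label is a logical OR of the two annotations, it can be worth checking how often they agree before trusting the union. The toy example below counts spots flagged by both methods and spots flagged by only one; the label arrays are made up for illustration.

```python
import numpy as np

# Hypothetical per-spot calls from gene expression and from segmentation
ge = np.array(["yes", "no", "yes", "no"])
seg = np.array(["yes", "yes", "no", "no"])

# Same OR rule as the Goblet_combined column
combined = np.where((ge == "yes") | (seg == "yes"), "yes", "no")

both = int(np.sum((ge == "yes") & (seg == "yes")))      # flagged by both
only_one = int(np.sum((ge == "yes") ^ (seg == "yes")))  # flagged by exactly one
print(combined, both, only_one)
```

If most "yes" spots come from only one method, switching to an AND rule or revisiting the thresholds may be more appropriate for your project.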