My old account was banned for no reason, so I am reposting this from a secondary account.
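More spatial transcriptomics articles:
1. New 10X Visium
- [10X Spatial Transcriptomics Visium] (1) Space Ranger 1.0.0 (updated 2019-12-05)
- [10X Spatial Transcriptomics Visium] (2) Loupe Browser 4.0.0
- [10X Spatial Transcriptomics Visium] (3) Notes on running the full Visium workflow end to end
- [10X Spatial Transcriptomics Visium] (4) Exploratory code examples for downstream analysis in R
- [10X Spatial Transcriptomics Visium] (5) Visium principles, workflow, and products
- [10X Spatial Transcriptomics Visium] (6) Hands-on code for analyzing Visium spatial transcriptomics results with the new Seurat v3.2
- [10X Spatial Transcriptomics Visium] (7) Reflections on the answers given by the Seurat v3.2 authors on GitHub
2. Legacy Spatial
- [Legacy Spatial Transcriptomics] (1) ST Spot Detector user guide
- [Legacy Spatial Transcriptomics] (2) Notes on a trial run of the workflow
- [Legacy Spatial Transcriptomics] (3) ST Spot Detector hands-on notes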
Downloading the dataset
https://support.10xgenomics.com/spatial-gene-expression/datasets
I chose: Mouse Brain Section (Coronal)
$ tar -xvf V1_Adult_Mouse_Brain_fastqs.tar
$ ls
V1_Adult_Mouse_Brain_S5_L001_I1_001.fastq.gz V1_Adult_Mouse_Brain_S5_L001_R2_001.fastq.gz V1_Adult_Mouse_Brain_S5_L002_R1_001.fastq.gz
V1_Adult_Mouse_Brain_S5_L001_I2_001.fastq.gz V1_Adult_Mouse_Brain_S5_L002_I1_001.fastq.gz V1_Adult_Mouse_Brain_S5_L002_R2_001.fastq.gz
V1_Adult_Mouse_Brain_S5_L001_R1_001.fastq.gz V1_Adult_Mouse_Brain_S5_L002_I2_001.fastq.gz
- These are the sequencing data for a single sample, spread across 2 lanes.
- Because the library is dual-indexed, each lane has 4 FASTQ files: I1, I2, R1, and R2.
- So there are 8 FASTQ files in total.
This corresponds to the following:
image.png
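The lane and read-type breakdown described above can be checked programmatically. The following is a minimal sketch, assuming the files follow the usual Illumina naming pattern `<sample>_S<n>_L<lane>_<read>_001.fastq.gz`; the regular expression and the helper name `group_fastqs_by_lane` are illustrative and not part of Space Ranger.

```python
import re
from collections import defaultdict

# Assumed Illumina naming pattern: <sample>_S<n>_L<lane>_<read>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_index>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>[IR][12])_001\.fastq\.gz$"
)


def group_fastqs_by_lane(filenames):
    """Return {lane: sorted list of read types} for the given FASTQ names."""
    lanes = defaultdict(list)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected FASTQ name: {name}")
        lanes[match.group("lane")].append(match.group("read"))
    return {lane: sorted(reads) for lane, reads in sorted(lanes.items())}


# The eight file names from the `ls` output above.
files = [
    f"V1_Adult_Mouse_Brain_S5_L00{lane}_{read}_001.fastq.gz"
    for lane in (1, 2)
    for read in ("I1", "I2", "R1", "R2")
]

by_lane = group_fastqs_by_lane(files)
total = sum(len(reads) for reads in by_lane.values())
print(by_lane)
print(total, "FASTQ files in total")
```

Raising on an unexpected name is a deliberate choice here: a misnamed file is usually something to look at before starting a long `spaceranger count` run.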
Running spaceranger count
Here I use the automatic alignment option.
The server has no internet access, so I downloaded the slide file manually:
https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/using/count
$ spaceranger count --id=V1_Adult_Mouse_Brain \
--transcriptome=/share/nas1/Data/luohb/Visium/reference/refdata-cellranger-mm10-3.0.0/ \
--fastqs=/share/nas1/Data/luohb/Visium/test2/V1_Adult_Mouse_Brain_fastqs \
--sample=V1_Adult_Mouse_Brain \
--image=/share/nas1/Data/luohb/Visium/test2/V1_Adult_Mouse_Brain_image.tif \
--slide=V19L01-041 \
--area=C1 \
--slidefile=/share/nas1/Data/luohb/Visium/test2/V19L01-041.gpr \
--localcores=32 \
--localmem=128
The run completed without problems. Because the server was also running several other large jobs at the same time, it took nearly 13 hours...
Inspecting the output files
$ ls
_cmdline _finalstate _jobmode _mrosource _perf _sitecheck _tags _uuid _vdrkill
_filelist _invocation _log outs _perf._truncated_ SPATIAL_RNA_COUNTER_CS _timestamp V1_Adult_Mouse_Brain.mri.tgz _versions
$ cd outs/
$ ls
analysis filtered_feature_bc_matrix metrics_summary.csv possorted_genome_bam.bam raw_feature_bc_matrix spatial
cloupe.cloupe filtered_feature_bc_matrix.h5 molecule_info.h5 possorted_genome_bam.bam.bai raw_feature_bc_matrix.h5 web_summary.html
- View web_summary.html
image.png
image.png
- The count pipeline also writes several CSV files containing the results of the automated secondary analysis:
$cd analysis/
$ls
clustering diffexp pca tsne umap
1. PCA dimensionality reduction results:
$cd pca/10_components
$ls
components.csv dispersion.csv features_selected.csv projection.csv variance.csv
Projection
$head -3 projection.csv
Barcode,PC-1,PC-2,PC-3,PC-4,PC-5,PC-6,PC-7,PC-8,PC-9,PC-10
AAACAAGTATCTCCCA-1,-10.281241313083257,-24.67223115562252,-0.19850052930601336,-2.1734929997144388,6.630976878797487,-0.12128746693282366,6.040708059434257,4.657495740394594,16.344239212184327,6.523601903899456
AAACAATCTACTAGCA-1,17.830458684877186,-27.53526668134934,15.877302377060623,9.74572143694312,-0.7208195934715782,-4.339470398396214,2.5444608437485288,-5.084679351848514,2.9247276185469495,-1.0731021612191327
components matrix
$less -S components.csv
PC,ENSMUSG00000051951,ENSMUSG00000089699,ENSMUSG00000025900,ENSMUSG00000025902,ENSMUSG00000033845,ENSMUSG00000025903,ENSMUSG00000104217,ENSMUSG00000033813,(truncated...)
1,9.807402710059275e-05,-0.0007359419037463138,0.0018506647696503106,0.0019216677830155664,-0.009477278899046813,-0.005003056852125207,0.0,-0.008498306263180
2,-0.0013017257339919546,0.0015759310908915448,0.0013809836795030965,0.0009513422156874659,0.007418499981929492,0.003222355732773671,0.0,0.00887178686827463,
3,-0.001920230193482586,0.003378841598139873,-0.00012165106820253075,-0.00024897415838216264,-0.0031447165300072175,-0.007787586978438225,0.0,-0.003148852394
(truncated...)
Proportion of total variance explained
$head -3 variance.csv
PC,Proportion.Variance.Explained
1,0.030645967432188836
2,0.015067575203691749
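The per-component proportions are easier to judge as a running total. Below is a minimal sketch that parses text in the `variance.csv` layout shown above and accumulates the explained variance; the function name `cumulative_variance` is illustrative, and the sample holds only the two rows printed by `head -3`.

```python
import csv
import io
from itertools import accumulate


def cumulative_variance(csv_text):
    """Parse variance.csv text into [(pc, proportion, cumulative), ...]."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    proportions = [float(r["Proportion.Variance.Explained"]) for r in rows]
    return [
        (int(r["PC"]), p, c)
        for r, p, c in zip(rows, proportions, accumulate(proportions))
    ]


# Only the two rows shown above; the real file has one row per component.
sample = """PC,Proportion.Variance.Explained
1,0.030645967432188836
2,0.015067575203691749
"""

result = cumulative_variance(sample)
for pc, prop, cum in result:
    print(f"PC{pc}: {prop:.4f} (cumulative {cum:.4f})")
```

To run it on the real output, read `variance.csv` into a string and pass that instead of `sample`.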
Normalized dispersion
$head -3 dispersion.csv
Feature,Normalized.Dispersion
ENSMUSG00000051951,0.261762717719762
ENSMUSG00000089699,-1.5988672040435437
2. t-SNE result file:
$cd ../../tsne/2_components/
$ls
projection.csv
$head -5 projection.csv
Barcode,TSNE-1,TSNE-2
AAACAAGTATCTCCCA-1,-18.47081216664088,7.240054873818881
AAACAATCTACTAGCA-1,-4.219964329936257,-9.182632464702484
AAACACCAATAACTGC-1,14.744060324279337,13.360913482080413
AAACAGAGCGACTCCT-1,-11.72411901642397,-7.924228663324808
3. Clustering results:
$cd ../../clustering/
$ls
graphclust kmeans_2_clusters kmeans_4_clusters kmeans_6_clusters kmeans_8_clusters
kmeans_10_clusters kmeans_3_clusters kmeans_5_clusters kmeans_7_clusters kmeans_9_clusters
For each clustering, spaceranger generates a cluster assignment for every spot.
Take a look at the 3-cluster k-means result:
$cd kmeans_3_clusters
$ls
clusters.csv
$head -5 clusters.csv
Barcode,Cluster
AAACAAGTATCTCCCA-1,1
AAACAATCTACTAGCA-1,3
AAACACCAATAACTGC-1,2
AAACAGAGCGACTCCT-1,1
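A quick sanity check on any of these clusterings is to count how many spots landed in each cluster. The sketch below does this for text in the `clusters.csv` layout shown above; `spots_per_cluster` is an illustrative name, and the sample contains only the four rows printed by `head -5`.

```python
import csv
import io
from collections import Counter


def spots_per_cluster(csv_text):
    """Count spots per cluster from clusters.csv text (Barcode,Cluster)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["Cluster"]) for row in reader)


# Only the four rows shown above, not the full file.
sample = """Barcode,Cluster
AAACAAGTATCTCCCA-1,1
AAACAATCTACTAGCA-1,3
AAACACCAATAACTGC-1,2
AAACAGAGCGACTCCT-1,1
"""

counts = spots_per_cluster(sample)
for cluster, n in sorted(counts.items()):
    print(f"cluster {cluster}: {n} spots")
```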
4. Differential expression analysis:
$cd ../../diffexp/
$ls
graphclust kmeans_2_clusters kmeans_4_clusters kmeans_6_clusters kmeans_8_clusters
kmeans_10_clusters kmeans_3_clusters kmeans_5_clusters kmeans_7_clusters kmeans_9_clusters
This time, look at the full table:
$cd graphclust
$ls
differential_expression.csv
$head -3 differential_expression.csv
Feature ID,Feature Name,Cluster 1 Mean Counts,Cluster 1 Log2 fold change,Cluster 1 Adjusted p value,Cluster 2 Mean Counts,Cluster 2 Log2 fold change,Cluster 2 Adjusted p value,Cluster 3 Mean Counts,Cluster 3 Log2 fold change,Cluster 3 Adjusted p value,Cluster 4 Mean Counts,Cluster 4 Log2 fold change,Cluster 4 Adjusted p value,Cluster 5 Mean Counts,Cluster 5 Log2 fold change,Cluster 5 Adjusted p value,Cluster 6 Mean Counts,Cluster 6 Log2 fold change,Cluster 6 Adjusted p value,Cluster 7 Mean Counts,Cluster 7 Log2 fold change,Cluster 7 Adjusted p value,Cluster 8 Mean Counts,Cluster 8 Log2 fold change,Cluster 8 Adjusted p value,Cluster 9 Mean Counts,Cluster 9 Log2 fold change,Cluster 9 Adjusted p value
ENSMUSG00000051951,Xkr4,0.09115907843838432,0.15688013442205495,0.9130108472807676,0.08789156406190936,0.094226986457139,1.0,0.059424476860418934,-0.5579910544947899,0.4792687534164091,0.09747791035014447,0.270272692975412,0.7950049780312995,0.08717356987748102,0.14776402072440886,1.0,0.05406634025868632,-0.6310298603360582,0.7980928917515894,0.15030400022885756,0.9570457266970553,0.22931236900985477,0.0606581027791399,-0.4319057525382224,1.0,0.10761817731957228,0.4400508833584902,1.0
ENSMUSG00000089699,Gm1992,0.0016574377897888059,1.3866145310996707,0.8220253607506287,0.0,0.423008752385563,1.0,0.0,0.22991150489664136,1.0,0.0033613072534532575,2.5793194965660433,0.5338242296758853,0.0,2.3542148981918345,1.0,0.003180372956393313,2.490599584065473,0.8676482778053517,0.0,1.5959470345290159,1.0,0.0,1.4568374963600368,1.0,0.0,2.146642828481177,1.0
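The table is wide: every cluster contributes three columns (mean counts, log2 fold change, adjusted p value), which is awkward to filter. One possible approach, sketched here, is to reshape it into one record per gene and cluster. The column-name pattern is taken from the header above; the function name and output keys are illustrative, and the sample is cut down to the first two of the nine clusters.

```python
import csv
import io
import re

# Matches headers such as "Cluster 3 Log2 fold change".
CLUSTER_COLUMN = re.compile(
    r"^Cluster (\d+) (Mean Counts|Log2 fold change|Adjusted p value)$"
)


def to_long_format(csv_text):
    """Return one dict per (feature, cluster) from the wide table."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        per_cluster = {}
        for column, value in row.items():
            match = CLUSTER_COLUMN.match(column)
            if match:
                cluster, metric = int(match.group(1)), match.group(2)
                per_cluster.setdefault(cluster, {})[metric] = float(value)
        for cluster, metrics in sorted(per_cluster.items()):
            records.append({
                "feature_id": row["Feature ID"],
                "feature_name": row["Feature Name"],
                "cluster": cluster,
                "mean_counts": metrics["Mean Counts"],
                "log2_fold_change": metrics["Log2 fold change"],
                "adjusted_p_value": metrics["Adjusted p value"],
            })
    return records


# First two clusters only, values copied from the rows shown above.
sample = (
    "Feature ID,Feature Name,"
    "Cluster 1 Mean Counts,Cluster 1 Log2 fold change,Cluster 1 Adjusted p value,"
    "Cluster 2 Mean Counts,Cluster 2 Log2 fold change,Cluster 2 Adjusted p value\n"
    "ENSMUSG00000051951,Xkr4,0.09115907843838432,0.15688013442205495,"
    "0.9130108472807676,0.08789156406190936,0.094226986457139,1.0\n"
    "ENSMUSG00000089699,Gm1992,0.0016574377897888059,1.3866145310996707,"
    "0.8220253607506287,0.0,0.423008752385563,1.0\n"
)

records = to_long_format(sample)
for rec in records:
    print(rec["feature_name"], rec["cluster"],
          rec["log2_fold_change"], rec["adjusted_p_value"])
```

With the records in this shape, picking marker genes becomes a simple filter on `adjusted_p_value` and `log2_fold_change`; any cutoff used there is an analysis choice, not something Space Ranger prescribes.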
5. Matrices: Feature-Barcode Matrices
Each element of the matrix is the number of UMIs associated with a feature (row) and a barcode (column).
$cd /share/nas1/Data/luohb/Visium/test2/V1_Adult_Mouse_Brain/outs
$ls
analysis filtered_feature_bc_matrix metrics_summary.csv possorted_genome_bam.bam raw_feature_bc_matrix spatial
cloupe.cloupe filtered_feature_bc_matrix.h5 molecule_info.h5 possorted_genome_bam.bam.bai raw_feature_bc_matrix.h5 web_summary.html
$tree filtered_feature_bc_matrix
filtered_feature_bc_matrix
├── barcodes.tsv.gz
├── features.tsv.gz
└── matrix.mtx.gz
0 directories, 3 files
$tree raw_feature_bc_matrix
raw_feature_bc_matrix
├── barcodes.tsv.gz
├── features.tsv.gz
└── matrix.mtx.gz
0 directories, 3 files
$gzip -cd filtered_feature_bc_matrix/features.tsv.gz |head -3
ENSMUSG00000051951 Xkr4 Gene Expression
ENSMUSG00000089699 Gm1992 Gene Expression
ENSMUSG00000102343 Gm37381 Gene Expression
Where:
Column 1      Column 2      Column 3
Feature ID    Gene name     Feature type
Loading the matrix into R
library(Matrix)
matrix_dir = "/share/nas1/Data/luohb/Visium/test2/V1_Adult_Mouse_Brain/outs/filtered_feature_bc_matrix/"
barcode.path <- paste0(matrix_dir, "barcodes.tsv.gz")
features.path <- paste0(matrix_dir, "features.tsv.gz")
matrix.path <- paste0(matrix_dir, "matrix.mtx.gz")
mat <- readMM(file = matrix.path)
feature.names = read.delim(features.path,
header = FALSE,
stringsAsFactors = FALSE)
barcode.names = read.delim(barcode.path,
header = FALSE,
stringsAsFactors = FALSE)
colnames(mat) = barcode.names$V1
rownames(mat) = feature.names$V1
dim(mat)
[1] 31053 2698
Loading the matrix into Python
import csv
import gzip
import os
import scipy.io
matrix_dir = "/share/nas1/Data/luohb/Visium/test2/V1_Adult_Mouse_Brain/outs/filtered_feature_bc_matrix"
mat = scipy.io.mmread(os.path.join(matrix_dir, "matrix.mtx.gz"))
features_path = os.path.join(matrix_dir, "features.tsv.gz")
# Open the gzip files in text mode ("rt"): on Python 3, csv.reader needs str rows, not bytes.
feature_ids = [row[0] for row in csv.reader(gzip.open(features_path, mode="rt"), delimiter="\t")]
gene_names = [row[1] for row in csv.reader(gzip.open(features_path, mode="rt"), delimiter="\t")]
feature_types = [row[2] for row in csv.reader(gzip.open(features_path, mode="rt"), delimiter="\t")]
barcodes_path = os.path.join(matrix_dir, "barcodes.tsv.gz")
barcodes = [row[0] for row in csv.reader(gzip.open(barcodes_path, mode="rt"), delimiter="\t")]
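The snippet above opens features.tsv.gz three times, once per column. As an alternative sketch, the file can be read a single time into named tuples. The `Feature` tuple and `read_features` helper are illustrative names, not part of any 10x library, and the demo builds a small in-memory gzip stream from the three rows shown earlier so that it runs without the real output directory.

```python
import csv
import gzip
import io
from collections import namedtuple

Feature = namedtuple("Feature", ["feature_id", "name", "feature_type"])


def read_features(path_or_fileobj):
    """Read a features.tsv.gz file once and return a list of Feature tuples."""
    # Text mode ("rt") decodes the gzip stream so csv.reader receives str rows.
    with gzip.open(path_or_fileobj, mode="rt", newline="") as handle:
        return [Feature(*row[:3]) for row in csv.reader(handle, delimiter="\t")]


# The three rows printed by `gzip -cd ... | head -3` above.
rows = [
    ("ENSMUSG00000051951", "Xkr4", "Gene Expression"),
    ("ENSMUSG00000089699", "Gm1992", "Gene Expression"),
    ("ENSMUSG00000102343", "Gm37381", "Gene Expression"),
]
raw = "".join("\t".join(r) + "\n" for r in rows).encode("utf-8")
buffer = io.BytesIO(gzip.compress(raw))

features = read_features(buffer)
for feature in features:
    print(feature.feature_id, feature.name, feature.feature_type)
```

On real data, `read_features(features_path)` with the path from the snippet above should work the same way, since `gzip.open` accepts either a path or a file object.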
6. Images
$cd spatial/
$ls
aligned_fiducials.jpg detected_tissue_image.jpg scalefactors_json.json tissue_hires_image.png tissue_lowres_image.png tissue_positions_list.csv
tissue_hires_image.png: higher-resolution brightfield image
tissue_lowres_image.png: lower-resolution brightfield image
aligned_fiducials.jpg (same dimensions as tissue_hires_image.png): used to verify that fiducial alignment succeeded
The matching pixel-coordinate conversion file: scalefactors_json.json
$cat scalefactors_json.json
{"spot_diameter_fullres": 89.44476048022638, "tissue_hires_scalef": 0.17011142, "fiducial_diameter_fullres": 144.48769000651953, "tissue_lowres_scalef": 0.05
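PS: this part is somewhat like the ST_spot_detector step in the legacy workflow.
Where:
- tissue_hires_scalef: the scale factor that converts pixel positions in the original full-resolution image to pixel positions in tissue_hires_image.png.
- tissue_lowres_scalef: the scale factor that converts pixel positions in the original full-resolution image to pixel positions in tissue_lowres_image.png.
- fiducial_diameter_fullres: the number of pixels spanning the diameter of a fiducial spot in the original full-resolution image.
- spot_diameter_fullres: the number of pixels spanning the diameter of a tissue spot in the original full-resolution image.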
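In practice the scale factors are applied by simple multiplication. The sketch below converts a full-resolution spot center and the spot diameter into tissue_hires_image.png pixels. The helper name `to_image_coords` is illustrative; the numbers are the ones printed by `cat scalefactors_json.json` above, and `tissue_lowres_scalef` is left out because its value is cut off in that output.

```python
# Values copied from the scalefactors_json.json output above.
# tissue_lowres_scalef is omitted: its value is truncated in that output.
scalefactors = {
    "spot_diameter_fullres": 89.44476048022638,
    "tissue_hires_scalef": 0.17011142,
    "fiducial_diameter_fullres": 144.48769000651953,
}


def to_image_coords(pxl_row, pxl_col, scale_factor):
    """Scale a full-resolution (row, col) pixel position to a downsampled image."""
    return pxl_row * scale_factor, pxl_col * scale_factor


# Example full-resolution spot center (row 1252, col 1211), an assumed
# reading of the first tissue_positions_list.csv record.
hires_scale = scalefactors["tissue_hires_scalef"]
hires_row, hires_col = to_image_coords(1252, 1211, hires_scale)
hires_spot_diameter = scalefactors["spot_diameter_fullres"] * hires_scale

print(f"hires position: row {hires_row:.2f}, col {hires_col:.2f}")
print(f"hires spot diameter: {hires_spot_diameter:.2f} px")
```

The same function works for the low-resolution image once the full `tissue_lowres_scalef` value is read from the JSON file.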
detected_tissue_image.jpg:
tissue_positions_list.csv:
$head -2 tissue_positions_list.csv
ACGCCTGACACGCGCT-1,0,0,0,1252,1211
TACCGATCCAACACTT-1,0,1,1,1372,1280
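The columns are, in file order:
- barcode: the sequence of the barcode associated with the spot.
- in_tissue: a binary flag indicating whether the spot falls inside (1) or outside (0) the tissue.
- array_row: the row coordinate of the spot in the array, from 0 to 77. The array has 78 rows.
- array_col: the column coordinate of the spot in the array. To express the orange-crate arrangement of the spots, this index uses even numbers from 0 to 126 for even rows and odd numbers from 1 to 127 for odd rows. Note that each row (even or odd) therefore has 64 spots.
- pxl_row_in_fullres: the row pixel coordinate of the spot center in the full-resolution image.
- pxl_col_in_fullres: the column pixel coordinate of the spot center in the full-resolution image.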
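Because tissue_positions_list.csv has no header row, the column names have to be supplied when reading it. The sketch below assumes the six columns are, in order, barcode, in_tissue, array_row, array_col, pxl_row_in_fullres, pxl_col_in_fullres, and adds a small check of the even/odd rule described for array_col. The helper names are illustrative, and the sample holds only the two records printed by `head -2`.

```python
import csv
import io

# Assumed column order for the headerless file.
COLUMNS = [
    "barcode", "in_tissue", "array_row", "array_col",
    "pxl_row_in_fullres", "pxl_col_in_fullres",
]


def read_positions(csv_text):
    """Parse tissue_positions_list.csv text into a list of dicts."""
    spots = []
    for row in csv.reader(io.StringIO(csv_text)):
        spot = dict(zip(COLUMNS, row))
        for key in COLUMNS[1:]:  # every column except the barcode is an integer
            spot[key] = int(spot[key])
        spots.append(spot)
    return spots


def follows_orange_crate(spot):
    """Even rows use even column indices, odd rows use odd ones."""
    return spot["array_row"] % 2 == spot["array_col"] % 2


# Only the two records shown above.
sample = """ACGCCTGACACGCGCT-1,0,0,0,1252,1211
TACCGATCCAACACTT-1,0,1,1,1372,1280
"""

spots = read_positions(sample)
in_tissue = [s["barcode"] for s in spots if s["in_tissue"] == 1]
print("spots read:", len(spots))
print("spots under tissue:", len(in_tissue))
print("parity rule holds:", all(follows_orange_crate(s) for s in spots))
```

Both sample records have in_tissue set to 0, so the filtered list is empty here; on the full file it should contain the barcodes that also appear in filtered_feature_bc_matrix.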
7. BAM: Barcoded BAM
$cd outs/
$samtools view possorted_genome_bam.bam |head -5
A00984:21:HMKLFDMXX:2:2117:10357:1235 16 1 3000100 255 25M199730N72M23S * 0 0 TTTTTTTTTTTTTTTTTTTTTTTTGCAAGAAAAAAAATCAGATAACCGAGGAAAATTATTCATTATGAAGTACTACTTTCCACTTCATTTCATCCCATGTACTCTGCGTTGATACCACTG F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF NH:i:1 HI:i:1 AS:i:83 nM:i:1 RE:A:I xf:i:0 ts:i:21 li:i:0 BC:Z:ACCAGACAAC QT:Z:FFFFFFFFFF CR:Z:GACGACGATCCGCGTT CY:Z:FFFFFFFFFFFFFFFF CB:Z:GACGACGATCCGCGTT-1 UR:Z:CCTGTTTGTTGT UY:Z:FFFFFFFFFFFF UB:Z:CCTGTTTGTTGT RG:Z:V1_Adult_Mouse_Brain:0:1:HMKLFDMXX:2
A00984:21:HMKLFDMXX:1:1306:5041:10034 16 1 3000100 255 25M199611N95M * 0 0 TTTTTTTTTTTTTTTTTTTTTTTTGAAATGACCACAGTGTACTTTATTTAATGATTTTTGTACTTTGTGTTGCAATAAAATAAAAAAAAAATCTACAAAATTCAAATATATAAAATTTCA FFFF:FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:108 nM:i:0 RE:A:I xf:i:0 li:i:0 BC:Z:ACCAGACAAC QT:Z:FFFFFFFFFF CR:Z:TGGTCTGTTGGGCGTA CY:Z:FFFFFFFFFFFFFFFF CB:Z:TGGTCTGTTGGGCGTA-1 UR:Z:GTTACCCTATGT UY:Z:FFFFFFFFFFFF UB:Z:GTTACCCTATGT RG:Z:V1_Adult_Mouse_Brain:0:1:HMKLFDMXX:1
A00984:21:HMKLFDMXX:2:2345:21206:5087 16 1 3010019 255 98M22S * 0 0 ATAGTGTCCCAGATTTCCTGGCTGTTTCTTGTTAGGATTTTTTTAGATTTAACATTTCTGTCATAGATTAATCTATTTTGCAGATGTAATCCCATGTACTCTGCGTTGATACCACTGCTT F:FFFFFFFFFFF::FFF:FFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFF:FFFFFF NH:i:1 HI:i:1 AS:i:90 nM:i:3 RE:A:I xf:i:0 ts:i:30 li:i:0 BC:Z:ACCAGACAAC QT:Z:FFFFFFFFFF CR:Z:ACGGTCACCGAGACCCY:Z:FFFFFFFFFFFFF,F: CB:Z:ACGGTCACCGAGAACA-1 UR:Z:TCGATCTCGTAA UY:Z:FFFFFFFFFFFF UB:Z:TCGATCTCGTAA RG:Z:V1_Adult_Mouse_Brain:0:1:HMKLFDMXX:2
A00984:21:HMKLFDMXX:1:1164:15980:17738 16 1 3013014 255 17M186702N103M * 0 0 TTTTTTTTTTTTTTTGTTTAAAATGACCACAGTGTACTTTATTTAATGATTTTTGTACTTTGTGTTGCAATAAAATAAAAAAAAAATCTACAAAATTCAAATATATAAAATTTCAAGTTT FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:108 nM:i:0 RE:A:I xf:i:0 li:i:0 BC:Z:ACCAGACAAC QT:Z:FFF,FFFFFF CR:Z:TCAAGGTTACTACACC CY:Z:FFFFFFFFFFF:FFFF CB:Z:TCAAGGTTACTACACC-1 UR:Z:CCGGGCAGTTAT UY:Z:FFFFFFFFFFFF UB:Z:CCGGGCAGTTAT RG:Z:V1_Adult_Mouse_Brain:0:1:HMKLFDMXX:1
A00984:21:HMKLFDMXX:1:1451:3477:33912 16 1 3013014 255 17M186702N103M * 0 0 TTTTTTTTTTTTTTTGTTTAAAATGACCACAGTGTACTTTATTTAATGATTTTTGTACTTTGTGTTGCAATAAAATAAAAAAAAAATCTACAAAATTCAAATATATAAAATTTCAAGTTT FFFFFFFFFFFFFFFF:FF:FFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF NH:i:1 HI:i:1 AS:i:108 nM:i:0 RE:A:I xf:i:0 li:i:0 BC:Z:ACCAGACAAC QT:Z:FFFFFFFFFF CR:Z:TCAAGGTTACTACACC CY:Z:FFFFFFFFFFF:F,FF CB:Z:TCAAGGTTACTACACC-1 UR:Z:CCGGGCAGTTAT UY:Z:FFFFFFFFFFFF UB:Z:CCGGGCAGTTAT RG:Z:V1_Adult_Mouse_Brain:0:1:HMKLFDMXX:1
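At first I did not seem to see the structure described on the official site, as in AGAATGGTCTGCAT-1,
where the spot barcode CB tag carries a suffix made of a dash separator followed by a number... (On a closer look, the CB tags in the records above, for example CB:Z:GACGACGATCCGCGTT-1, do end in -1.)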
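The tags are easier to inspect once they are split out of the record. The following is a minimal sketch that parses the optional `TAG:TYPE:VALUE` fields of one SAM text line with the standard library only; `parse_sam_tags` is an illustrative helper, not a samtools or pysam function. The demo line is the first record above with SEQ and QUAL replaced by `*` for brevity, and with tabs restored as the field separator.

```python
def parse_sam_tags(sam_line):
    """Return the optional tags of one SAM text line as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 columns are the mandatory SAM fields
        tag, value_type, value = field.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    return tags


# First record from the samtools output above; SEQ and QUAL are abbreviated to "*".
record = "\t".join([
    "A00984:21:HMKLFDMXX:2:2117:10357:1235", "16", "1", "3000100", "255",
    "25M199730N72M23S", "*", "0", "0", "*", "*",
    "NH:i:1", "HI:i:1", "AS:i:83", "nM:i:1", "RE:A:I", "xf:i:0", "ts:i:21",
    "li:i:0", "BC:Z:ACCAGACAAC", "QT:Z:FFFFFFFFFF",
    "CR:Z:GACGACGATCCGCGTT", "CY:Z:FFFFFFFFFFFFFFFF",
    "CB:Z:GACGACGATCCGCGTT-1", "UR:Z:CCTGTTTGTTGT", "UY:Z:FFFFFFFFFFFF",
    "UB:Z:CCTGTTTGTTGT", "RG:Z:V1_Adult_Mouse_Brain:0:1:HMKLFDMXX:2",
])

tags = parse_sam_tags(record)
barcode, suffix = tags["CB"].rsplit("-", 1)
print("spot barcode:", barcode, "suffix:", suffix)
print("UMI:", tags["UB"])
```

Splitting each field with a maximum of two splits keeps values that themselves contain colons, such as the RG tag, intact.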
Downstream analysis in R
At the moment there is no ready-made R package for 10X Visium spatial transcriptomics, so I had to follow the R code on the official site.
Official site: https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/rkit
Downstream analysis with Loupe Browser 4.0.0
- Open Xftp, then open cloupe.cloupe
image.png
- View the t-SNE projection
image.png
- UMAP
image.png
- Feature Plot
image.png
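The Feature Plot view lets you visualize the expression levels of one or two genes for each spot. This view makes it easy to threshold groups of spots based on the expression levels of one or two genes. Features (genes, in this case) can be entered in the text boxes at the top of the Y axis or to the right of the X axis. These selectors also contain a control for switching the axis scale between linear and logarithmic.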
image.png