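scRNA-seq expression matrices are commonly stored in four formats: 10X single-cell sequencing output, h5, h5ad, and loom. After processing with cellranger, 10X data yields three matrix files: matrix.mtx, barcodes.tsv, and genes.tsv (Cell Ranger 3.0 and later writes features.tsv instead, gzip-compressed, as in the listing below). h5 and h5ad are commonly used to store expression matrices together with their annotation; loom is more common in RNA velocity (velocyto) and transcription factor (SCENIC) analyses.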
1. 10X single-cell sequencing data
library(Seurat)
list.files('case1/filtered_feature_bc_matrix')
[1] "barcodes.tsv.gz" "features.tsv.gz" "matrix.mtx.gz"
count <- Read10X('case1/filtered_feature_bc_matrix')
obj <- CreateSeuratObject(counts = count, min.cells = 3, min.features = 100, project = "case1")
obj
An object of class Seurat
21966 features across 3267 samples within 1 assay
Active assay: RNA (21966 features, 0 variable features)
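To make the three-file layout concrete, here is a minimal sketch that writes a toy matrix in that layout and reads it back by hand. It uses only the Matrix package; the toy data, the directory name toy_10x, and the uncompressed file names are assumptions for illustration. It approximates what a reader such as Read10X has to do and is not Seurat's actual implementation.

```r
library(Matrix)  # recommended package, ships with standard R installations

# Build a toy 3-gene x 2-cell count matrix (hypothetical data).
toy <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(5, 1, 7),
                    dims = c(3, 2))

# Write it out in the three-file layout inside a temporary directory.
dir10x <- file.path(tempdir(), "toy_10x")
dir.create(dir10x, showWarnings = FALSE)
writeMM(toy, file.path(dir10x, "matrix.mtx"))
writeLines(c("AAAC-1", "AAAG-1"), file.path(dir10x, "barcodes.tsv"))
writeLines(c("ENSG01\tGeneA\tGene Expression",
             "ENSG02\tGeneB\tGene Expression",
             "ENSG03\tGeneC\tGene Expression"),
           file.path(dir10x, "features.tsv"))

# Read the three files back and assemble a labelled sparse matrix.
counts <- as(readMM(file.path(dir10x, "matrix.mtx")), "CsparseMatrix")
barcodes <- readLines(file.path(dir10x, "barcodes.tsv"))
features <- read.delim(file.path(dir10x, "features.tsv"), header = FALSE,
                       stringsAsFactors = FALSE)
rownames(counts) <- features$V2  # column 2 holds the gene symbols
colnames(counts) <- barcodes
counts
```

Rows are genes and columns are cell barcodes, which is the orientation CreateSeuratObject expects for its counts argument.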
2. h5
- h5 generated by cellranger
count <- Read10X_h5('case1/filtered_feature_bc_matrix.h5')
obj <- CreateSeuratObject(counts = count, min.cells = 3, min.features = 100, project = "case1")
- Generic h5
library(dior)
obj <- read_h5('fibo_rds.h5')
obj
An object of class Seurat
73202 features across 4257 samples within 2 assays
Active assay: RNA (36601 features, 0 variable features)
1 other assay present: counts
3 dimensional reductions calculated: pca, tsne, umap
3. h5ad
The read_h5ad function depends on the Python packages scanpy and diopy. Make sure both are installed before use; if they are not, install them first: pip install scanpy diopy
library(dior)
obj <- read_h5ad('global_raw.h5ad', target.object = "seurat", assay.name = "RNA")
obj
An object of class Seurat
33538 features across 486134 samples within 1 assay
Active assay: RNA (33538 features, 0 variable features)
2 dimensional reductions calculated: pca, umap
diopy is the Python counterpart of dior. Once installed, it can be used directly from the command line: run scdior --help to see the available parameters and follow the prompts.
4. loom
library(SCopeLoomR)
library(Seurat)
fibo_loom <- connect("fibo_count.loom")
count <- t(fibo_loom[['matrix']][,])
colnames(count) <- fibo_loom[['col_attrs']][['CellID']][]
rownames(count) <- fibo_loom[['row_attrs']][['Gene']][]
obj <- CreateSeuratObject(counts = count, min.cells = 3, min.features = 100, project = "case1")
obj
An object of class Seurat
21114 features across 4257 samples within 1 assay
Active assay: RNA (21114 features, 0 variable features)
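The transpose-and-label steps above, and the effect of min.cells and min.features, can be sketched in base R on a toy matrix. The values and the helper filter_counts are hypothetical. The filtering order (cells first, then genes) reflects my understanding of Seurat's behaviour and should be treated as an assumption, not as its exact implementation.

```r
# Toy matrix in cells x genes orientation (hypothetical values), the layout
# that the t() call above implies for the matrix read from the loom file.
raw <- matrix(c(1, 0, 2,
                0, 0, 3,
                4, 0, 0,
                5, 1, 1),
              nrow = 4, byrow = TRUE)
cell_ids <- c("c1", "c2", "c3", "c4")
gene_ids <- c("g1", "g2", "g3")

# Transpose to genes x cells and attach names, as in the code above.
count <- t(raw)
rownames(count) <- gene_ids
colnames(count) <- cell_ids

# Hypothetical helper: keep cells with at least min_features detected genes,
# then keep genes detected in at least min_cells of the remaining cells.
filter_counts <- function(m, min_cells, min_features) {
  m <- m[, colSums(m > 0) >= min_features, drop = FALSE]
  m[rowSums(m > 0) >= min_cells, , drop = FALSE]
}

filtered <- filter_counts(count, min_cells = 2, min_features = 2)
dim(filtered)
```

With these toy values, cells c2 and c3 are dropped for having a single detected gene, and g2 is then dropped because it is detected in only one of the remaining cells.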
The R package loomR can also be used to handle loom files. Install it with devtools::install_github("mojaveazure/loomR", ref="develop"); feel free to try it yourself if you are interested.
5. dior
Two functions from this R package were used above. Here are all of the functions it exports; each one handles a single task, and its name generally tells you what it does.
library(dior)
ls('package:dior')
[1] "df_to_h5" "h5_to_df" "h5_to_matrix" "matrix_to_h5"
[5] "read_h5" "read_h5ad" "read_h5part" "seurat_write_h5"
[9] "write_h5"
6. sceasy
This R package can also convert between data formats. In practice you only need the convertFormat function: the parameter from = c("anndata", "seurat", "sce", "loom") specifies the source format, and to = c("anndata", "loom", "sce", "seurat", "cds") specifies the target format. The supported combinations are listed below.
devtools::install_github("cellgeni/sceasy")
library(sceasy)
grep('2',ls(asNamespace('sceasy')), value=T)
[1] "anndata2cds" "anndata2seurat" "loom2anndata" "loom2sce"
[5] "sce2anndata" "sce2loom" "seurat2anndata" "seurat2sce"
convertFormat(obj, from='seurat', to='anndata', outFile='fibo.h5ad')
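The conversion can go from an in-memory data object to a file, or from one file to another. However, the package does not seem very user-friendly: the conversion above from a seurat object to anndata format did not succeed, the functions have no help pages, and the GitHub page offers only a brief introduction.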
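Because not every from/to pair is available, it can help to check a combination against the converter names listed earlier before calling convertFormat. The following is a minimal base-R sketch; the helper is_supported is hypothetical and is not part of sceasy.

```r
# Converter names copied from the listing above.
converters <- c("anndata2cds", "anndata2seurat", "loom2anndata", "loom2sce",
                "sce2anndata", "sce2loom", "seurat2anndata", "seurat2sce")

# Split each name at the "2" separator into a from/to pair.
parts <- do.call(rbind, strsplit(converters, "2", fixed = TRUE))
pairs <- data.frame(from = parts[, 1], to = parts[, 2],
                    stringsAsFactors = FALSE)

# Hypothetical helper: is a from/to combination in the list?
is_supported <- function(from, to) {
  any(pairs$from == from & pairs$to == to)
}

is_supported("seurat", "anndata")
is_supported("seurat", "loom")
```

According to the listing, seurat to anndata is a supported pair while seurat to loom is not, so the failed conversion above was not caused by an unsupported combination.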