Author: 追風少年i
It's the last week before the National Day holiday, so let's all make it count. As a well-known line puts it: "Yesterday is history, tomorrow is a mystery, but today is a gift. That is why it's called the present."
This post answers some questions that have come up in projects and briefly shares how to measure heterogeneity between samples.
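Recently, many researchers have come to us for analysis. Most of them bring disease samples and ask for "analysis", but cannot say what they actually want analyzed, which leaves me with very little to work from. Even a simple study design needs to be prepared in advance.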
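Disease samples: at its core, single-cell analysis is still about grouping samples and finding the differences between groups. The differences are simply examined at a higher dimension, whether they are differences in cell-cell communication or in transcriptional state. The precondition is that there are groups to compare. The ideal setup is normal versus disease. Many researchers cannot obtain normal samples, and in that case clinical information becomes especially important: differences between good-prognosis and poor-prognosis patients are what carry clinical value.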
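With only disease samples and no clinical information, it is very hard to extract anything useful. At that point many researchers turn to single-cell databases and look for normal data in published cohorts to compare against. The approach is reasonable, but pay attention to how well the data match, to platform differences, and so on.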
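Sample annotation: as I have stressed many times, annotation by a fixed pipeline is not advisable. It is commonly said that Singleron (新格元) does annotation well, but that quality comes from a large amount of manual effort, not from a pipeline. Reference material and experience are also essential for any researcher.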
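Sample size: at this stage of the single-cell field, publishing a paper (in a Q2 journal or above) from just one or two samples is unrealistic. Whatever the angle of analysis, the value of the finding has to be validated across multiple samples, so think this through before committing to a single-cell project.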
That's enough on study design, which is a large topic in its own right and is usually handled by technical support. Next, I'll share something simple: how to measure heterogeneity between samples.
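- Indicator 1: Cluster entropy. To measure how well cells from different samples mix within cell-type clusters, the normalized relative cluster entropy of the dataset is quantified, weighted by cluster size. A cluster entropy of 1 means the samples are perfectly mixed across clusters.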
Reference post: "Entropy analysis of gene expression in 10X single-cell (10X spatial transcriptomics) data" (10X單細胞(10X空間轉錄組)基因表達的熵值分析)
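- Indicator 2: Similarity scores/alignment. To measure the transcriptional variation in cell state within a cell type, between cells from the same versus different batches and/or samples, the pairwise alignment (that is, integration) between each pair of samples/batches is measured, where a batch consists of the group of samples processed on the same day. This "similarity score" examines the local neighborhood of every cell in a given sample/batch and asks how many of its k nearest neighbors (in PC or iNMF space, which most readers will already know) belong to the second sample/batch, then averages over all cells. Here k is set to 1% of the total number of cells in the cluster. The result is normalized by the expected number of cells from each sample/batch.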
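The two indicators above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation behind the post. The post does not give exact formulas, so several choices here are assumptions: the function names `cluster_entropy` and `similarity_score` are hypothetical, the entropy is computed on per-sample fractions (so a large sample does not dominate) and divided by the log of the number of samples, and the similarity score is normalized by the fraction of neighbors expected under perfect mixing. The neighbor search is brute force, so it only suits small inputs.

```python
import numpy as np


def cluster_entropy(samples, clusters):
    """Size-weighted, normalized entropy of sample mixing across clusters."""
    samples, clusters = np.asarray(samples), np.asarray(clusters)
    sample_ids, s_idx = np.unique(samples, return_inverse=True)
    cluster_ids, c_idx = np.unique(clusters, return_inverse=True)
    if len(sample_ids) < 2:
        raise ValueError("need at least two samples")
    # counts[c, s] = number of cells from sample s in cluster c
    counts = np.zeros((len(cluster_ids), len(sample_ids)))
    np.add.at(counts, (c_idx, s_idx), 1)
    # Assumption: use the fraction of each sample that falls in the cluster,
    # so unequal sample sizes do not look like poor mixing.
    rel = counts / counts.sum(axis=0, keepdims=True)
    p = rel / rel.sum(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p > 0, p * np.log(p), 0.0)
    # Dividing by log(number of samples) puts each cluster's entropy in [0, 1].
    entropy = -terms.sum(axis=1) / np.log(len(sample_ids))
    weights = counts.sum(axis=1) / counts.sum()  # weight by cluster size
    return float(np.sum(weights * entropy))


def similarity_score(embedding, batch, a, b, k=None):
    """Mean fraction of batch-b cells among the k nearest neighbors of batch-a
    cells, divided by the fraction expected under perfect mixing.

    `embedding` holds the cells of ONE cluster in PC (or iNMF) space.
    """
    embedding = np.asarray(embedding, dtype=float)
    batch = np.asarray(batch)
    if k is None:
        k = max(1, int(round(0.01 * len(batch))))  # 1% of the cluster's cells
    keep = (batch == a) | (batch == b)
    x, lab = embedding[keep], batch[keep]
    n = len(lab)
    n_b = int((lab == b).sum())
    if n_b == 0 or n_b == n:
        raise ValueError("both batches must be present")
    k = min(k, n - 1)
    # Brute-force pairwise squared distances; a cell is not its own neighbor.
    d = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    frac_b = (lab[nn] == b).mean(axis=1)
    observed = frac_b[lab == a].mean()
    expected = n_b / (n - 1)  # assumption: chance level excluding the cell itself
    return float(observed / expected)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pcs = rng.normal(size=(200, 10))  # toy "PC space" for a single cluster
    batch = np.repeat(["s1", "s2"], 100)
    print(cluster_entropy(batch, np.zeros(200, dtype=int)))
    print(similarity_score(pcs, batch, "s1", "s2"))
```

Under these assumptions, a cluster entropy near 1 and a similarity score near 1 both indicate well-mixed samples, while values near 0 indicate that the samples occupy separate clusters or separate neighborhoods within a cluster.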
Finally, here is a summary of all the functions in the Seurat package.
Function | Description |
---|---|
AddModuleScore | Calculate module scores for feature expression programs in single cells |
AggregateExpression | Aggregated feature expression by identity class |
AnchorSet-class | The AnchorSet Class |
AnnotateAnchors | Add info to anchor matrix |
Assay-class | The Assay Class |
AugmentPlot | Augments ggplot2-based plot with a PNG image. |
AverageExpression | Averaged feature expression by identity class |
BGTextColor | Determine text color based on background color |
BarcodeInflectionsPlot | Plot the Barcode Distribution and Calculated Inflection Points |
BlackAndWhite | Create a custom color palette |
BuildClusterTree | Phylogenetic Analysis of Identity Classes |
CalcPerturbSig | Calculate a perturbation Signature |
CalculateBarcodeInflections | Calculate the Barcode Distribution Inflection |
CaseMatch | Match the case of character vectors |
CellCycleScoring | Score cell cycle phases |
CellScatter | Cell-cell scatter plot |
CellSelector | Cell Selector |
Cells.SCTModel | Get Cell Names |
CellsByImage | Get a vector of cell names associated with an image (or set of images) |
CollapseEmbeddingOutliers | Move outliers towards center on dimension reduction plot |
CollapseSpeciesExpressionMatrix | Slim down a multi-species expression matrix, when only one species is primarily of interest. |
ColorDimSplit | Color dimensional reduction plot by tree split |
CombinePlots | Combine ggplot2-based plots into a single plot |
CreateSCTAssayObject | Create a SCT Assay object |
CustomDistance | Run a custom distance function on an input data matrix |
DEenrichRPlot | DE and EnrichR pathway visualization barplot |
DietSeurat | Slim down a Seurat object |
DimHeatmap | Dimensional reduction heatmap |
DimPlot | Dimensional reduction plot |
DimReduc-class | The DimReduc Class |
DiscretePalette | Discrete colour palettes from the pals package |
DoHeatmap | Feature expression heatmap |
DotPlot | Dot plot visualization |
ElbowPlot | Quickly Pick Relevant Dimensions |
ExpMean | Calculate the mean of logged values |
ExpSD | Calculate the standard deviation of logged values |
ExpVar | Calculate the variance of logged values |
FastRowScale | Scale and/or center matrix rowwise |
FeaturePlot | Visualize 'features' on a dimensional reduction plot |
FeatureScatter | Scatter plot of single cell data |
FilterSlideSeq | Filter stray beads from Slide-seq puck |
FindAllMarkers | Gene expression markers for all identity classes |
FindClusters | Cluster Determination |
FindConservedMarkers | Finds markers that are conserved between the groups |
FindIntegrationAnchors | Find integration anchors |
FindMarkers | Gene expression markers of identity classes |
FindMultiModalNeighbors | Construct weighted nearest neighbor graph |
FindNeighbors | (Shared) Nearest-neighbor graph construction |
FindSpatiallyVariableFeatures | Find spatially variable features |
FindSubCluster | Find subclusters under one cluster |
FindTransferAnchors | Find transfer anchors |
FindVariableFeatures | Find variable features |
FoldChange | Fold Change |
GetAssay | Get an Assay object from a given Seurat object. |
GetImage.SlideSeq | Get Image Data |
GetIntegrationData | Get integration data |
GetResidual | Calculate pearson residuals of features not in the scale.data |
GetTissueCoordinates.SlideSeq | Get Tissue Coordinates |
GetTransferPredictions | Get the predicted identity |
Graph-class | The Graph Class |
GroupCorrelation | Compute the correlation of features broken down by groups with another covariate |
GroupCorrelationPlot | Boxplot of correlation of a variable (e.g. number of UMIs) with expression data |
HTODemux | Demultiplex samples based on data from cell 'hashing' |
HTOHeatmap | Hashtag oligo heatmap |
HVFInfo.SCTAssay | Get Variable Feature Information |
HoverLocator | Hover Locator |
IFeaturePlot | Visualize features in dimensional reduction space interactively |
ISpatialDimPlot | Visualize clusters spatially and interactively |
ISpatialFeaturePlot | Visualize features spatially and interactively |
IntegrateData | Integrate data |
IntegrateEmbeddings | Integrate low dimensional embeddings |
IntegrationAnchorSet-class | The IntegrationAnchorSet Class |
IntegrationData-class | The IntegrationData Class |
JackStraw | Determine statistical significance of PCA scores. |
JackStrawData-class | The JackStrawData Class |
JackStrawPlot | JackStraw Plot |
L2CCA | L2-Normalize CCA |
L2Dim | L2-normalization |
LabelClusters | Label clusters on a ggplot2-based scatter plot |
LabelPoints | Add text labels to a ggplot2 plot |
LinkedPlots | Visualize spatial and clustering (dimensional reduction) data in a linked, interactive framework |
Load10X_Spatial | Load a 10x Genomics Visium Spatial Experiment into a 'Seurat' object |
LoadAnnoyIndex | Load the Annoy index file |
LoadSTARmap | Load STARmap data |
LocalStruct | Calculate the local structure preservation metric |
LogNormalize | Normalize raw data |
LogVMR | Calculate the variance to mean ratio of logged values |
MULTIseqDemux | Demultiplex samples based on classification method from MULTI-seq |
MapQuery | Map query cells to a reference |
MappingScore | Metric for evaluating mapping success |
MetaFeature | Aggregate expression of multiple features into a single feature |
MinMax | Apply a ceiling and floor to all values in a matrix |
MixingMetric | Calculates a mixing metric |
MixscapeHeatmap | Differential expression heatmap for mixscape |
MixscapeLDA | Linear discriminant analysis on pooled CRISPR screen data. |
ModalityWeights-class | The ModalityWeights Class |
NNPlot | Highlight Neighbors in DimPlot |
Neighbor-class | The Neighbor Class |
NormalizeData | Normalize Data |
PCASigGenes | Significant genes from a PCA |
PercentageFeatureSet | Calculate the percentage of all counts that belong to a given set of features |
PlotClusterTree | Plot clusters as a tree |
PlotPerturbScore | Function to plot perturbation score distributions. |
PolyDimPlot | Polygon DimPlot |
PolyFeaturePlot | Polygon FeaturePlot |
PredictAssay | Predict value from nearest neighbors |
PrepLDA | Function to prepare data for Linear Discriminant Analysis. |
PrepSCTIntegration | Prepare an object list normalized with sctransform for integration. |
ProjectDim | Project Dimensional reduction onto full dataset |
ProjectUMAP | Project query into UMAP coordinates of reference |
Radius.SlideSeq | Get Spot Radius |
Read10X | Load in data from 10X |
Read10X_Image | Load a 10X Genomics Visium Image |
Read10X_h5 | Read 10X hdf5 file |
ReadMtx | Load in data from remote or local mtx files |
ReadSlideSeq | Load Slide-seq spatial data |
RegroupIdents | Regroup idents based on meta.data info |
RelativeCounts | Normalize raw data to fractions |
RenameCells.SCTAssay | Rename Cells in an Object |
RidgePlot | Single cell ridge plot |
RunCCA | Perform Canonical Correlation Analysis |
RunICA | Run Independent Component Analysis on gene expression |
RunLDA | Run Linear Discriminant Analysis |
RunMarkVario | Run the mark variogram computation on a given position matrix and expression matrix. |
RunMixscape | Run Mixscape |
RunMoransI | Compute Moran's I value. |
RunPCA | Run Principal Component Analysis |
RunSPCA | Run Supervised Principal Component Analysis |
RunTSNE | Run t-distributed Stochastic Neighbor Embedding |
RunUMAP | Run UMAP |
SCTAssay-class | The SCTModel Class |
SCTResults | Get SCT results from an Assay |
SCTransform | Use regularized negative binomial regression to normalize UMI count data |
STARmap-class | The STARmap class |
SampleUMI | Sample UMI |
SaveAnnoyIndex | Save the Annoy index |
ScaleData | Scale and center the data. |
ScaleFactors | Get image scale factors |
ScoreJackStraw | Compute Jackstraw scores significance. |
SelectIntegrationFeatures | Select integration features |
SetIntegrationData | Set integration data |
Seurat-class | The Seurat Class |
Seurat-package | Seurat: Tools for Single Cell Genomics |
SeuratCommand-class | The SeuratCommand Class |
SeuratTheme | Seurat Themes |
SlideSeq-class | The SlideSeq class |
SpatialImage-class | The SpatialImage Class |
SpatialPlot | Visualize spatial clustering and expression data. |
SplitObject | Splits object into a list of subsetted objects. |
SubsetByBarcodeInflections | Subset a Seurat Object based on the Barcode Distribution Inflection Points |
TopCells | Find cells with highest scores for a given dimensional reduction technique |
TopFeatures | Find features with highest scores for a given dimensional reduction technique |
TopNeighbors | Get nearest neighbors for given cell |
TransferAnchorSet-class | The TransferAnchorSet Class |
TransferData | Transfer data |
UpdateSCTAssays | Update pre-V4 Assays generated with SCTransform in the Seurat to the new SCTAssay class |
UpdateSymbolList | Get updated synonyms for gene symbols |
VariableFeaturePlot | View variable features |
VisiumV1-class | The VisiumV1 class |
VizDimLoadings | Visualize Dimensional Reduction genes |
VlnPlot | Single cell violin plot |
as.CellDataSet | Convert objects to CellDataSet objects |
as.Seurat.CellDataSet | Convert objects to 'Seurat' objects |
as.SingleCellExperiment | Convert objects to SingleCellExperiment objects |
as.sparse.H5Group | Cast to Sparse |
cc.genes | Cell cycle genes |
cc.genes.updated.2019 | Cell cycle genes: 2019 update |
contrast-theory | Get the intensity and/or luminance of a color |
merge.SCTAssay | Merge SCTAssay objects |
subset.AnchorSet | Subset an AnchorSet object |
It's Monday, so I'm taking it a little easy. Life is good, and it's even better with you.