Seurat Weekly NO.0 || 開刊詞

作為紀念開個專欄:每周精選Seurat社區(qū)有趣的問答以饗國內(nèi)Seurat的用戶吝岭。

為什么要做這個事情呢寞射?

  • 翻譯學(xué)習(xí)法
  • 認知學(xué)習(xí)法
  • 通勤學(xué)習(xí)法
  • 追蹤工具動態(tài)
  • 相信我,你并不孤獨

因為國內(nèi)的單細胞數(shù)據(jù)分析人員越來越多绰咽,而Seurat為單細胞數(shù)據(jù)分析提供了一個很好的思考框架粹污,當(dāng)然Seurat發(fā)表的文章有很多。大家在應(yīng)用這個單細胞數(shù)據(jù)分析工具的時候不免會遇到這樣或那樣的問題捏检,這雖然是讓人苦惱的荞驴,但是以為偉人說過:好的問題要比好的答案價值百倍。于是贯城,我在Seurat的github上watch了所有的問題熊楼,所以我的個人郵箱是這樣的:

所以在通勤路上看看大家都踩了Seurat的哪些坑,Seurat社區(qū)成員又是如何回復(fù)的能犯,不能不說是一件有意思的事兒鲫骗。何況還能學(xué)習(xí)一下英語啦_.

計劃是每周精選不多于10條的Issues犬耻,在翻譯和復(fù)現(xiàn)這些issue的時候也會摻雜一些自己對Seurat的學(xué)習(xí)心得,以促進Seurat學(xué)者了解這個項目执泰。

目前github上面大部分的交流用的還是英語枕磁,我們?yōu)槭裁床挥脻h語在上面交流呢?那樣就達不到學(xué)習(xí)英語的目的了吧术吝。

凡事善始者實繁计济,克終者蓋寡。Seurat Weekly能做幾期呢顿苇?我們不能承諾峭咒,畢竟是一項無人資助的事業(yè),看緣分呢纪岁。

最為NO.0凑队,今天主要探索一下選題與排版的一般規(guī)則。

選題的第一個規(guī)則就是某個話題出現(xiàn)的頻率幔翰,我們相信一個有很多人愿意討論的話題更大概率是一個好問題漩氨。其次是主編本人覺得有意思的問題。

至于排版遗增,我們按照一般的FAQ的形式給出叫惊,問題部分我們給出原文描述,答案就是主編在看過別人的回答之后做修,夾雜著個人觀點的品論霍狰。

每個問題,我們也給出在github的鏈接饰及,方便讀者朋友就原問題進行解答蔗坯。

下面走幾個例子:

Getting parameters from pre-existing reduction (umap) (#3053)

github地址: https://github.com/satijalab/seurat/issues/3053

這位oh111 的意思應(yīng)該是手里有了Seurat對象,想知道這個對象之前的作者是如何操作的燎含。其實Seurat是給了每個操作過程的參數(shù)記錄的:

就拿我們seurat自帶的數(shù)據(jù)集來看吧:

Seurat::pbmc_small@commands

這返回的是一個list宾濒,他說想看PCA的參數(shù),其實這樣就可以了:

Seurat::pbmc_small@commands$RunPCA.RNA
Command: RunPCA(object = pbmc_small, features = VariableFeatures(object = pbmc_small),     verbose = FALSE)
Time: 2018-08-28 04:34:56
assay : RNA 
features : PPBP IGLL5 VDAC3 CD1C AKR1C3 PF4 MYL9 GNLY TREML1 CA2 SDPR PGRMC1 S100A8 TUBB1 HLA-DQA1 PARVB RUFY1 HLA-DPB1 RP11-290F20.3 S100A9 
compute.dims : 20 
rev.pca : FALSE 
weight.by.var : TRUE 
verbose : FALSE 
print.dims : 1 2 3 4 5 
features.print : 30 
reduction.name : pca 
reduction.key : PC 
seed.use : 42 

Reorder cells by expression value in DoHeatmap & Dendogram (#3036)

github 地址:https://github.com/satijalab/seurat/issues/3036

這是一個可視化細節(jié)的問題屏箍,有時候我們用默認參數(shù)出來的圖并不能直接達到發(fā)表的水平绘梦,或者自己有些小心思想改一下這個圖,最常見的就是該標簽的順序赴魁。其實我們知道這個一般是通過因子變量的levels來控制的卸奉。

有位仁兄給出了這個鏈接: https://stackoverflow.com/questions/52136211/how-to-reorder-cells-in-doheatmap-plot-in-seurat-ggplot2

那么我們來看一下Seurat的繪圖是如何實現(xiàn)的,看DoHeatmapd的源代碼:

 DoHeatmap
function (object, features = NULL, cells = NULL, group.by = "ident", 
    group.bar = TRUE, group.colors = NULL, disp.min = -2.5, disp.max = NULL, 
    slot = "scale.data", assay = NULL, label = TRUE, size = 5.5, 
    hjust = 0, angle = 45, raster = TRUE, draw.lines = TRUE, 
    lines.width = NULL, group.bar.height = 0.02, combine = TRUE) 
{
    cells <- cells %||% colnames(x = object)
    if (is.numeric(x = cells)) {
        cells <- colnames(x = object)[cells]
    }
    assay <- assay %||% DefaultAssay(object = object)
    DefaultAssay(object = object) <- assay
    features <- features %||% VariableFeatures(object = object)
    features <- rev(x = unique(x = features))
    disp.max <- disp.max %||% ifelse(test = slot == "scale.data", 
        yes = 2.5, no = 6)
    possible.features <- rownames(x = GetAssayData(object = object, 
        slot = slot))
    if (any(!features %in% possible.features)) {
        bad.features <- features[!features %in% possible.features]
        features <- features[features %in% possible.features]
        if (length(x = features) == 0) {
            stop("No requested features found in the ", slot, 
                " slot for the ", assay, " assay.")
        }
        warning("The following features were omitted as they were not found in the ", 
            slot, " slot for the ", assay, " assay: ", paste(bad.features, 
                collapse = ", "))
    }
    data <- as.data.frame(x = as.matrix(x = t(x = GetAssayData(object = object, 
        slot = slot)[features, cells, drop = FALSE])))
    object <- suppressMessages(expr = StashIdent(object = object, 
        save.name = "ident"))
    group.by <- group.by %||% "ident"
    groups.use <- object[[group.by]][cells, , drop = FALSE]
    plots <- vector(mode = "list", length = ncol(x = groups.use))
    for (i in 1:ncol(x = groups.use)) {
        data.group <- data
        group.use <- groups.use[, i, drop = TRUE]
        if (!is.factor(x = group.use)) {
            group.use <- factor(x = group.use)
        }
        names(x = group.use) <- cells
        if (draw.lines) {
            lines.width <- lines.width %||% ceiling(x = nrow(x = data.group) * 
                0.0025)
            placeholder.cells <- sapply(X = 1:(length(x = levels(x = group.use)) * 
                lines.width), FUN = function(x) {
                return(RandomName(length = 20))
            })
            placeholder.groups <- rep(x = levels(x = group.use), 
                times = lines.width)
            group.levels <- levels(x = group.use)
            names(x = placeholder.groups) <- placeholder.cells
            group.use <- as.vector(x = group.use)
            names(x = group.use) <- cells
            group.use <- factor(x = c(group.use, placeholder.groups), 
                levels = group.levels)
            na.data.group <- matrix(data = NA, nrow = length(x = placeholder.cells), 
                ncol = ncol(x = data.group), dimnames = list(placeholder.cells, 
                  colnames(x = data.group)))
            data.group <- rbind(data.group, na.data.group)
        }
        lgroup <- length(levels(group.use))
        plot <- SingleRasterMap(data = data.group, raster = raster, 
            disp.min = disp.min, disp.max = disp.max, feature.order = features, 
            cell.order = names(x = sort(x = group.use)), group.by = group.use)
        if (group.bar) {
            default.colors <- c(hue_pal()(length(x = levels(x = group.use))))
            cols <- group.colors[1:length(x = levels(x = group.use))] %||% 
                default.colors
            if (any(is.na(x = cols))) {
                cols[is.na(x = cols)] <- default.colors[is.na(x = cols)]
                cols <- Col2Hex(cols)
                col.dups <- sort(x = unique(x = which(x = duplicated(x = substr(x = cols, 
                  start = 1, stop = 7)))))
                through <- length(x = default.colors)
                while (length(x = col.dups) > 0) {
                  pal.max <- length(x = col.dups) + through
                  cols.extra <- hue_pal()(pal.max)[(through + 
                    1):pal.max]
                  cols[col.dups] <- cols.extra
                  col.dups <- sort(x = unique(x = which(x = duplicated(x = substr(x = cols, 
                    start = 1, stop = 7)))))
                }
            }
            group.use2 <- sort(x = group.use)
            if (draw.lines) {
                na.group <- RandomName(length = 20)
                levels(x = group.use2) <- c(levels(x = group.use2), 
                  na.group)
                group.use2[placeholder.cells] <- na.group
                cols <- c(cols, "#FFFFFF")
            }
            pbuild <- ggplot_build(plot = plot)
            names(x = cols) <- levels(x = group.use2)
            y.range <- diff(x = pbuild$layout$panel_params[[1]]$y.range)
            y.pos <- max(pbuild$layout$panel_params[[1]]$y.range) + 
                y.range * 0.015
            y.max <- y.pos + group.bar.height * y.range
            plot <- plot + annotation_raster(raster = t(x = cols[group.use2]), 
                xmin = -Inf, xmax = Inf, ymin = y.pos, ymax = y.max) + 
                coord_cartesian(ylim = c(0, y.max), clip = "off") + 
                scale_color_manual(values = cols)
            if (label) {
                x.max <- max(pbuild$layout$panel_params[[1]]$x.range)
                x.divs <- pbuild$layout$panel_params[[1]]$x.major
                x <- data.frame(group = sort(x = group.use), 
                  x = x.divs)
                label.x.pos <- tapply(X = x$x, INDEX = x$group, 
                  FUN = median) * x.max
                label.x.pos <- data.frame(group = names(x = label.x.pos), 
                  label.x.pos)
                plot <- plot + geom_text(stat = "identity", data = label.x.pos, 
                  aes_string(label = "group", x = "label.x.pos"), 
                  y = y.max + y.max * 0.03 * 0.5, angle = angle, 
                  hjust = hjust, size = size)
                plot <- suppressMessages(plot + coord_cartesian(ylim = c(0, 
                  y.max + y.max * 0.002 * max(nchar(x = levels(x = group.use))) * 
                    size), clip = "off"))
            }
        }
        plot <- plot + theme(line = element_blank())
        plots[[i]] <- plot
    }
    if (combine) {
        plots <- CombinePlots(plots = plots)
    }
    return(plots)
}
<bytecode: 0x11837bb8>
<environment: namespace:Seurat>

我們發(fā)現(xiàn)Seurat的圖是用ggplot2實現(xiàn)的颖御,這樣我們就可以基于函數(shù)的返回對象來用自己的ggplot功底修復(fù)了:

require(Seurat)
p <- DoHeatmap(pbmc_small)

p$theme
p$layers
p$mapping
 head(p$data)
        Feature           Cell Expression Identity
1        S100A9 ATGCCAGAACGACT -0.7639656        0
2 RP11-290F20.3 ATGCCAGAACGACT -0.3730316        0
3      HLA-DPB1 ATGCCAGAACGACT -1.0399941        0
4         RUFY1 ATGCCAGAACGACT -0.4098329        0
5         PARVB ATGCCAGAACGACT -0.3461794        0
6      HLA-DQA1 ATGCCAGAACGACT -0.6717070        0

Time series dataset in two conditions #3040

github 地址:https://github.com/satijalab/seurat/issues/3040

這是一個常見的問題择卦,多個樣本在許多地方還是值得探討的,這位回答者直接給出了兩篇參考文獻:

I would recommend that you review the Guided tutorial, Multiple dataset integration (specifically the SCT method), and the Stimulated vs Control PBMCs, in that order. Due the large probable size of your final dataset, I think that the manipulations found within the Mouse Cell Atlas Vignette might also prove useful. The VIsualization and Cell cycle regression vignettes were also particularly helpful for our analysis of similar conditions. Lastly, I recommend that you review the methods in Farnsworth et al. 2020 as well as Soldatov et al., 2019 for further help. Hope that this was useful.

Optimal strategy to process samples to compare two different condition. (#3019)

這個其實和上個問題比較像,都是處理多個樣本秉继,這里也有人士給出:

I wouldn't recommend using cellranger aggr as it will downsample everything to a similar number of counts as the sample with the lowest counts, and so potentially will throw out a lot of data. You could quantify each dataset (using cellranger, or another tool like Alevin) and first merge them in Seurat and check if there are batch differences between the datasets. If there are, you could run the integration methods

Deploying Shiny apps using Seurat library to shinyapps.io #2716

github 地址: https://github.com/satijalab/seurat/issues/2716

這是一類問題屬于對Seurat的擴展祈噪,這里是想把Seurat擴展到Shiny程序中,也就是給他界面化尚辑。這要求再開發(fā)者不單要對Seurat的函數(shù)和數(shù)據(jù)對象及其依賴的R包和環(huán)境有所了解辑鲤,還要求他要懂得Shiny的語法和結(jié)構(gòu)。其實這個問題下的討論多是想要彌補Seurat與Shiny的界限杠茬。

好了月褥,今天就到這里,感謝大家的陪伴瓢喉。對了宁赤,你在github上面提問了沒?

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
  • 序言:七十年代末栓票,一起剝皮案震驚了整個濱河市决左,隨后出現(xiàn)的幾起案子,更是在濱河造成了極大的恐慌走贪,老刑警劉巖佛猛,帶你破解...
    沈念sama閱讀 206,013評論 6 481
  • 序言:濱河連續(xù)發(fā)生了三起死亡事件,死亡現(xiàn)場離奇詭異坠狡,居然都是意外死亡继找,警方通過查閱死者的電腦和手機,發(fā)現(xiàn)死者居然都...
    沈念sama閱讀 88,205評論 2 382
  • 文/潘曉璐 我一進店門逃沿,熙熙樓的掌柜王于貴愁眉苦臉地迎上來婴渡,“玉大人,你說我怎么就攤上這事凯亮”呔剩” “怎么了?”我有些...
    開封第一講書人閱讀 152,370評論 0 342
  • 文/不壞的土叔 我叫張陵触幼,是天一觀的道長。 經(jīng)常有香客問我究飞,道長置谦,這世上最難降的妖魔是什么? 我笑而不...
    開封第一講書人閱讀 55,168評論 1 278
  • 正文 為了忘掉前任亿傅,我火速辦了婚禮媒峡,結(jié)果婚禮上,老公的妹妹穿的比我還像新娘葵擎。我一直安慰自己谅阿,他們只是感情好,可當(dāng)我...
    茶點故事閱讀 64,153評論 5 371
  • 文/花漫 我一把揭開白布。 她就那樣靜靜地躺著签餐,像睡著了一般寓涨。 火紅的嫁衣襯著肌膚如雪。 梳的紋絲不亂的頭發(fā)上氯檐,一...
    開封第一講書人閱讀 48,954評論 1 283
  • 那天戒良,我揣著相機與錄音,去河邊找鬼冠摄。 笑死糯崎,一個胖子當(dāng)著我的面吹牛,可吹牛的內(nèi)容都是我干的河泳。 我是一名探鬼主播沃呢,決...
    沈念sama閱讀 38,271評論 3 399
  • 文/蒼蘭香墨 我猛地睜開眼,長吁一口氣:“原來是場噩夢啊……” “哼拆挥!你這毒婦竟也來了薄霜?” 一聲冷哼從身側(cè)響起,我...
    開封第一講書人閱讀 36,916評論 0 259
  • 序言:老撾萬榮一對情侶失蹤竿刁,失蹤者是張志新(化名)和其女友劉穎黄锤,沒想到半個月后,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體食拜,經(jīng)...
    沈念sama閱讀 43,382評論 1 300
  • 正文 獨居荒郊野嶺守林人離奇死亡鸵熟,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
    茶點故事閱讀 35,877評論 2 323
  • 正文 我和宋清朗相戀三年,在試婚紗的時候發(fā)現(xiàn)自己被綠了负甸。 大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片流强。...
    茶點故事閱讀 37,989評論 1 333
  • 序言:一個原本活蹦亂跳的男人離奇死亡,死狀恐怖呻待,靈堂內(nèi)的尸體忽然破棺而出打月,到底是詐尸還是另有隱情,我是刑警寧澤蚕捉,帶...
    沈念sama閱讀 33,624評論 4 322
  • 正文 年R本政府宣布奏篙,位于F島的核電站,受9級特大地震影響迫淹,放射性物質(zhì)發(fā)生泄漏秘通。R本人自食惡果不足惜,卻給世界環(huán)境...
    茶點故事閱讀 39,209評論 3 307
  • 文/蒙蒙 一敛熬、第九天 我趴在偏房一處隱蔽的房頂上張望肺稀。 院中可真熱鬧,春花似錦应民、人聲如沸话原。這莊子的主人今日做“春日...
    開封第一講書人閱讀 30,199評論 0 19
  • 文/蒼蘭香墨 我抬頭看了看天上的太陽繁仁。三九已至涉馅,卻和暖如春,著一層夾襖步出監(jiān)牢的瞬間改备,已是汗流浹背控漠。 一陣腳步聲響...
    開封第一講書人閱讀 31,418評論 1 260
  • 我被黑心中介騙來泰國打工, 沒想到剛下飛機就差點兒被人妖公主榨干…… 1. 我叫王不留悬钳,地道東北人盐捷。 一個月前我還...
    沈念sama閱讀 45,401評論 2 352
  • 正文 我出身青樓,卻偏偏與公主長得像默勾,于是被迫代替她去往敵國和親碉渡。 傳聞我的和親對象是個殘疾皇子,可洞房花燭夜當(dāng)晚...
    茶點故事閱讀 42,700評論 2 345