【塊】Cibersort.R計(jì)算22種免疫細(xì)胞浸潤(rùn)分?jǐn)?shù)

綜合簡(jiǎn)書作者們的文章以及自己操作，綜合整理如下宦棺，以備后面使用契讲。
有問題歡迎交流
Robust enumeration of cell subsets from tissue expression profiles | Nature Methods
以上為Cibersort的文章
使用Cibersort工具需要三個(gè)文件：
1、Cibersort.R
2冠场、LM22.txt
3秆剪、genes_exp.txt

1赊淑、Cibersort.R

此文件為源代碼，在使用之前請(qǐng)閱讀一下代碼中的注釋段仅讽，安裝一下前置包
在R中創(chuàng)建script陶缺，復(fù)制以下代碼保存為 “Cibersort.R”。
提示：不需要去理解代碼洁灵，直接復(fù)制粘貼饱岸，運(yùn)行就ok了。對(duì)代碼感興趣的話當(dāng)我沒說徽千。

#' CIBERSORT R script v1.03 (last updated 07-10-2015)
#' Note: Signature matrix construction is not currently available; use java version for full functionality.
#' Author: Aaron M. Newman, Stanford University (amnewman@stanford.edu)
#' Requirements:
#'       R v3.0 or later. (dependencies below might not work properly with earlier versions)
#'       install.packages('e1071')
#'       install.pacakges('parallel')
#'       install.packages('preprocessCore')
#'       if preprocessCore is not available in the repositories you have selected, run the following:
#'           source("http://bioconductor.org/biocLite.R")
#'           biocLite("preprocessCore")
#' Windows users using the R GUI may need to Run as Administrator to install or update packages.
#' This script uses 3 parallel processes.  Since Windows does not support forking, this script will run
#' single-threaded in Windows.
#'
#' Usage:
#'       Navigate to directory containing R script
#'
#'   In R:
#'       source('CIBERSORT.R')
#'       results <- CIBERSORT('sig_matrix_file.txt','mixture_file.txt', perm, QN)
#'
#'       Options:
#'       i)  perm = No. permutations; set to >=100 to calculate p-values (default = 0)
#'       ii) QN = Quantile normalization of input mixture (default = TRUE)
#'
#' Input: signature matrix and mixture file, formatted as specified at http://cibersort.stanford.edu/tutorial.php
#' Output: matrix object containing all results and tabular data written to disk 'CIBERSORT-Results.txt'
#' License: http://cibersort.stanford.edu/CIBERSORT_License.txt
#' Core algorithm
#' @param X cell-specific gene expression
#' @param y mixed expression per sample
#' @export
CoreAlg <- function(X, y){
  
  #try different values of nu
  svn_itor <- 3
  
  res <- function(i){
    if(i==1){nus <- 0.25}
    if(i==2){nus <- 0.5}
    if(i==3){nus <- 0.75}
    model<-e1071::svm(X,y,type="nu-regression",kernel="linear",nu=nus,scale=F)
    model
  }
  
  if(Sys.info()['sysname'] == 'Windows') out <- parallel::mclapply(1:svn_itor, res, mc.cores=1) else
    out <- parallel::mclapply(1:svn_itor, res, mc.cores=svn_itor)
  
  nusvm <- rep(0,svn_itor)
  corrv <- rep(0,svn_itor)
  
  #do cibersort
  t <- 1
  while(t <= svn_itor) {
    weights = t(out[[t]]$coefs) %*% out[[t]]$SV
    weights[which(weights<0)]<-0
    w<-weights/sum(weights)
    u <- sweep(X,MARGIN=2,w,'*')
    k <- apply(u, 1, sum)
    nusvm[t] <- sqrt((mean((k - y)^2)))
    corrv[t] <- cor(k, y)
    t <- t + 1
  }
  
  #pick best model
  rmses <- nusvm
  mn <- which.min(rmses)
  model <- out[[mn]]
  
  #get and normalize coefficients
  q <- t(model$coefs) %*% model$SV
  q[which(q<0)]<-0
  w <- (q/sum(q))
  
  mix_rmse <- rmses[mn]
  mix_r <- corrv[mn]
  
  newList <- list("w" = w, "mix_rmse" = mix_rmse, "mix_r" = mix_r)
  
}

#' do permutations
#' @param perm Number of permutations
#' @param X cell-specific gene expression
#' @param y mixed expression per sample
#' @export
doPerm <- function(perm, X, Y){
  itor <- 1
  Ylist <- as.list(data.matrix(Y))
  dist <- matrix()
  
  while(itor <= perm){
    #print(itor)
    
    #random mixture
    yr <- as.numeric(Ylist[sample(length(Ylist),dim(X)[1])])
    
    #standardize mixture
    yr <- (yr - mean(yr)) / sd(yr)
    
    #run CIBERSORT core algorithm
    result <- CoreAlg(X, yr)
    
    mix_r <- result$mix_r
    
    #store correlation
    if(itor == 1) {dist <- mix_r}
    else {dist <- rbind(dist, mix_r)}
    
    itor <- itor + 1
  }
  newList <- list("dist" = dist)
}

#' Main functions
#' @param sig_matrix file path to gene expression from isolated cells
#' @param mixture_file heterogenous mixed expression
#' @param perm Number of permutations
#' @param QN Perform quantile normalization or not (TRUE/FALSE)
#' @export
CIBERSORT <- function(sig_matrix, mixture_file, perm=0, QN=TRUE){
  
  #read in data
  X <- read.table(sig_matrix,header=T,sep="\t",row.names=1,check.names=F)
  Y <- read.table(mixture_file, header=T, sep="\t", row.names=1,check.names=F)
  
  X <- data.matrix(X)
  Y <- data.matrix(Y)
  
  #order
  X <- X[order(rownames(X)),]
  Y <- Y[order(rownames(Y)),]
  
  P <- perm #number of permutations
  
  #anti-log if max < 50 in mixture file
  if(max(Y) < 50) {Y <- 2^Y}
  
  #quantile normalization of mixture file
  if(QN == TRUE){
    tmpc <- colnames(Y)
    tmpr <- rownames(Y)
    Y <- preprocessCore::normalize.quantiles(Y)
    colnames(Y) <- tmpc
    rownames(Y) <- tmpr
  }
  
  #intersect genes
  Xgns <- row.names(X)
  Ygns <- row.names(Y)
  YintX <- Ygns %in% Xgns
  Y <- Y[YintX,]
  XintY <- Xgns %in% row.names(Y)
  X <- X[XintY,]
  
  #standardize sig matrix
  X <- (X - mean(X)) / sd(as.vector(X))
  
  #empirical null distribution of correlation coefficients
  if(P > 0) {nulldist <- sort(doPerm(P, X, Y)$dist)}
  
  #print(nulldist)
  
  header <- c('Mixture',colnames(X),"P-value","Correlation","RMSE")
  #print(header)
  
  output <- matrix()
  itor <- 1
  mixtures <- dim(Y)[2]
  pval <- 9999
  
  #iterate through mixtures
  while(itor <= mixtures){
    
    y <- Y[,itor]
    
    #standardize mixture
    y <- (y - mean(y)) / sd(y)
    
    #run SVR core algorithm
    result <- CoreAlg(X, y)
    
    #get results
    w <- result$w
    mix_r <- result$mix_r
    mix_rmse <- result$mix_rmse
    
    #calculate p-value
    if(P > 0) {pval <- 1 - (which.min(abs(nulldist - mix_r)) / length(nulldist))}
    
    #print output
    out <- c(colnames(Y)[itor],w,pval,mix_r,mix_rmse)
    if(itor == 1) {output <- out}
    else {output <- rbind(output, out)}
    
    itor <- itor + 1
    
  }
  
  #save results
  write.table(rbind(header,output), file="CIBERSORT-Results.txt", sep="\t", row.names=F, col.names=F, quote=F)
  
  #return matrix object containing all results
  obj <- rbind(header,output)
  obj <- obj[,-1]
  obj <- obj[-1,]
  obj <- matrix(as.numeric(unlist(obj)),nrow=nrow(obj))
  rownames(obj) <- colnames(Y)
  colnames(obj) <- c(colnames(X),"P-value","Correlation","RMSE")
  obj
}

2苫费、LM22.txt

此文件為22種免疫細(xì)胞的標(biāo)志基因表達(dá)量，是衡量細(xì)胞含量的標(biāo)準(zhǔn)双抽。
去Cibersort的文章里下載Supplementry table 1百框，下載后打開如下：

Supplementry table 1

只選取如下含有數(shù)據(jù)的部分（其他部分自行探索），如下：

所需數(shù)據(jù)

復(fù)制粘貼為txt文件牍汹，注意籃圈標(biāo)記部分铐维，后面自己文檔的基因列名要與此保持一致柬泽。如下圖：

txt文件

3、gene_exp.txt

此文件是自己的數(shù)據(jù)嫁蛇，在R中處理時(shí)锨并，導(dǎo)出為“sep=\t”的“.txt”文件，需要注意的地方主要有幾點(diǎn)：
1.基因名不能有重復(fù)
2.整個(gè)矩陣不能有空值
3.基因的列名和LM22文件保持一致
4.數(shù)據(jù)格式要和LM22保持一致睬棚，fpkm/tpm不要log處理
我自己的數(shù)據(jù)格式第煮，如下：

我的數(shù)據(jù)

如果有報(bào)錯(cuò)，就把這兩個(gè)表復(fù)制到excel上抑党，去檢查一下數(shù)據(jù)與我這個(gè)Excel文件有什么區(qū)別包警，還有LM22是不是有問題。

轉(zhuǎn)成Excel后

我犯過的問題就有：兩個(gè)表基因列的列名不一致新荤；LM22文件范圍沒選對(duì)揽趾；基因名有重復(fù)和空值出現(xiàn)
這個(gè)腳本報(bào)錯(cuò)信息不詳細(xì)台汇，遇到問題來這里看看苛骨，自己核對(duì)一下

4、最終步驟

將三個(gè)文件放到一個(gè)文件夾苟呐，然后將R當(dāng)前工作目錄轉(zhuǎn)到那個(gè)文件夾（setwd函數(shù)）之后直接輸入以下代碼痒芝，運(yùn)行Cibersort.R，然后等待一段不短的時(shí)間牵素，會(huì)自動(dòng)生成結(jié)果文件”"CIBERSORT-Results.txt"“严衬。如果需要處理多組數(shù)據(jù)，要及時(shí)對(duì)結(jié)果文件重命名笆呆，否則會(huì)重寫為新的分析結(jié)果请琳。

setwd("")
source("Cibersort.R")
result1 <- CIBERSORT("LM22.txt", "genes_exp.txt", perm = 1000, QN = T)
# perm置換次數(shù)=1000，QN分位數(shù)歸一化=TRUE
# 文件名可以自定義

關(guān)于數(shù)據(jù)格式：

fragments per kilobase per million (FPKM) and transcripts per kilobase million (TPM), are suitable for use with CIBERSORT—《Profiling Tumor Infiltrating Immune Cells with CIBERSORT》

最后編輯于：2021.07.29 22:12:19

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者

人面猴
序言：七十年代末赠幕，一起剝皮案震驚了整個(gè)濱河市俄精，隨后出現(xiàn)的幾起案子，更是在濱河造成了極大的恐慌榕堰，老刑警劉巖竖慧，帶你破解...
沈念sama閱讀 206,839評(píng)論 6贊 482
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件，死亡現(xiàn)場(chǎng)離奇詭異逆屡，居然都是意外死亡圾旨，警方通過查閱死者的電腦和手機(jī)，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 88,543評(píng)論 2贊 382
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進(jìn)店門魏蔗，熙熙樓的掌柜王于貴愁眉苦臉地迎上來砍的，“玉大人，你說我怎么就攤上這事莺治±希” “怎么了味混？”我有些...
開封第一講書人閱讀 153,116評(píng)論 0贊 344
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵，是天一觀的道長(zhǎng)诫惭。經(jīng)常有香客問我翁锡，道長(zhǎng)，這世上最難降的妖魔是什么夕土？我笑而不...
開封第一講書人閱讀 55,371評(píng)論 1贊 279
?港島之戀（遺憾婚禮）
正文為了忘掉前任馆衔，我火速辦了婚禮，結(jié)果婚禮上怨绣，老公的妹妹穿的比我還像新娘角溃。我一直安慰自己，他們只是感情好篮撑，可當(dāng)我...
茶點(diǎn)故事閱讀 64,384評(píng)論 5贊 374
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布减细。她就那樣靜靜地躺著，像睡著了一般赢笨。火紅的嫁衣襯著肌膚如雪未蝌。梳的紋絲不亂的頭發(fā)上，一...
開封第一講書人閱讀 49,111評(píng)論 1贊 285
城市分裂傳說
那天茧妒，我揣著相機(jī)與錄音萧吠，去河邊找鬼。笑死桐筏，一個(gè)胖子當(dāng)著我的面吹牛纸型，可吹牛的內(nèi)容都是我干的。我是一名探鬼主播梅忌，決...
沈念sama閱讀 38,416評(píng)論 3贊 400
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼狰腌，長(zhǎng)吁一口氣：“原來是場(chǎng)噩夢(mèng)啊……” “哼！你這毒婦竟也來了牧氮？” 一聲冷哼從身側(cè)響起琼腔，我...
開封第一講書人閱讀 37,053評(píng)論 0贊 259
萬榮殺人案實(shí)錄
序言：老撾萬榮一對(duì)情侶失蹤，失蹤者是張志新（化名）和其女友劉穎蹋笼，沒想到半個(gè)月后展姐，有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體，經(jīng)...
沈念sama閱讀 43,558評(píng)論 1贊 300
?護(hù)林員之死
正文獨(dú)居荒郊野嶺守林人離奇死亡剖毯，尸身上長(zhǎng)有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點(diǎn)故事閱讀 36,007評(píng)論 2贊 325
?白月光啟示錄
正文我和宋清朗相戀三年圾笨，在試婚紗的時(shí)候發(fā)現(xiàn)自己被綠了。大學(xué)時(shí)的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片逊谋。...
茶點(diǎn)故事閱讀 38,117評(píng)論 1贊 334
活死人
序言：一個(gè)原本活蹦亂跳的男人離奇死亡擂达，死狀恐怖，靈堂內(nèi)的尸體忽然破棺而出胶滋，到底是詐尸還是另有隱情板鬓，我是刑警寧澤悲敷，帶...
沈念sama閱讀 33,756評(píng)論 4贊 324
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布，位于F島的核電站俭令，受9級(jí)特大地震影響后德，放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜抄腔，卻給世界環(huán)境...
茶點(diǎn)故事閱讀 39,324評(píng)論 3贊 307
男人毒藥：我在死后第九天來索命
文/蒙蒙一瓢湃、第九天我趴在偏房一處隱蔽的房頂上張望。院中可真熱鬧赫蛇，春花似錦绵患、人聲如沸。這莊子的主人今日做“春日...
開封第一講書人閱讀 30,315評(píng)論 0贊 19
一樁弒父案落蝙，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽。三九已至暂幼，卻和暖如春筏勒，著一層夾襖步出監(jiān)牢的瞬間，已是汗流浹背粟誓。一陣腳步聲響...
開封第一講書人閱讀 31,539評(píng)論 1贊 262
情欲美人皮
我被黑心中介騙來泰國(guó)打工奏寨，沒想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留，地道東北人鹰服。一個(gè)月前我還...
沈念sama閱讀 45,578評(píng)論 2贊 355
代替公主和親
正文我出身青樓，卻偏偏與公主長(zhǎng)得像揽咕，于是被迫代替她去往敵國(guó)和親悲酷。傳聞我的和親對(duì)象是個(gè)殘疾皇子，可洞房花燭夜當(dāng)晚...
茶點(diǎn)故事閱讀 42,877評(píng)論 2贊 345