1. First, look up the PI3K-AKT signaling pathway gene set in the KEGG database
For a detailed walkthrough, see the post on how to get all gene names in the KEGG pathway hsa04650 Natural killer cell mediated cytotoxicity.
library(KEGGREST)
listDatabases() # list the databases KEGGREST can reach; these names can be used in further queries
org <- keggList("organism")
head(org)
gs<-keggGet('hsa04151')
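names(gs[[1]]) # field names of the entry (found in the package manual)
# GENE alternates between an Entrez ID and a "SYMBOL; description" string
genes <- unlist(lapply(gs[[1]]$GENE, function(x) strsplit(x, ';')[[1]][1]))
kegggenes <- genes[seq_along(genes) %% 2 == 1] # keep the odd positions, i.e. the Entrez IDs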
kegggenes
png <- keggGet("hsa04151", "image")
t <- tempfile()
library(png)
writePNG(png, t)
if (interactive()) browseURL(t)
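The GENE parsing step above is easier to follow on a small example. Below is a minimal sketch in base R that needs no network access. It assumes the GENE field alternates between an Entrez ID and a "SYMBOL; description" string, as in the code above; `gene_field` is a hand-written stand-in for `gs[[1]]$GENE`, not real KEGGREST output.

```r
# Toy stand-in for gs[[1]]$GENE (assumed layout: ID, "SYMBOL; description", ID, ...)
gene_field <- c(
  "5290", "PIK3CA; phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha",
  "207",  "AKT1; AKT serine/threonine kinase 1",
  "5728", "PTEN; phosphatase and tensin homolog"
)

# Odd positions hold the Entrez IDs, even positions hold the descriptions
is_id <- seq_along(gene_field) %% 2 == 1
entrez_ids <- gene_field[is_id]

# Keep only the text before the first ';' as the gene symbol
symbols <- sub(";.*$", "", gene_field[!is_id])

# Named vector: Entrez ID -> symbol
id2symbol <- setNames(symbols, entrez_ids)
print(id2symbol)
```

Indexing with a logical mask keeps the IDs and the symbols aligned, so the symbol comes along for free and no second lookup is needed.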
2. Next, look up the PI3K-AKT signaling pathway gene set in the Reactome database
Reactome database URL:
https://reactome.org/documentation
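Searching for pi3k/akt returns the following:
Six pathways turn out to be related to PI3K/AKT. I picked three of them, 198203/199418/2219528, and extracted their genes with the reactome.db package.
## The install pulls in the annotation package; at 615.9MB it is a big one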
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("reactome.db")
library(reactome.db)
ls("package:reactome.db")
keytypes(reactome.db)
# list the column names available in this object
columns(reactome.db)
# read the values of one specific key type directly
keys(reactome.db, keytype = "PATHNAME")
# finally, use keys to query this annotation database
AnnotationDbi::select(reactome.db, keys = c("6794"), columns = c("PATHID", "PATHNAME"), keytype = "ENTREZID") ## look up the pathways a single gene belongs to
path2ext <- as.list(reactomePATHID2EXTID) # build the pathway-to-gene list once
a <- path2ext[["R-HSA-198203"]]
b <- path2ext[["R-HSA-199418"]]
d <- path2ext[["R-HSA-2219528"]] # named d so it does not mask base::c()
reagenes <- union(c(a, b), d) ## take the union
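The extraction step above pulls each pathway's Entrez IDs out of a pathway-to-gene list and takes their union. Here is a minimal sketch of the same idea that also reports pathway IDs missing from the list. Everything in it is hypothetical: `toy_path2ext`, the `PATH_*` names and the gene memberships are made up for illustration and are not Reactome data.

```r
# Hypothetical stand-in for as.list(reactomePATHID2EXTID); memberships are invented
toy_path2ext <- list(
  PATH_A = c("207", "5290"),
  PATH_B = c("5290", "5728"),
  PATH_C = c("2475")
)

wanted <- c("PATH_A", "PATH_B", "PATH_X")  # PATH_X is deliberately absent

# Split the request into IDs we can resolve and IDs we cannot
found       <- intersect(wanted, names(toy_path2ext))
missing_ids <- setdiff(wanted, names(toy_path2ext))
if (length(missing_ids) > 0) {
  message("Not in the mapping: ", paste(missing_ids, collapse = ", "))
}

# Union of all genes across the resolved pathways
genes_union <- unique(unlist(toy_path2ext[found], use.names = FALSE))
print(genes_union)
```

Checking for missing IDs first matters because `$` and `[[` return NULL for an unknown name, and `union()` accepts NULL without complaint, so a mistyped pathway ID would otherwise shrink the gene set silently.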
3. Check the intersection
intersect(kegggenes, reagenes)
##[1] "1950" "2069" "2246" "2247" "2248" "2249" "8822" "2251" "2252" "2253" "2254" "2255"
##[13] "8823" "2250" "8817" "26281" "27006" "9965" "8074" "4803" "3630" "5154" "5155" "4254"
##[25] "3082" "1956" "2064" "2065" "2066" "2260" "2263" "2261" "2264" "4914" "3643" "5156"
##[37] "5159" "3815" "4233" "2885" "5594" "5595" "3667" "5879" "930" "118788" "5290" "5293"
##[49] "5291" "5295" "5296" "8503" "5170" "7249" "64223" "2475" "6199" "207" "208" "10000"
##[61] "5728" "117145" "5515" "5516" "5519" "5518" "5526" "5527" "5528" "5529" "5525" "23239"
##[73] "23035" "2932" "1026" "1027" "2309" "572" "842" "1385" "3164" "1147" "4193"
setdiff(kegggenes, reagenes) ## genes found only in the KEGG set
setdiff(reagenes, kegggenes) ## genes found only in the Reactome set
##[1] "387" "8660" "10718" "10818" "145957" "152831" "1839" "2099" "2100" "23396" "2534" "2549"
##[13] "29851" "3084" "3556" "3654" "391" "3932" "4615" "50852" "51135" "5305" "57761" "5781"
##[25] "5880" "6714" "685" "7189" "7409" "79837" "8394" "8395" "8396" "8870" "90865" "9173"
##[37] "9365" "940" "941" "942" "9542" "2308" "253260" "2931" "4303" "55615" "79109" "84335"
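The three set operations above can be wrapped into one small helper so the overlap is summarised in a single call. This is only a sketch: `compare_sets` is a hypothetical helper name, and the two toy vectors are arbitrary Entrez IDs whose membership is made up, not the real KEGG or Reactome sets. The Jaccard index is my own addition as a one-number overlap summary.

```r
# Hypothetical helper: summarise how two ID vectors overlap
compare_sets <- function(x, y) {
  x <- unique(x)
  y <- unique(y)
  shared <- intersect(x, y)
  list(
    shared  = shared,
    only_x  = setdiff(x, y),
    only_y  = setdiff(y, x),
    # Jaccard index: shared / total distinct IDs
    jaccard = length(shared) / length(union(x, y))
  )
}

# Toy inputs; which ID sits in which set is invented for illustration
kegg_toy     <- c("207", "208", "5290", "5728", "1950")
reactome_toy <- c("207", "5290", "5728", "387", "2099")

res <- compare_sets(kegg_toy, reactome_toy)
str(res)
```

With the real data this would be called as `compare_sets(kegggenes, reagenes)`.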
Gene ID conversion
library( "clusterProfiler" )
library( "org.Hs.eg.db" )
df <- bitr( intersect(kegggenes, reagenes), fromType = "ENTREZID", toType = c( "SYMBOL" ), OrgDb = org.Hs.eg.db )
head( df )
## ENTREZID SYMBOL
## 1 1950 EGF
## 2 2069 EREG
## 3 2246 FGF1
## 4 2247 FGF2
## 5 2248 FGF3
## 6 2249 FGF4
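To make clear what the conversion does, here is a base R sketch of the same lookup on a tiny hand-made table. The three ID/symbol pairs come from the output above. The ID "999999999" is made up to show an unmapped input. As far as I know, `bitr` drops unmapped IDs by default and warns about them; treat that as an assumption and check it against your installed version.

```r
# Small lookup table; the pairs are taken from the bitr output shown above
lookup <- data.frame(
  ENTREZID = c("1950", "2069", "2246"),
  SYMBOL   = c("EGF", "EREG", "FGF1"),
  stringsAsFactors = FALSE
)

ids <- c("1950", "2246", "999999999")  # the last ID is invented and has no match

# match() returns NA for IDs that are absent from the table
converted <- data.frame(
  ENTREZID = ids,
  SYMBOL   = lookup$SYMBOL[match(ids, lookup$ENTREZID)],
  stringsAsFactors = FALSE
)

# Report and drop the IDs that could not be mapped
unmapped  <- converted$ENTREZID[is.na(converted$SYMBOL)]
converted <- converted[!is.na(converted$SYMBOL), ]
print(converted)
print(unmapped)
```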
From the above, the KEGG PI3K-AKT signaling pathway gene set contains somewhat more genes, whereas the Reactome PI3K-AKT gene set is already split into separate signaling pathways and is therefore more specific in terms of function.