1.Start-GEO

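The notes below cover switching mirrors before downloading and installing packages, which led me to look up the difference between HTTP and HTTPS.

I. Basic concepts of HTTP and HTTPS

HTTP: the most widely used network protocol on the Internet. It is a request/response standard between client and server (carried over TCP) and is used to transfer hypertext from WWW servers to the local browser. It is meant to make browsing more efficient and to reduce network traffic.

HTTPS: an HTTP channel designed for security. Put simply, it is the secure version of HTTP, with an SSL layer added beneath HTTP. The security of HTTPS rests on SSL (today its successor, TLS), so the encryption details are handled by SSL.

HTTPS serves two main purposes: it establishes a secure channel that protects data in transit, and it verifies the authenticity of the website.

II. What is the difference between HTTP and HTTPS?

Data sent over HTTP is unencrypted, i.e. plaintext, so transmitting private information over HTTP is very insecure. To allow such data to be encrypted in transit, Netscape designed the SSL (Secure Sockets Layer) protocol to encrypt the data carried by HTTP, and HTTPS was born.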
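As a small illustration of why this matters when switching mirrors, the sketch below checks that the mirror URLs used later in these notes start with the https scheme. The variable names mirrors and uses_https are made up for this example; it only inspects the URL strings and does not contact any server.

```r
# Mirror URLs taken from the download code later in these notes
mirrors <- c(
  CRAN = "https://mirrors.ustc.edu.cn/CRAN/",
  BioC = "https://mirrors.ustc.edu.cn/bioc/"
)

# TRUE for each URL whose scheme is https (string check only, no network access)
uses_https <- vapply(mirrors, function(u) startsWith(u, "https://"), logical(1))
print(uses_https)

# Stop early if any mirror would be contacted over plain HTTP
stopifnot(all(uses_https))
```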

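A brief backstory of GEOquery and getGEO: getGEO comes from the GEOquery package, and GEOquery in turn builds on the Biobase package. It pays to learn how to read the help page (?getGEO), including its Description, Usage, and Arguments sections.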

Get a GEO object from NCBI or file

Description

This function is the main user-level function in the GEOquery package. It directs the download (if no filename is specified) and parsing of a GEO SOFT format file into an R data structure specifically designed to make access to each of the important parts of the GEO SOFT format easily accessible.

Usage

getGEO(GEO = NULL, filename = NULL, destdir = tempdir(),
       GSElimits = NULL, GSEMatrix = TRUE, AnnotGPL = FALSE, getGPL = TRUE,
       parseCharacteristics = TRUE)

Arguments

GEO: A character string representing a GEO object for download and parsing. (e.g., 'GDS505','GSE2','GSM2','GPL96')
filename: The filename of a previously downloaded GEO SOFT format file or its gzipped representation (in which case the filename must end in .gz). Either one of GEO or filename may be specified, not both. GEO series matrix files are also handled. Note that since a single file is being parsed, the return value is not a list of esets, but a single eset when GSE matrix files are parsed.
destdir: The destination directory for any downloads. Defaults to the architecture-dependent tempdir. You may want to specify a different directory if you want to save the file for later use. Doing so is a good idea if you have a slow connection, as some of the GEO files are HUGE!
GSElimits: This argument can be used to load only a contiguous subset of the GSMs from a GSE. It should be specified as a vector of length 2 specifying the start and end (inclusive) GSMs to load. This could be useful for splitting up large GSEs into more manageable parts, for example.
GSEMatrix: A boolean telling GEOquery whether or not to use GSE Series Matrix files from GEO. The parsing of these files can be many orders-of-magnitude faster than parsing the GSE SOFT format files. Defaults to TRUE, meaning that the SOFT format parsing will not occur; set to FALSE if you for some reason need other columns from the GSE records.
AnnotGPL: A boolean defaulting to FALSE as to whether or not to use the Annotation GPL information. These files are nice to use because they contain up-to-date information remapped from Entrez Gene on a regular basis. However, they do not exist for all GPLs; in general, they are only available for GPLs referenced by a GDS.
getGPL: A boolean defaulting to TRUE as to whether or not to download and include GPL information when getting a GSEMatrix file. You may want to set this to FALSE if you know that you are going to annotate your featureData using Bioconductor tools rather than relying on information provided through NCBI GEO. Download times can also be greatly reduced by specifying FALSE.
parseCharacteristics: A boolean defaulting to TRUE as to whether or not to parse the characteristics information (if available) for a GSE Matrix file. Set this to FALSE if you experience trouble while parsing the characteristics.

Details

getGEO functions to download and parse information available from NCBI GEO (http://www.ncbi.nlm.nih.gov/geo). Here are some details about what is available from GEO. All entity types are handled by getGEO and essentially any information in the GEO SOFT format is reflected in the resulting data structure.

From the GEO website:

The Gene Expression Omnibus (GEO) from NCBI serves as a public repository for a wide range of high-throughput experimental data. These data include single and dual channel microarray-based experiments measuring mRNA, genomic DNA, and protein abundance, as well as non-array techniques such as serial analysis of gene expression (SAGE), and mass spectrometry proteomic data. At the most basic level of organization of GEO, there are three entity types that may be supplied by users: Platforms, Samples, and Series. Additionally, there is a curated entity called a GEO dataset.

A Platform record describes the list of elements on the array (e.g., cDNAs, oligonucleotide probesets, ORFs, antibodies) or the list of elements that may be detected and quantified in that experiment (e.g., SAGE tags, peptides). Each Platform record is assigned a unique and stable GEO accession number (GPLxxx). A Platform may reference many Samples that have been submitted by multiple submitters.

A Sample record describes the conditions under which an individual Sample was handled, the manipulations it underwent, and the abundance measurement of each element derived from it. Each Sample record is assigned a unique and stable GEO accession number (GSMxxx). A Sample entity must reference only one Platform and may be included in multiple Series.

A Series record defines a set of related Samples considered to be part of a group, how the Samples are related, and if and how they are ordered. A Series provides a focal point and description of the experiment as a whole. Series records may also contain tables describing extracted data, summary conclusions, or analyses. Each Series record is assigned a unique and stable GEO accession number (GSExxx).

GEO DataSets (GDSxxx) are curated sets of GEO Sample data. A GDS record represents a collection of biologically and statistically comparable GEO Samples and forms the basis of GEO's suite of data display and analysis tools. Samples within a GDS refer to the same Platform, that is, they share a common set of probe elements. Value measurements for each Sample within a GDS are assumed to be calculated in an equivalent manner, that is, considerations such as background processing and normalization are consistent across the dataset. Information reflecting experimental design is provided through GDS subsets.

Value

An object of the appropriate class (GDS, GPL, GSM, or GSE) is returned. If the GSEMatrix option is used, then a list of ExpressionSet objects is returned, one for each SeriesMatrix file associated with the GSE accession. If the filename argument is used in combination with a GSEMatrix file, then the return value is a single ExpressionSet.
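(Note added to these study notes, not part of the help page.) Because the return value is sometimes a list of ExpressionSet objects and sometimes a single ExpressionSet, a small wrapper can make downstream code uniform. The function first_eset below is a hypothetical helper written for these notes, not something GEOquery provides; it is only a sketch of one way to handle both cases.

```r
# Hypothetical helper (NOT part of GEOquery): return one object whether the
# input is already a single ExpressionSet or a list, as getGEO() returns
# for a GSE accession when GSEMatrix = TRUE.
first_eset <- function(x) {
  if (inherits(x, "ExpressionSet")) {
    return(x)            # already a single ExpressionSet
  }
  if (is.list(x) && length(x) >= 1) {
    return(x[[1]])       # take the first element of the list
  }
  stop("expected an ExpressionSet or a non-empty list")
}
```

With an object from getGEO, this would be called as first_eset(gse) in place of gse[[1]].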

Warning

Some of the files that are downloaded, particularly those associated with GSE entries from GEO are absolutely ENORMOUS and parsing them can take quite some time and memory. So, particularly when working with large GSE entries, expect that you may need a good chunk of memory and that coffee may be involved when parsing....

Author(s)

Sean Davis

See Also

getGEOfile

Examples

gds <- getGEO("GDS10")
gds

gse <- getGEO('GSE10')
# Returns a list, so look at first item

gse[[1]]
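# How I download the GEO data
rm(list = ls())  # clear the workspace
options(repos = c(CRAN = "https://mirrors.ustc.edu.cn/CRAN/"))
options(BioC_mirror = "https://mirrors.ustc.edu.cn/bioc/")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# update/ask are arguments of BiocManager::install(), not of install.packages()
if (!requireNamespace("GEOquery", quietly = TRUE)) BiocManager::install("GEOquery", update = FALSE, ask = FALSE)
library(GEOquery)
f <- 'GSE54839.Rdata'
# getGPL = TRUE fetches the platform annotation, but the download is much slower,
# and the annotation file format is usually less convenient than the Bioconductor packages
if (!file.exists(f)) {
  gset <- getGEO('GSE54839', destdir = '.',
                 AnnotGPL = FALSE,
                 getGPL = FALSE)
  save(gset, file = f)
}
# Data extraction
load(f)
class(gset)
# [1] "list"
# Take the first element of the list
gset[[1]]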
[Screenshot: console output of gset[[1]]]

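At this point we need to understand what Biobase and ExpressionSet are. You can run ?ExpressionSet, or look up the Biobase package on the Bioconductor website. The screenshot below marks (with arrows) several very helpful sections of the Bioconductor site; under common workflows you can find HTML documents for each mainstream analysis, and following the steps reproduces the figures.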

[Two screenshots]
The article's title begins with "Orchestrating".

In fact, you can also see this when running library(GEOquery), as the screenshot below shows.

[Screenshot]

Now we know that the downloaded data is an ExpressionSet. Bioconductor provides an official document explaining ExpressionSet at: https://bioconductor.org/packages/release/bioc/vignettes/Biobase/inst/doc/ExpressionSetIntroduction.pdf

[Screenshot of the document]

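The question now is how to get at components of the ExpressionSet such as phenoData and experimentData.

So let's look at the ExpressionSet help page.

exprs()

But that does not explain how to extract the phenoData information, so keep looking.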

[Two screenshots]

OK, back to the code:

ex <- exprs(gset[[1]])  # expression matrix
pd <- pData(gset[[1]])  # clinical (phenotype) information
ex
pd
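To try these accessors without downloading anything, here is a minimal sketch that builds a toy ExpressionSet by hand with Biobase. The probe names, sample names, and values are made up for illustration and have nothing to do with GSE54839; the example assumes the Biobase package is installed.

```r
library(Biobase)

# Toy expression matrix: 3 probes (rows) x 2 samples (columns); values are made up
mat <- matrix(c(1.2, 3.4, 5.6, 2.1, 4.3, 6.5), nrow = 3,
              dimnames = list(c("probe1", "probe2", "probe3"), c("s1", "s2")))

# Toy phenotype table: row names must match the column names of the matrix
pheno <- data.frame(group = c("control", "case"),
                    row.names = c("s1", "s2"))

eset <- ExpressionSet(assayData = mat,
                      phenoData = AnnotatedDataFrame(pheno))

ex_toy <- exprs(eset)       # expression matrix
pd_toy <- pData(eset)       # phenotype (sample) data frame
dim(ex_toy)
pd_toy$group
experimentData(eset)        # experiment-level metadata (a MIAME object, empty here)
```

The same three accessors, exprs(), pData(), and experimentData(), apply to gset[[1]] from the download above.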

Reference: https://github.com/bioconductor-china/basic/blob/master/ExpressionSet.md
