Notes on practical R operations from "Bioinformatics Data Skills"

##################2019-01-18 14:34:07##########################

example("pheatmap") # run the examples from a function's help page
help.search("heatmap") # search the help system for functions matching a keyword
library(help="pheatmap") # show detailed information about a package
ls() # list the objects we've created in the global environment
length(x) # return the length of the vector x

Alt + - (in RStudio on Windows) is a keyboard shortcut that inserts “<-”.

Features

R does not have a type for a single value (known as a scalar) such as 3.1 or “AGCTACGACT”. Rather, these values are stored in a vector of length 1.
R’s vectors are the basis of one of R’s most important features: vectorization. Vectorization allows us to loop over vectors elementwise, without the need to write an explicit loop.
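
As a minimal sketch of the two points above (the variable names are made up for illustration), a single value really is a length-1 vector, and a vectorized operation gives the same result as an explicit loop:

```r
# A "scalar" is just a vector of length 1.
s <- 3.1
s_is_vector <- is.vector(s)   # TRUE
s_length <- length(s)         # 1

# Vectorized arithmetic: the multiplication is applied to every element.
v <- c(1, 2, 3, 4)
doubled <- v * 2

# The same computation written as an explicit loop, for comparison.
doubled_loop <- numeric(length(v))
for (i in seq_along(v)) {
  doubled_loop[i] <- v[i] * 2
}

same_result <- identical(doubled, doubled_loop)
```

The vectorized form is shorter and is usually the more idiomatic choice in R.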

################2019-01-22 09:48:01#######################

When we assign a value in our R session, we’re assigning it to an environment known as the global environment.
Calling the function search() returns where R looks when searching for the value of a variable, which includes the global environment (.GlobalEnv) and attached packages.
If one vector is longer than the other, R will recycle the values in the shorter vector. This is intentional behavior, so R won’t warn you when this happens, unless the longer vector’s length is not a multiple of the shorter vector’s length, as in the last command below:

> x <- c(1,2,3)
> x + 1
[1] 2 3 4
> y <- c(1,2)
> x + y # the length of x is not a multiple of the length of y
[1] 2 4 4
Warning message:
In x + y : longer object length is not a multiple of shorter object length
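
For contrast, here is a small sketch (with made-up vectors) of the silent case: when the longer length is an exact multiple of the shorter one, recycling produces no warning. The second part captures the warning from the non-multiple case with tryCatch() so it can be inspected as a string:

```r
# Length 4 is a multiple of length 2, so the short vector is recycled silently
# as 10 20 10 20.
long_vec <- c(1, 2, 3, 4)
short_vec <- c(10, 20)
silent_sum <- long_vec + short_vec

# Length 3 is not a multiple of length 2, so R signals a warning.
# tryCatch() returns the warning text instead of the sum.
warning_text <- tryCatch(
  c(1, 2, 3) + c(1, 2),
  warning = function(cond) conditionMessage(cond)
)
```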

R will return a missing value (NA; more on this later) if you try to access an element in a position that’s greater than the number of elements.

> z <- c(3.4, 2.2, 0.4, -0.4, 1.2)
> z[c(2, 1, 10)]
[1] 2.2 3.4 NA

It’s also possible to exclude certain elements from vectors using negative indexes.
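
A short sketch of negative indexing; the values of z here are an assumption, chosen to match the outputs printed elsewhere in these notes:

```r
# Assumed example vector (consistent with the outputs shown in these notes).
z <- c(3.4, 2.2, 0.4, -0.4, 1.2)

# A negative index drops that position and keeps everything else.
without_first <- z[-1]

# A vector of negative indexes drops several positions at once.
without_first_two <- z[-c(1, 2)]

# Note: positive and negative indexes cannot be mixed in one call;
# z[c(-1, 2)] is an error.
```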

> order(z)
[1] 4 3 5 2 1
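> z[order(z)]
[1] -0.4 0.4 1.2 2.2 3.4
> order(z, decreasing=TRUE)
[1] 1 2 5 3 4
> z[order(z, decreasing=TRUE)] # order() returns the indexes that put the vector in sorted order
[1] 3.4 2.2 1.2 0.4 -0.4
> sort(b, decreasing=TRUE) # sort() returns the sorted values themselves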
  b  a1  a3  a2   c 
5.4 3.4 2.0 1.0 0.4

Again, often we use functions to generate indexing vectors for us. For example, one
way to resample a vector (with replacement) is to randomly sample its indexes using
the sample() function:
[1] http://www.reibang.com/p/38d0a44630f8
[2] https://bbs.pinggu.org/thread-3068145-1-1.html

> set.seed(0) # we set the random number seed so this example is reproducible
> i <- sample(length(z), replace=TRUE) # replace: whether to sample with replacement
> i
[1] 5 2 2 3 5
> z[i]
[1] 1.2 2.2 2.2 0.4 1.2

NA is R’s built-in value to represent missing data.
NULL represents not having a value.
-Inf, Inf: these are just as they sound, negative infinite and positive infinite values.
NaN stands for “not a number,” which can occur in some computations that don’t return numbers, e.g., 0/0 or Inf + -Inf.

> is.nan(0/0)
[1] TRUE
> x <- c()
> is.null(x)
[1] TRUE
> y <- c(1,2,3)
> is.na(y[4])
[1] TRUE

Because all elements in a vector must have the same data type, R will silently coerce elements to a common type when a vector is constructed.
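
A small illustration of this coercion (the vectors are made up): mixing types in c() produces the most general type among the elements, with no warning.

```r
# integer + double -> double
mixed_num <- c(1L, 2.5)

# anything + character -> character
mixed_chr <- c(1, "two", TRUE)

# logical + integer -> integer (TRUE becomes 1L)
mixed_int <- c(TRUE, 0L, 3L)

types <- c(class(mixed_num), class(mixed_chr), class(mixed_int))
```

This is worth keeping in mind when a numeric column unexpectedly turns into character: a single non-numeric entry is enough to coerce the whole vector.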

When called on numeric values, summary() returns a numeric summary with the quartiles and the mean.
R’s data-reading functions (such as read.csv()) can also read gzipped files directly, so there’s no need to uncompress gzipped files first.
The reshape2 package provides functions to reshape data: the function melt() turns wide data into long data, and dcast() turns long data into wide data.
One nice feature of data.frame() is that if you provide vectors as named arguments, data.frame() will use these names as column names.
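
The following sketch ties three of these notes together using only base R. The dataframe, its column names, and the temporary file are invented for illustration and are not the book's dataset:

```r
# Named arguments to data.frame() become the column names.
df <- data.frame(pos = c(100L, 200L, 300L), depth = c(2.5, 3.0, 4.5))
col_names <- colnames(df)

# summary() on a numeric column returns the quartiles and the mean.
depth_summary <- summary(df$depth)

# Write a gzipped CSV to a temporary file, then read it back directly.
gz_path <- tempfile(fileext = ".csv.gz")
con <- gzfile(gz_path, "w")
write.csv(df, con, row.names = FALSE)
close(con)

# read.csv() is given the .gz path as-is; no manual decompression step.
df_back <- read.csv(gz_path)
```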
################2019-01-23 09:29:13#######################
When indexing with [row, column], omitting the row index retrieves all rows, and omitting the column index retrieves all columns.

> y <- cbind(x1 = 3, x2 = c(4:1))
> y
     x1 x2
[1,]  3  4
[2,]  3  3
[3,]  3  2
[4,]  3  1
> y['x1'] # a single index treats the matrix as a plain vector, which has no element named 'x1'
[1] NA
> y[1,'x1']
x1 
 3 
> y[,'x1'] 
[1] 3 3 3 3

It’s a good idea to avoid referring to specific dataframe rows in your analysis code.
From summary(), we see that total.SNPs varies quite considerably across all windows on chromosome 20:

> summary(d$total.SNPs)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 3.000 7.000 8.906 12.000 93.000

Remember, columns of a dataframe are just vectors. If you only need the data from one column, just subset it as you would a vector.
Note that there’s no need to use a comma in the brackets because d$percent.GC is a vector, not a two-dimensional dataframe:

> d$percent.GC[d$Pi > 16]
[1] 39.1391 38.0380 36.8368 36.7367 43.0430 41.1411 [...]

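d[d$Pi > 3, ] is identical to d[which(d$Pi > 3), ] as long as d$Pi contains no NA values: the first indexes rows with a logical vector, the second with the integer positions of its TRUE elements.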

> d$Pi > 3
[1] FALSE TRUE FALSE TRUE TRUE TRUE [...]
> which(d$Pi > 3)
[1] 2 4 5 6 7 10 [...]
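
The two indexing styles agree when the column has no missing values, but they differ once an NA is present. A minimal sketch on a made-up dataframe (toy is hypothetical, not the book's d):

```r
# Hypothetical dataframe with one missing Pi value.
toy <- data.frame(id = 1:5, Pi = c(1.2, 4.5, NA, 3.8, 0.7))

keep <- toy$Pi > 3   # FALSE TRUE NA TRUE FALSE

# Logical indexing keeps a row of NAs for the NA comparison.
by_logical <- toy[keep, ]

# which() returns only the positions that are TRUE, so the NA is dropped.
by_which <- toy[which(keep), ]

n_logical <- nrow(by_logical)
n_which <- nrow(by_which)
```

Here by_logical has three rows (one of them all NA) while by_which has two, which is one reason which() is often the safer choice on real data.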

subset() takes two arguments: the dataframe to operate on, and then the conditions a row must meet to be included. With subset(), d[d$Pi > 16 & d$percent.GC > 80, ] can be expressed as:

> subset(d, Pi > 16 & percent.GC > 80)
start end total.SNPs total.Bases depth [...]
58550 63097001 63098000 5 947 2.39 [...]

Note that we (somewhat magically) don’t need to quote column names. This is
because subset() follows special evaluation rules, and for this reason, subset() is
best used only for interactive work.

subset() also accepts an optional third argument that selects which columns to return:

> subset(d, Pi > 16 & percent.GC > 80,
+ c(start, end, Pi, percent.GC, depth))
start end Pi percent.GC depth
58550 63097001 63098000 41.172 82.0821 2.39
58641 63188001 63189000 16.436 82.3824 3.21
58642 63189001 63190000 41.099 80.5806 1.89
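
Since subset() is best kept to interactive work, here is a sketch of the equivalent explicit bracket form that is usually preferred inside scripts and functions. The dataframe toy and its values are invented for illustration:

```r
# Hypothetical stand-in for the book's dataframe d.
toy <- data.frame(
  start = c(1, 1001, 2001),
  Pi = c(20.5, 3.1, 17.2),
  percent.GC = c(85.0, 40.2, 60.3)
)

# Interactive style: unquoted column names, evaluated inside the dataframe.
via_subset <- subset(toy, Pi > 16 & percent.GC > 80, c(start, Pi))

# Script style: build the row filter explicitly and name columns as strings.
rows <- toy$Pi > 16 & toy$percent.GC > 80
via_bracket <- toy[rows, c("start", "Pi")]

same_rows <- isTRUE(all.equal(via_subset, via_bracket))
```

The bracket form is more verbose, but every name in it is resolved by ordinary R rules, so it behaves predictably when wrapped in a function.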

#####################ggplot2##################

ggplot2 works exclusively with dataframes, so you’ll need to get your data tidy and into a dataframe before visualizing it with ggplot2.
Each layer updates our plot by adding geometric objects such as the points in a scatterplot, or the lines in a line plot.
geom is short for geometric (object)
aes is short for aesthetic
We specify the mapping of aesthetic attributes to columns in our dataframe using the function aes().
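
A minimal sketch of these pieces, assuming the ggplot2 package is installed. The dataframe and its values are made up; only ggplot(), aes(), and geom_point() are ggplot2 functions:

```r
library(ggplot2)

# Hypothetical tidy dataframe: one row per observation.
toy <- data.frame(
  depth = c(2.4, 3.2, 1.9, 5.1),
  Pi = c(41.2, 16.4, 41.1, 8.3)
)

# aes() maps dataframe columns to aesthetics (x and y position here);
# geom_point() adds a layer of points, giving a scatterplot.
p <- ggplot(toy, aes(x = depth, y = Pi)) + geom_point()

# In an interactive session, print(p) (or just typing p) draws the plot.
```

Further layers, such as a smoothing line, would be added to p with + in the same way.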

Last edited: 2019.01.23 17:40:20
