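1. R and RStudio
The first step in bioinformatics is to put on your leggings.
The brand of leggings can be 'R' or 'python'.
Most people wear the R brand, though, so I suggest you start with that one too. Once you have worn it out, you can switch to a new one.
With the leggings on, put on a pair of trousers. For trousers, I recommend the RStudio brand.
Most people do not go out on the street wearing only leggings.
1.1 Install R
1.2 Install RStudio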
http://www.rstudio.com/download
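It is best to install to the default location on the C drive; otherwise errors are likely. Even if nothing goes wrong today, something may go wrong tomorrow.
2. The first R data type: vectors
2.1 Creating vectors
(1) Combine values with c()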
> c(2,5,6,2,9)
[1] 2 5 6 2 9
> c("a","f","md","b")
[1] "a" "f" "md" "b"
(2) Use the colon ":" for consecutive integers
> 1:5
[1] 1 2 3 4 5
(3) Use rep() for repeated values, seq() for regular sequences, and rnorm() for random numbers
> rep("gene",times=3)
[1] "gene" "gene" "gene"
> seq(from=3,to=21,by=3)
[1] 3 6 9 12 15 18 21
> rnorm(n=3)
[1] 0.07456498 -1.98935170 0.61982575
> set.seed(1) #makes the results reproducible when someone re-runs the script; it does not make the two calls below return the same numbers
> rnorm(5)
[1] -0.6264538 0.1836433 -0.8356286 1.5952808 0.3295078
> rnorm(5)
[1] -0.8204684 0.4874291 0.7383247 0.5757814 -0.3053884
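To make the point about set.seed() concrete, here is a minimal sketch. The variable names a, b and d are made up for this illustration. Resetting the seed right before a draw reproduces it, while a draw without a reset simply continues the random stream.

```r
# Reset the seed before each draw: both draws are identical
set.seed(1)
a <- rnorm(5)
set.seed(1)
b <- rnorm(5)
identical(a, b)   # TRUE

# Without a reset, the next draw continues the random stream
d <- rnorm(5)
identical(a, d)   # FALSE
```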
(4) Combine these functions to build more complex vectors.
> paste0(rep("gene",times=3),1:3)
[1] "gene1" "gene2" "gene3"
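A few more combinations, as a rough sketch rather than a complete tour. The names r_times, r_each, s and ids, and the labels "ctrl" and "treat", are invented for the example. The each and length.out arguments are standard arguments of rep() and seq().

```r
# times repeats the whole vector; each repeats every element in turn
r_times <- rep(c("a", "b"), times = 2)   # "a" "b" "a" "b"
r_each  <- rep(c("a", "b"), each = 2)    # "a" "a" "b" "b"

# length.out fixes the number of points instead of the step size
s <- seq(from = 0, to = 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

# Combine: two numbered replicates for each of two groups
ids <- paste0(rep(c("ctrl", "treat"), each = 2), 1:2)
ids   # "ctrl1" "ctrl2" "treat1" "treat2"
```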
2.2 Operations on a single vector
> #(1) Assign to a variable name
> x = c(1,3,5,1) #casual style, using =
> x
[1] 1 3 5 1
> x <- c(1,3,5,1) #conventional assignment operator; Alt+minus in RStudio
> x
[1] 1 3 5 1
>
> #assign and print in one step
> x <- c(1,3,5,1);x
[1] 1 3 5 1
> (x <- c(1,3,5,1))
[1] 1 3 5 1
(2) Simple arithmetic: operations are vectorized, applied to every element much like a loop
> x+1
[1] 2 4 6 2
> log(x)
[1] 0.000000 1.098612 1.609438 0.000000
> sqrt(x)
[1] 1.000000 1.732051 2.236068 1.000000
(3) Test a condition to produce a logical vector
> x>3
[1] FALSE FALSE TRUE FALSE
> x==3
[1] FALSE TRUE FALSE FALSE
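Logical vectors are useful beyond printing TRUE and FALSE. The sketch below, with a made-up variable name big, shows a few common follow-ups: counting, locating, and summarizing the matches.

```r
x <- c(1, 3, 5, 1)
big <- x > 1   # FALSE TRUE TRUE FALSE

sum(big)       # 2, because TRUE counts as 1 and FALSE as 0
mean(big)      # 0.5, the fraction of elements that pass
which(big)     # 2 3, the positions that pass
any(x > 4)     # TRUE, at least one element is greater than 4
all(x > 4)     # FALSE, not every element is greater than 4
```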
> #(4) Basic statistics
> max(x) #maximum
[1] 5
> min(x) #minimum
[1] 1
> mean(x) #mean
[1] 2.5
> median(x) #median
[1] 2
> var(x) #variance
[1] 3.666667
> sd(x) #standard deviation
[1] 1.914854
> sum(x) #sum
[1] 10
>
> length(x) #length
[1] 4
> unique(x) #remove duplicates
[1] 1 3 5
> duplicated(x) #is each element a repeat of an earlier one?
[1] FALSE FALSE FALSE TRUE
> table(x) #count how often each value occurs
x
1 3 5
2 1 1
> sort(x)
[1] 1 1 3 5
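sort() returns the sorted values themselves. A related function, order(), returns the positions that would sort the vector, which becomes useful once a second vector has to be rearranged the same way. A small sketch, with ord as an invented name:

```r
x <- c(1, 3, 5, 1)

sort(x, decreasing = TRUE)   # 5 3 1 1
ord <- order(x)              # 1 4 2 3, positions of the sorted values
x[ord]                       # 1 1 3 5, same result as sort(x)
x[!duplicated(x)]            # 1 3 5, same result as unique(x)
```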
2.3 Operations on two vectors
> x = c(1,3,5,1)
> y = c(3,2,5,6)
> #(1) Element-wise comparison, producing a logical vector of the same length
> x == y
[1] FALSE FALSE TRUE FALSE
> x %in% y #is each element of x present in y?
[1] FALSE TRUE TRUE FALSE
> #(2) Arithmetic
> x + y
[1] 4 5 10 7
> #(3) "Concatenation"
> paste(x,y,sep=":")
[1] "1:3" "3:2" "5:5" "1:6"
> paste(x,y,sep='')
[1] "13" "32" "55" "16"
> paste(x,y) #the default separator is a space
[1] "1 3" "3 2" "5 5" "1 6"
> #(4) Intersection, union, and set difference
> intersect(x,y)
[1] 3 5
> union(x,y)
[1] 1 3 5 2 6
> setdiff(x,y) #elements in the first vector but not in the second
[1] 1
> setdiff(y,x) #elements in the first vector but not in the second
[1] 2 6
> #when the two vectors differ in length
> x = c(1,3,5,6,2)
> y = c(3,2,5)
> x == y #uh-oh, a warning!
[1] FALSE FALSE TRUE FALSE TRUE
Warning message:
In x == y : longer object length is not a multiple of shorter object length
> #recycling (see the slides): the shorter vector is repeated to match the longer one, so the result has the length of the longer vector
>
> #use recycling to simplify code
> paste0(rep("gene",3),1:3) #paste0(x,y) is the same as paste(x,y,sep='')
[1] "gene1" "gene2" "gene3"
> paste0("gene",1:3)
[1] "gene1" "gene2" "gene3"
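Recycling can be spelled out by hand to see what R does behind the scenes. In this sketch the vectors and the name y_full are made up. Because the longer length (4) is a multiple of the shorter one (2), no warning is raised.

```r
x <- c(1, 3, 5, 6)
y <- c(10, 20)

# R repeats y until it is as long as x, then adds element by element
x + y                                      # 11 23 15 26

# The same thing written out explicitly
y_full <- rep(y, length.out = length(x))   # 10 20 10 20
x + y_full                                 # 11 23 15 26
```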
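2.4 Filtering vectors (subsetting); see the slides
> x <- 8:12
> #subset by logical values
> x[x==10] #== tests equality
[1] 10
> x[x<12]
[1] 8 9 10 11
> x[x %in% c(9,13)]
[1] 9
> #subset by position
> x[4]
[1] 11
> x[2:4]
[1] 9 10 11
> x[c(1,5)]
[1] 8 12
> x[-4] #negative index: drop that position
[1] 8 9 10 12
> x[-(2:4)] #negative index: drop those positions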
[1] 8 12
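Conditions can also be combined before subsetting. The following sketch uses the element-wise operators & (and), | (or) and ! (not). The particular conditions are only examples.

```r
x <- 8:12

x[x > 8 & x < 12]        # 9 10 11, both conditions must hold
x[x == 8 | x == 12]      # 8 12, either condition may hold
x[!(x %in% c(9, 13))]    # 8 10 11 12, everything not in the set
which(x > 10)            # 4 5, positions rather than values
```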
2.5 Modifying one or more elements of a vector: subset, then reassign
> x[4] <- 40
> x
[1] 8 9 10 40 12
> #this only takes effect if you type it yourself: x[x>10] <- 10
> x
[1] 8 9 10 40 12
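The same subset-then-reassign pattern works with a logical condition or with several positions at once. The sketch below uses copies named x2 and x3 (names invented here) so that x itself keeps the value printed above.

```r
# Replace every element greater than 10 with 10
x2 <- c(8, 9, 10, 40, 12)
x2[x2 > 10] <- 10
x2    # 8 9 10 10 10

# Replace several positions at once
x3 <- c(8, 9, 10, 40, 12)
x3[c(1, 5)] <- c(0, 100)
x3    # 0 9 10 40 100
```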
2.6 Simple plots from vectors
> k1 = rnorm(12);k1
[1] 1.51178117 0.38984324 -0.62124058 -2.21469989 1.12493092 -0.04493361
[7] -0.01619026 0.94383621 0.82122120 0.59390132 0.91897737 0.78213630
> k2 = rep(c("a","b","c","d"),each = 3);k2
[1] "a" "a" "a" "b" "b" "b" "c" "c" "c" "d" "d" "d"
> plot(k1)
> boxplot(k1~k2) #try looking up what a boxplot shows
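boxplot(k1~k2) draws one box per group in k2, and the thick line inside each box is that group's median. As a rough numeric counterpart, the sketch below computes the per-group medians with tapply(). The values in vals are fixed, made-up numbers (not the random k1 above) so the result is predictable.

```r
# Hypothetical data: 12 values in 4 groups of 3
vals <- c(2, 4, 6,  1, 3, 5,  10, 20, 30,  7, 8, 9)
grp  <- rep(c("a", "b", "c", "d"), each = 3)

# Median of vals within each group, i.e. the middle line of each box
med <- tapply(vals, grp, median)
med   # a = 4, b = 3, c = 20, d = 8
```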
Tricky point: matching and reordering vectors with match
> x <- c("A","B","C","D","E")
> y <- c("B","D","E","A","C")
> match(x,y)
[1] 4 1 5 2 3
> #this returns a vector: [1] 4 1 5 2 3
>
> y[match(x,y)] #reorder y to follow x
[1] "A" "B" "C" "D" "E"
> #the first argument is the template (the ruler): [1] "A" "B" "C" "D" "E"
> x[match(y,x)] #reorder x to follow y
[1] "B" "D" "E" "A" "C"
> #[1] "B" "D" "E" "A" "C"
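One typical use of match is to bring annotation into the same order as the data. The sketch below is only an illustration: the sample names and the vectors samples, meta_id and meta_group are hypothetical and do not come from a real dataset.

```r
# Order of samples in the data (hypothetical)
samples <- c("s3", "s1", "s2")

# Annotation stored in a different order (hypothetical)
meta_id    <- c("s1", "s2", "s3")
meta_group <- c("ctrl", "ctrl", "treat")

# For each sample, find its position in the annotation
idx <- match(samples, meta_id)   # 3 1 2

meta_id[idx]      # "s3" "s1" "s2", now in the same order as samples
meta_group[idx]   # "treat" "ctrl" "ctrl"
```

The first argument acts as the template, so the index is applied to the vectors that need reordering (here the annotation).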