Reposted from https://blog.csdn.net/hsdcc217/article/details/78510087
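In R, a factor represents a category code or a level, that is, a discrete point. For example, a head count can be 1, 2, 3, 4, ..., so the factor's values are 1, 2, 3, 4, .... Covariate levels such as high, medium, and low are also factors, because each of them is a discrete point. A numeric vector, by contrast, holds values on a continuous scale, for example 1, 1.1, 1.2, ..., and can be used in arithmetic, whereas a factor cannot. Put simply: a factor is a set of discrete points, while a numeric vector covers a continuous range. If numbers in your data should be treated as categories, convert the vector to a factor after importing the data; from then on the values are no longer treated as numbers in any computation, only as "symbols".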
The following examples explain and illustrate this.
> data <- c(1,2,2,3,1,2,3,3,1,2,3,3,1)
> data
[1] 1 2 2 3 1 2 3 3 1 2 3 3 1
> fdata <- factor(data)
> fdata
[1] 1 2 2 3 1 2 3 3 1 2 3 3 1
Levels: 1 2 3
> class(fdata)
[1] "factor"
> class(data)
[1] "numeric"
# factor() converts the original numeric vector into a factor. A factor has the concept of levels: the set of all distinct values that occur in it. The levels are the factor's elements with duplicates removed, sorted, and converted to character strings, so every level is a character value.
> levels(fdata)
[1] "1" "2" "3"
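Because the levels are stored as character strings and the factor itself only holds integer codes that point into them, converting a factor back to numbers takes some care. The sketch below uses a hypothetical vector `x` (not part of the original post) to show the difference; treat it as an illustration under that assumption rather than a definitive recipe.

```r
# Hypothetical example data (not from the original post)
x <- c(10, 20, 20, 30)
fx <- factor(x)

# as.integer() returns the internal level codes, not the original values
codes <- as.integer(fx)

# Going through character first recovers the original numbers
values <- as.numeric(as.character(fx))

# Arithmetic on a factor is not meaningful: R returns NA with a warning
total <- suppressWarnings(fx + 1)

print(codes)
print(values)
print(total)
```

Here `codes` holds 1 2 2 3 while `values` holds 10 20 20 30, which is why calling `as.integer()` or `as.numeric()` directly on a factor is a common source of silent mistakes.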
# When creating a factor, we can supply a labels vector to name the levels. Continuing the session above:
> rdata <- factor(data,labels=c("I","II","III"))
> rdata
[1] I II II III I II III III I II III III I
Levels: I II III
> rdata <- factor(data,labels=c("e","ee","eee"))
> rdata
[1] e ee ee eee e ee eee eee e ee eee eee e
Levels: e ee eee
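The `labels` vector is matched to the levels position by position, so it can be clearer to pass `levels` explicitly as well. A minimal sketch, reusing the high/medium/low idea from the introduction; the vector `score` and its coding (1 = low, 2 = medium, 3 = high) are assumptions made purely for illustration:

```r
# Hypothetical coded data: 1 = low, 2 = medium, 3 = high (assumed coding)
score <- c(3, 1, 2, 2, 3)

# levels states which values to expect and in what order;
# labels renames those levels position by position
fscore <- factor(score, levels = c(1, 2, 3),
                 labels = c("low", "medium", "high"))

print(fscore)
print(levels(fscore))
print(table(fscore))
```

Giving both arguments makes the mapping from code to name visible in the call itself, instead of relying on the default sorted order of the values.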
# A factor can also specify an order for the data
> mons <- c("March","April","January","November","January", "September","October","September","November","August", "January","November","November","February","May","August", "July","December","August","August","September","November", "February","April")
> mons <- factor(mons)
> mons
[1] March April January November January
[6] September October September November August
[11] January November November February May
[16] August July December August August
[21] September November February April
11 Levels: April August December February ... September
> table(mons)
mons
    April    August  December  February   January
        2         4         1         2         3
     July     March       May  November   October
        1         1         1         5         1
September
        3
# Months clearly have a natural order, so we can specify that order for the factor
> mons <- factor(mons,levels=c("January","February","March","April","May","June","July","August","September","October","November","December"),ordered=TRUE)
> table(mons)
mons
  January  February     March     April       May
        3         2         1         2         1
     June      July    August September   October
        0         1         4         3         1
 November  December
        5         1
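With `ordered=TRUE` the levels also support comparison and sorting. The sketch below rebuilds a shorter month vector (the name `m` and its contents are illustrative assumptions) and takes the levels from R's built-in constant `month.name`; `droplevels()` then removes levels that never occur, such as June in the table above.

```r
# Shortened illustrative data (an assumption, not the full vector above)
m <- c("March", "April", "January", "November", "January")

# month.name is a built-in character vector: "January" ... "December"
fm <- factor(m, levels = month.name, ordered = TRUE)

# Ordered factors can be compared and sorted by level order
is_before <- fm[1] < fm[2]   # is March before April?
sorted <- sort(fm)
latest <- max(fm)

# Levels with a zero count are kept until dropped explicitly
used <- droplevels(fm)

print(is_before)
print(as.character(sorted))
print(as.character(latest))
print(nlevels(fm))
print(nlevels(used))
```

Sorting follows calendar order rather than alphabetical order, and `nlevels()` shows that all twelve months remain as levels until `droplevels()` is applied.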