Points to note when using scale():
- Passing a vector to scale(): the result is an array (a one-column matrix), not a vector.
    > test <- scale(c(1,2,3))
    > test
         [,1]
    [1,]   -1
    [2,]    0
    [3,]    1
    attr(,"scaled:center")
    [1] 2
    attr(,"scaled:scale")
    [1] 1
    > is.array(test)
    [1] TRUE
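If later code expects a plain numeric vector, the matrix shape and the extra attributes can be dropped. The following is a minimal sketch, assuming a numeric input with no missing values; the names `x`, `z_mat`, `z_vec`, and `z_manual` are illustrative only.

```r
x <- c(1, 2, 3)

# scale() returns a 3 x 1 matrix carrying "scaled:center" and "scaled:scale"
z_mat <- scale(x)

# as.vector() strips dim and the other attributes, leaving a plain vector
z_vec <- as.vector(z_mat)

# Same numbers computed by hand: subtract the mean, divide by the sample sd
z_manual <- (x - mean(x)) / sd(x)

is.matrix(z_mat)            # TRUE
is.vector(z_vec)            # TRUE
all.equal(z_vec, z_manual)  # TRUE
```

Keeping the matrix form is still useful when the centering and scaling values stored in the attributes are needed afterwards.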
- Passing a matrix to scale(): each column is standardized separately, and the result is an array with the same number of rows and columns as the original matrix.
    > test
         a  b  c
    [1,] 1  1 12
    [2,] 3  6 11
    [3,] 9 10  9
    > test2 <- scale(test)
    > test2
                  a           b          c
    [1,] -0.8006408 -1.03490978  0.8728716
    [2,] -0.3202563  0.07392213  0.2182179
    [3,]  1.1208971  0.96098765 -1.0910895
    attr(,"scaled:center")
            a         b         c
     4.333333  5.666667 10.666667
    attr(,"scaled:scale")
           a        b        c
    4.163332 4.509250 1.527525
    > is.array(test2)
    [1] TRUE
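To make the column-wise behaviour concrete, the sketch below reproduces the same result by hand and then uses the stored attributes to recover the original values. It assumes the default `center = TRUE, scale = TRUE`; the names `m`, `s`, `manual`, and `restored` are illustrative.

```r
# Same matrix as in the example above (filled column by column)
m <- matrix(c(1, 3, 9, 1, 6, 10, 12, 11, 9), nrow = 3,
            dimnames = list(NULL, c("a", "b", "c")))

s <- scale(m)

# Manual equivalent: centre each column on its mean, then divide by its sd
centers <- colMeans(m)
sds     <- apply(m, 2, sd)
manual  <- sweep(sweep(m, 2, centers, "-"), 2, sds, "/")

# check.attributes = FALSE ignores the "scaled:*" attributes only scale() adds
all.equal(s, manual, check.attributes = FALSE)    # TRUE

# The stored attributes are enough to undo the transformation
restored <- sweep(sweep(s, 2, attr(s, "scaled:scale"), "*"),
                  2, attr(s, "scaled:center"), "+")
all.equal(restored, m, check.attributes = FALSE)  # TRUE
```

The `scaled:center` and `scaled:scale` attributes are the column means and sample standard deviations, which is why they can be reused to back-transform the data.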
- Passing a data frame to scale():
    > test <- data.frame(a=c(1,3,9),b=c(1,6,10),c=c(12,11,9),d=c(100,2,3))
    > test
      a  b  c   d
    1 1  1 12 100
    2 3  6 11   2
    3 9 10  9   3
    > scale(test)
    # Standardizes by column by default.
    # Equivalent to scale(as.matrix(test)); apply(test,2,scale) gives the same values but drops the attributes.
                  a           b          c          d
    [1,] -0.8006408 -1.03490978  0.8728716  1.1546550
    [2,] -0.3202563  0.07392213  0.2182179 -0.5862095
    [3,]  1.1208971  0.96098765 -1.0910895 -0.5684455
    attr(,"scaled:center")
            a         b         c         d
     4.333333  5.666667 10.666667 35.000000
    attr(,"scaled:scale")
            a         b         c         d
     4.163332  4.509250  1.527525 56.293872
    > scale(as.matrix(test))
                  a           b          c          d
    [1,] -0.8006408 -1.03490978  0.8728716  1.1546550
    [2,] -0.3202563  0.07392213  0.2182179 -0.5862095
    [3,]  1.1208971  0.96098765 -1.0910895 -0.5684455
    attr(,"scaled:center")
            a         b         c         d
     4.333333  5.666667 10.666667 35.000000
    attr(,"scaled:scale")
            a         b         c         d
     4.163332  4.509250  1.527525 56.293872
    > apply(test,2,scale)  # 2 means apply over columns
                  a           b          c          d
    [1,] -0.8006408 -1.03490978  0.8728716  1.1546550
    [2,] -0.3202563  0.07392213  0.2182179 -0.5862095
    [3,]  1.1208971  0.96098765 -1.0910895 -0.5684455
    > apply(test,1,scale)  # 1 means apply over rows
    # The result looks transposed because apply() stores the scaled first row
    # as the first column (much like calling scale() on a vector), the scaled
    # second row as the second column, and so on.
    # Same values as scale(t(as.matrix(test))).
               [,1]       [,2]       [,3]
    [1,] -0.5735393 -0.6185896  0.3904344
    [2,] -0.5735393  0.1237179  0.7027819
    [3,] -0.3441236  1.3608971  0.3904344
    [4,]  1.4912023 -0.8660254 -1.4836507
    > scale(t(as.matrix(test)))
            [,1]       [,2]       [,3]
    a -0.5735393 -0.6185896  0.3904344
    b -0.5735393  0.1237179  0.7027819
    c -0.3441236  1.3608971  0.3904344
    d  1.4912023 -0.8660254 -1.4836507
    attr(,"scaled:center")
    [1] 28.50  5.50  7.75
    attr(,"scaled:scale")
    [1] 47.947888  4.041452  3.201562
    # So, to scale by row and keep the original orientation, transpose the result:
    > t(apply(test,1,scale))
               [,1]       [,2]       [,3]       [,4]
    [1,] -0.5735393 -0.5735393 -0.3441236  1.4912023
    [2,] -0.6185896  0.1237179  1.3608971 -0.8660254
    [3,]  0.3904344  0.7027819  0.3904344 -1.4836507
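Another way to scale by row is to transpose, scale, and transpose back. The sketch below assumes every column of the data frame is numeric; the names `m` and `by_row` are illustrative. Unlike the apply() version, this keeps the column names.

```r
test <- data.frame(a = c(1, 3, 9), b = c(1, 6, 10),
                   c = c(12, 11, 9), d = c(100, 2, 3))

m <- as.matrix(test)

# Transpose so rows become columns, scale, then transpose back
by_row <- t(scale(t(m)))

# Each row should now have mean ~0 and sample standard deviation 1
rowMeans(by_row)
apply(by_row, 1, sd)

# Same values as the apply() approach; only dimnames and attributes differ
all.equal(by_row, t(apply(test, 1, scale)), check.attributes = FALSE)  # TRUE
```

Either approach gives the same numbers, so the choice mostly comes down to whether the column names and the `scaled:*` attributes are wanted in the result.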