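If the two groups are independent, the Wilcoxon rank sum test (also known as the Mann-Whitney U test) can be used to assess whether the observations were drawn from the same probability distribution (that is, whether the probability of obtaining a higher score is greater in one population than in the other).
The call takes the form wilcox.test(y ~ x, data), where y is a numeric variable and x is a dichotomous variable.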
> library(MASS)  # the UScrime data frame ships with the MASS package
> with(UScrime, by(Prob, So, median))
So: 0
[1] 0.038201
----------------------------------------------------------------------
So: 1
[1] 0.055552
> wilcox.test(Prob ~ So, data=UScrime)
        Wilcoxon rank sum test
data:  Prob by So
W = 81, p-value = 8.488e-05
alternative hypothesis: true location shift is not equal to 0
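To make the W statistic less of a black box, here is a minimal sketch that recomputes it by hand from the pooled ranks. It assumes the MASS package is installed; the variable names (r, in_first, W_manual) are illustrative only and not part of the original listing.

```r
library(MASS)  # provides the UScrime data frame

# Rank all observations together; tied values receive average ranks
r <- rank(UScrime$Prob)

# So == 0 is the first level of the grouping variable
in_first <- UScrime$So == 0
n1 <- sum(in_first)

# W = rank sum of the first group minus its smallest possible value
W_manual <- sum(r[in_first]) - n1 * (n1 + 1) / 2

# Compare with the statistic reported by wilcox.test()
W_test <- unname(wilcox.test(Prob ~ So, data = UScrime)$statistic)
all.equal(W_manual, W_test)
```

A small W means the first group (So = 0) mostly holds the lower ranks, which agrees with its lower median Prob.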
> sapply(UScrime[c("U1","U2")],median)
U1 U2
92 34
> with(UScrime, wilcox.test(U1, U2, paired=TRUE))
        Wilcoxon signed rank test with continuity correction
data:  U1 and U2
V = 1128, p-value = 2.464e-09
alternative hypothesis: true location shift is not equal to 0
Warning message:
In wilcox.test.default(U1, U2, paired = TRUE) :
  cannot compute exact p-value with ties
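For paired data, wilcox.test() must be called with the argument spelled paired = TRUE. A misspelled argument name is silently ignored, and the call then falls back to the unpaired rank sum test. The sketch below, which assumes the MASS package is installed and uses illustrative variable names, recomputes the signed rank statistic V by hand and checks it against the paired call.

```r
library(MASS)  # provides the UScrime data frame

# Paired differences between the two unemployment rates
d <- UScrime$U1 - UScrime$U2
d <- d[d != 0]            # zero differences are dropped

# Rank the absolute differences, then sum the ranks of the positive ones
r <- rank(abs(d))
V_manual <- sum(r[d > 0])

# Compare with the paired test (the ties warning is suppressed here)
V_test <- unname(suppressWarnings(
  wilcox.test(UScrime$U1, UScrime$U2, paired = TRUE)$statistic
))
all.equal(V_manual, V_test)
```

If every difference is positive, V reaches its maximum of n(n + 1)/2, where n is the number of non-zero differences.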
> states <- data.frame(state.region, state.x77)
> kruskal.test(Illiteracy~state.region, data=states)
        Kruskal-Wallis rank sum test
data:  Illiteracy by state.region
Kruskal-Wallis chi-squared = 22.672, df = 3, p-value = 4.726e-05
> # The result indicates that the illiteracy rate is not the same in every US region (p < 0.001)
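As a rough illustration of what kruskal.test() computes, the following sketch rebuilds the tie-corrected Kruskal-Wallis statistic from the pooled ranks and compares it with the built-in result. It uses only base R datasets; the names H, rank_sums, and n_i are illustrative.

```r
states <- data.frame(state.region, state.x77)
x <- states$Illiteracy
g <- states$state.region

N <- length(x)
r <- rank(x)                         # pooled ranks, ties averaged
rank_sums <- tapply(r, g, sum)       # rank sum per region
n_i <- tapply(r, g, length)          # group sizes

# Uncorrected H, then divide by the tie correction factor
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)
ties <- table(x)
H <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Chi-squared approximation with (number of groups - 1) degrees of freedom
p <- pchisq(H, df = nlevels(g) - 1, lower.tail = FALSE)

kt <- kruskal.test(Illiteracy ~ state.region, data = states)
all.equal(H, unname(kt$statistic))
all.equal(p, kt$p.value)
```

The test only says that at least one region differs; it does not say which ones, which is why pairwise comparisons follow.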
> source("http://www.statmethods.net/RiA/wmc.txt")
> states<- data.frame(state.region, state.x77)
> wmc(Illiteracy ~ state.region, data=states, method="holm")
Descriptive Statistics
           West North Central Northeast    South
n      13.00000      12.00000   9.00000 16.00000
median  0.60000       0.70000   1.10000  1.75000
mad     0.14826       0.14826   0.29652  0.59304
Multiple Comparisons (Wilcoxon Rank Sum Tests)
Probability Adjustment = holm
        Group.1       Group.2    W            p
1          West North Central 88.0 8.665618e-01
2          West     Northeast 46.5 8.665618e-01
3          West         South 39.0 1.788186e-02   *
4 North Central     Northeast 20.5 5.359707e-02   .
5 North Central         South  2.0 8.051509e-05 ***
6     Northeast         South 18.0 1.187644e-02   *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
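The source() function downloads and executes the R script that defines wmc(). The function has the form wmc(y ~ A, data, method), where y is a numeric outcome variable, A is a grouping variable, data is the data frame containing these variables, and method specifies the approach used to limit Type I errors. Listing 7.17 uses the adjustment method developed by Holm, which provides strong control of the family-wise Type I error rate.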
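To show what a Holm adjustment does to a set of p-values, here is a minimal sketch that applies the step-down rule by hand and compares it with base R's p.adjust(). The raw p-values are hypothetical numbers chosen for illustration; they are not taken from the wmc() output.

```r
# Hypothetical raw p-values from six pairwise tests (illustration only)
p_raw <- c(0.030, 0.001, 0.600, 0.012, 0.200, 0.040)
m <- length(p_raw)

# Holm: sort ascending, multiply the i-th smallest by (m - i + 1),
# enforce monotonicity with a running maximum, and cap at 1
o <- order(p_raw)
adj_sorted <- pmin(1, cummax((m - seq_len(m) + 1) * p_raw[o]))

# Put the adjusted values back in the original order
p_holm <- numeric(m)
p_holm[o] <- adj_sorted

all.equal(p_holm, p.adjust(p_raw, method = "holm"))
```

The smallest p-value is multiplied by m, as in a Bonferroni correction, but later ones get smaller multipliers, so Holm is never more conservative than Bonferroni.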
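The wmc() function first reports the sample size, the median, and the median absolute deviation for each group. The West has the lowest illiteracy rate and the South the highest. The function then produces six pairwise comparisons. The two-sided p-values show that the South differs significantly from each of the other three regions, whereas the other three regions do not differ significantly from one another at the p < 0.05 level.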
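If the wmc.txt script cannot be downloaded, base R offers a comparable route through pairwise.wilcox.test() in the stats package. The sketch below assumes this function is an acceptable stand-in. It returns Holm-adjusted p-values only, without the descriptive statistics or W values that wmc() prints, and it is not claimed to reproduce wmc() exactly.

```r
states <- data.frame(state.region, state.x77)

# Pairwise Wilcoxon rank sum tests with Holm adjustment;
# warnings about ties are suppressed for readability
res <- suppressWarnings(
  pairwise.wilcox.test(states$Illiteracy, states$state.region,
                       p.adjust.method = "holm")
)

# Lower-triangular matrix of adjusted p-values, one cell per pair of regions
res$p.value
```

With four regions there are six pairs, so the matrix has three rows and three columns, and the cells above the diagonal are NA.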
That wraps up the basics of nonparametric tests of group differences. See you next time! O(∩_∩)O haha~