screening the dataset
Two purposes: 1. check for missing data
2. check for odd or erroneous data
What counts as odd data?
consistency check: answers that contradict each other
filler questions (what are these?)
extreme data (what counts as extreme?)
How to do it?
1. analyze frequencies: check for missing data and extreme data
2. scatter plot: check consistency
*Still to learn in SPSS: scatter plot, select cases
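A minimal sketch of the frequency check outside SPSS, in plain Python. The answers, the 1-5 coding scheme and the variable names are all made up for illustration; None stands for a missing answer.

```python
from collections import Counter

# Hypothetical answers to a 1-5 rating question; None marks a missing answer.
answers = [3, 4, None, 5, 2, 3, 99, 4, None, 1]
valid_codes = {1, 2, 3, 4, 5}  # assumed coding scheme for this question

# Frequency table: how often each value occurs.
frequencies = Counter(answers)

# Missing data shows up as the count for None.
n_missing = frequencies[None]

# Odd or erroneous data: values that are not part of the coding scheme.
out_of_range = sorted(
    v for v in frequencies if v is not None and v not in valid_codes
)

# Print the table with None listed last.
for value, count in sorted(
    frequencies.items(), key=lambda kv: (kv[0] is None, kv[0] or 0)
):
    print(value, count)
print("missing:", n_missing, "out of range:", out_of_range)
```

Whether a value such as 99 is a typo or a deliberate missing-value code cannot be decided from the table alone; it has to be checked against the questionnaire.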
What to do with "bad data"?
do nothing
collect more data
assign a missing value:
for non-key variables, substitute a neutral value, usually the mean
impute values (fill in based on nearby values)
delete
The decision mainly depends on how many good respondents there are
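Mean substitution for a non-key variable can be sketched as follows. The function name fill_with_mean and the data are hypothetical; this is one simple way to do it, not a recommendation over the other options listed.

```python
from statistics import mean

def fill_with_mean(values):
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    m = mean(observed)
    return [m if v is None else v for v in values]

# Hypothetical ages with two missing answers.
ages = [20, None, 30, 40, None]
filled = fill_with_mean(ages)
print(filled)
```

Note that filling in the mean leaves the mean unchanged but shrinks the variance, which is one reason to restrict it to non-key variables.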
analyzing dataset
levels of measurement
assigning numbers (SPSS: Values)
In SPSS, "scale" means metric data, which covers both interval and ratio.
nominal: categories
ordinal: rank order
interval: ratings and the like, e.g. 1-10?
ratio: data with a meaningful zero point
The choice of statistical test depends on the level of measurement of a variable
types of statistical analyses
1. descriptive analysis: summarize the sample, frequency analysis
2. inferential analysis: generalize from the sample to the population; hypothesis testing and confidence intervals (there may be an underlying model or similar); one-sample tests
3. differences analysis: compare the means of two or more groups (differences among means)
4. associative analysis: examine the strength and direction of a relationship; cross-tabulations and correlations
5. predictive analysis: regressions
descriptive analysis
summarize data (summarize the sample)
HOW to summarize, and what? (in general: are these figures meaningful?)
-the usual descriptive statistics:
1. location: mode, median, mean
2. variability: (interquartile) range, variance, standard deviation (why both variance and standard deviation? the standard deviation is in the same units as the data), coefficient of variation = standard deviation / mean
3. shape: skewness, kurtosis
*Note: whether a descriptive statistic is meaningful depends on the level of measurement
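The measures of location, variability and shape listed above can be computed in a few lines. This is a sketch with made-up data; the function name describe is hypothetical. It assumes the sample variance (dividing by n - 1) for variability and the simple moment-based formulas for skewness and excess kurtosis, which may differ slightly from the bias-corrected versions a statistics package reports.

```python
from statistics import mean, median, mode, quantiles, stdev, variance

def describe(data):
    """Location, variability and shape of a metric variable."""
    m = mean(data)
    n = len(data)
    q1, _, q3 = quantiles(data, n=4)  # quartile cut points
    # Central moments (dividing by n) for the shape measures.
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    return {
        "mode": mode(data),
        "median": median(data),
        "mean": m,
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": variance(data),      # sample variance, n - 1
        "std_dev": stdev(data),          # square root of the variance
        "cv": stdev(data) / m,           # coefficient of variation
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }

# Hypothetical household sizes.
summary = describe([2, 3, 3, 4, 8])
for name, value in summary.items():
    print(name, value)
```

For a nominal variable only the mode is meaningful, and for an ordinal variable the mode and median; the remaining measures assume metric (interval or ratio) data.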
adjusting data
re-specifying variables (what does this mean?)
transforming scales: standardizing to z-scores
weighting cases / respondents (not used often; what does this mean?) to account for representativeness
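Standardizing to z-scores means subtracting the mean and dividing by the standard deviation, so the transformed variable has mean 0 and standard deviation 1. A minimal sketch with made-up scores (the function name z_scores is hypothetical, and the sample standard deviation is assumed):

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values: z = (x - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("all values are equal; z-scores are undefined")
    return [(v - m) / s for v in values]

# Hypothetical ratings.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
standardized = z_scores(scores)
print([round(z, 2) for z in standardized])
```

After the transformation, variables measured on different scales (say 1-5 and 1-10) can be compared directly.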
hypothesis testing
1. two-sided tests (equal or not equal)
Ho: the parameter (mean, proportion) of the variable is equal to a given value
H1: the parameter of the variable is different from that value
2. one-sided tests (greater than / less than)
Ho: ≥ or ≤
H1: < or >
The result can be read in two ways: from the test statistic or from the p-value. (The larger the test statistic in absolute value, the smaller the p-value and the weaker the support for Ho.) See figure.
So: test statistic > critical value → reject Ho
p-value < 0.05 → reject Ho
In SPSS the p-value is shown as "Sig."
p ≤ 0.05: Ho is rejected → the parameter is significantly different from xx.
0.05 < p ≤ 0.1: Ho is rejected, but marginally → the parameter is marginally significantly different from xx.
p > 0.1: Ho is not rejected → the parameter is not statistically different from xx.
test statistic
test statistic > critical value: Ho is rejected
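The decision rule and the link between test statistic and p-value can be sketched as below. The function names are hypothetical, and the second function assumes a test statistic that follows the standard normal distribution (a z statistic) in a two-sided test; other tests use other distributions.

```python
from statistics import NormalDist

def interpret_p_value(p):
    """Map a p-value (SPSS 'Sig.') to the three verdicts used in these notes."""
    if not 0 <= p <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p <= 0.05:
        return "significant"
    if p <= 0.10:
        return "marginally significant"
    return "not significant"

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard-normal test statistic (assumption)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# A larger |z| gives a smaller p-value.
for z in (1.0, 1.96, 3.0):
    p = two_sided_p_from_z(z)
    print(z, round(p, 4), interpret_p_value(p))
```

The value 1.96 is the two-sided 5% critical value of the standard normal distribution, so comparing |z| with 1.96 and comparing p with 0.05 lead to the same decision.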
diagram 'when to use which test?'
(figure)
How to use this table? Ask 3 questions:
1. what is the dependent variable?
2. what is the measurement level of the dependent variable?
3. what and how many samples does the hypothesis involve?
-one sample: compare the parameter of a given group with a given value
-independent samples: compare the parameters of two groups, e.g. men/women, branded/unbranded
-related samples: compare the responses of the same individuals with each other, i.e. the same sample answering different questions?
inferential analysis: one-sample tests. representativeness
Infer whether the sample is representative by comparing it with a given value
Ho: mean in the population where the sample came from = 2.28
First (required steps): DV = household size, DV measurement = ratio, sample = one sample
So (look it up in the table): use the one-sample t-test
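A sketch of the one-sample t-test computed by hand, t = (sample mean - 2.28) / (s / sqrt(n)). The household sizes are made up, the function name one_sample_t is hypothetical, and the critical value 2.262 is the two-sided 5% value for 9 degrees of freedom taken from a t table, so it only fits a sample of 10.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for Ho: mean = mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1

# Hypothetical household sizes from 10 respondents.
household_sizes = [1, 2, 2, 3, 4, 2, 1, 3, 2, 5]
t, df = one_sample_t(household_sizes, 2.28)

critical_value = 2.262  # t table: two-sided, 5%, df = 9 (assumed sample size)
reject_h0 = abs(t) > critical_value
print(round(t, 3), df, reject_h0)
```

With these made-up numbers Ho is not rejected, which would read as: the sample mean does not differ significantly from 2.28.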
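eg2: test whether the distribution of housing in the sample matches the official statistics
First: DV = sample household proportion, DV measurement = ordinal, sample = one sample
So use the one-sample Kolmogorov-Smirnov test (by hand or in Excel)
compute the absolute difference between the cumulative percentage in the total population and the observed cumulative % in the sample
test statistic = the largest of these differences → K = xx
critical value at 5% = 1.36 divided by the square root of the sample size = aa
K greater than aa → Ho is rejected (significantly different)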
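The by-hand Kolmogorov-Smirnov steps translate directly into code. The counts and population shares below are made up, and the function name ks_one_sample is hypothetical. Cumulative shares are kept as proportions (0 to 1), not percentages, so that K is on the same scale as 1.36 / sqrt(n).

```python
from itertools import accumulate
from math import sqrt

def ks_one_sample(observed_counts, population_shares):
    """One-sample K-S test for ordered categories, 5% level."""
    n = sum(observed_counts)
    # Observed cumulative proportions in the sample.
    sample_cum = [c / n for c in accumulate(observed_counts)]
    # Cumulative proportions in the total population.
    population_cum = list(accumulate(population_shares))
    # Test statistic: the largest absolute difference.
    k = max(abs(s - p) for s, p in zip(sample_cum, population_cum))
    critical_value = 1.36 / sqrt(n)
    return k, critical_value, k > critical_value

# Hypothetical: 100 respondents over 4 ordered housing categories.
counts = [30, 40, 20, 10]
shares = [0.25, 0.35, 0.25, 0.15]
k, critical_value, reject_h0 = ks_one_sample(counts, shares)
print(round(k, 3), round(critical_value, 3), reject_h0)
```

Here the largest gap is in the second category (0.70 against 0.60), and since 0.10 is below 0.136, Ho is not rejected for this made-up sample.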
Testing the proportion of a dichotomous variable (yes/no)
use the Z-test (by hand)
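A sketch of the by-hand Z-test for a proportion, z = (p - p0) / sqrt(p0 (1 - p0) / n), where p0 is the proportion stated in Ho. The numbers and the function name one_proportion_z_test are made up; the normal approximation is assumed to be acceptable, which needs a reasonably large sample.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided Z-test for Ho: population proportion = p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # computed under Ho
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical: 60 of 100 respondents answer "yes"; Ho says 50%.
z, p_value = one_proportion_z_test(60, 100, 0.5)
print(round(z, 2), round(p_value, 4))
```

With these numbers z is about 2 and the p-value is just under 0.05, so Ho would be rejected at the 5% level.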
differential analysis: two and more independent or related samples
How to use the table: see OneNote
associative analysis: correlations
relationships between variables
when there are 2 variables:
both are metric (interval/ratio) with a linear relationship: use the Pearson correlation coefficient
one or both are ordinal: use the Spearman rank correlation coefficient
r lies in [-1, 1]
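Both coefficients can be sketched in plain Python; Spearman is simply Pearson applied to the ranks. The function names and the data are made up, and ties get the average of their ranks, which is the usual convention.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y rises with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

The example shows why the choice matters: the relationship is perfectly monotonic, so Spearman is 1, while Pearson is slightly below 1 because the relationship is not linear.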
significant vs. substantive results
Significance depends on 1. the strength/magnitude of the difference or correlation, and 2. the sample size
Significance is the first step; relevance is a subjective judgment
A significant difference or correlation does not by itself imply a substantive or relevant one
magnitude of the difference = % change in the response of one group relative to the comparison group
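The magnitude formula as a small sketch, with made-up group means and a hypothetical function name:

```python
def percent_change(group_mean, comparison_mean):
    """Magnitude of a difference as % change relative to the comparison group."""
    if comparison_mean == 0:
        raise ValueError("comparison mean is zero; % change is undefined")
    return (group_mean - comparison_mean) / comparison_mean * 100

# Hypothetical mean ratings: branded 4.4, unbranded (comparison) 4.0.
magnitude = percent_change(4.4, 4.0)
print(round(magnitude, 1))
```

Whether a change of this size is relevant remains a judgment call; with a large enough sample even a very small % change can be statistically significant.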