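Foreword
The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, and 1502 of the 2224 people on board were lost. Titanic: Machine Learning from Disaster is Kaggle's well-known introductory data analysis project. Participants work through data preprocessing, feature engineering, modeling, prediction, and validation: a model is trained on the 891 rows of training data (passenger or crew information plus whether each person survived) and then used to predict survival for the other 418 records. Because the project closely mirrors a real-world data analysis workflow, it has been named one of the five best projects for practicing data analysis.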
Five data science projects to learn data science
This article analyzes the Machine Learning from Disaster dataset in roughly the following steps:
- Data cleaning
- Feature engineering
- Model design
- Prediction
Data preprocessing
Dataset source
- Training dataset: train.csv
- Prediction dataset: test.csv
https://www.kaggle.com/c/titanic
Data import and preview
# Create the project: Machine Learning from Disaster
# Load packages
library(dplyr)
library(stringr)
library(ggthemes)
library(ggplot2)
# Once the packages are loaded, import the data
test <- read.csv("./db/test.csv", header = TRUE, stringsAsFactors = FALSE)
train <- read.csv("./db/train.csv", header = TRUE, stringsAsFactors = FALSE)
# Take a first look at the data
# Inspect the structure and the first rows
str(train)
str(test)
head(train)
head(test)
The output shows that, apart from test lacking the Survived column, the two data frames have exactly the same columns.
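One way to confirm this without reading the two str() outputs side by side is to compare the column names directly. The sketch below uses hard-coded name vectors as stand-ins for names(train) and names(test); the test vector is an assumption based on the statement above that only Survived is missing.

```r
# Stand-ins for names(train) and names(test); the second vector is assumed
# from the text (same columns, minus Survived).
train_cols <- c("PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
                "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked")
test_cols  <- c("PassengerId", "Pclass", "Name", "Sex", "Age",
                "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked")

# Columns present in one data frame but not the other.
only_in_train <- setdiff(train_cols, test_cols)
only_in_test  <- setdiff(test_cols, train_cols)

print(only_in_train)
print(length(only_in_test))
```

With the real data, the same check would be setdiff(names(train), names(test)).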
> str(train)
'data.frame': 891 obs. of 12 variables:
$ PassengerId: int 1 2 3 4 5 6 7 8 9 10 ...
$ Survived : int 0 1 1 1 0 0 0 0 1 1 ...
$ Pclass : int 3 1 3 1 3 3 1 3 3 2 ...
$ Name : chr "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
$ Sex : chr "male" "female" "female" "female" ...
$ Age : num 22 38 26 35 35 NA 54 2 27 14 ...
$ SibSp : int 1 1 0 1 0 0 0 3 0 1 ...
$ Parch : int 0 0 0 0 0 0 0 1 2 0 ...
$ Ticket : chr "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
$ Fare : num 7.25 71.28 7.92 53.1 8.05 ...
$ Cabin : chr "" "C85" "" "C123" ...
$ Embarked : chr "S" "C" "S" "S" ...
> head(train)
PassengerId Survived Pclass Name Sex Age SibSp Parch
1 1 0 3 Braund, Mr. Owen Harris male 22 1 0
2 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Thayer) female 38 1 0
3 3 1 3 Heikkinen, Miss. Laina female 26 0 0
4 4 1 1 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35 1 0
5 5 0 3 Allen, Mr. William Henry male 35 0 0
6 6 0 3 Moran, Mr. James male NA 0 0
Ticket Fare Cabin Embarked
1 A/5 21171 7.2500 S
2 PC 17599 71.2833 C85 C
3 STON/O2. 3101282 7.9250 S
4 113803 53.1000 C123 S
5 373450 8.0500 S
6 330877 8.4583 Q
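Data preprocessing
# Add a Survived column to the test dataset, using "None" as a placeholder for the unknown outcome
test.survived <- data.frame(Survived = rep("None", nrow(test)), test)
# Combine the train and test datasets (rbind matches data frame columns by name)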
data.combined <- rbind(train,test.survived)
data.combined$Survived <- as.factor(data.combined$Survived)
data.combined$Pclass <- as.factor(data.combined$Pclass)
After merging, Survived has 418 unknown "None" values (the rows to be predicted), Age has 263 missing values, and Fare has 1 missing value.
We now have a first picture of the test and train datasets: the training set has 891 rows and the test set has 418. Our goal is to predict Survived, the dependent variable, and the remaining 11 columns are the candidate independent variables.
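The missing-value counts quoted above can be obtained with a small helper. This is a minimal sketch: count_missing is a hypothetical name, the toy data frame only stands in for data.combined, and treating empty strings as missing is an assumption that matches how Cabin and Embarked are stored in this dataset.

```r
# Count missing entries per column; for character columns an empty string
# is also treated as missing.
count_missing <- function(df) {
  sapply(df, function(col) {
    if (is.character(col)) sum(is.na(col) | col == "") else sum(is.na(col))
  })
}

# Toy stand-in for data.combined (values are illustrative only).
toy <- data.frame(
  Age   = c(22, NA, 26, NA),
  Fare  = c(7.25, 71.28, NA, 53.1),
  Cabin = c("", "C85", "", "C123"),
  stringsAsFactors = FALSE
)

missing_counts <- count_missing(toy)
print(missing_counts)
```

On the real data the call would be count_missing(data.combined).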
Feature engineering
Hypothesis: the higher the passenger class, the higher the survival rate
ggplot(train,aes(x = Pclass, y = ..count.., fill=factor(Survived))) +
geom_bar(stat = "count", position='stack') +
xlab('Pclass') +
ylab('Count') +
ggtitle('How Pclass impact survivor') +
scale_fill_discrete(name="Survived", breaks=c(0, 1), labels=c("Perish", "Survived")) +
geom_text(stat = "count", aes(label = ..count..), position=position_stack(vjust = 0.5)) +
theme(plot.title = element_text(hjust = 0.5), legend.position="bottom")
- The plot clearly shows that the higher the passenger class, the higher the survival rate: survival falls from 62.9% in first class to 24.2% in third class
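Percentages like these can be computed directly rather than read off the bar labels. The sketch below is one way to do it; survival_rate is a hypothetical helper and the toy rows are illustrative, not the real Titanic counts.

```r
# Survival rate per group: the share of rows with Survived == 1 in each level.
survival_rate <- function(survived, group) {
  tapply(survived == 1, group, mean)
}

# Toy data, illustrative only.
toy <- data.frame(
  Pclass   = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 3),
  Survived = c(1, 1, 1, 0, 1, 0, 0, 0, 0, 1)
)

rates <- survival_rate(toy$Survived, toy$Pclass)
print(rates)
```

On the training data the equivalent call would be survival_rate(train$Survived, train$Pclass).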
Hypothesis: the passenger name (Name) has feature potential
Name has one striking property: every name contains a form of address, or title. Extracting it gives a new variable that can be very useful for prediction.
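As a quick sanity check of the idea, the extraction can be tried on the four names visible in the str(train) output. The sketch assumes every name follows the "Surname, Title. Given names" layout; the regular expression removes everything up to ", " and everything from the first "." onwards.

```r
# Names copied from the str(train) output shown earlier.
sample_names <- c(
  "Braund, Mr. Owen Harris",
  "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
  "Heikkinen, Miss. Laina",
  "Futrelle, Mrs. Jacques Heath (Lily May Peel)"
)

# Strip the surname prefix ("... , ") and the suffix starting at the first ".".
titles <- gsub("(.*, )|(\\..*)", "", sample_names)
print(titles)
```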
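# Extract the title from each passenger name
data.combined$Title <- gsub('(.*, )|(\\..*)', '', data.combined$Name)
# Store Title as a factor so that train and test rows share one set of levels
data.combined$Title <- as.factor(data.combined$Title)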
table(data.combined$Title)
Capt Col Don Dona Dr Jonkheer Lady Major
1 4 1 1 8 1 1 2
Master Miss Mlle Mme Mr Mrs Ms Rev
61 260 2 1 757 197 2 8
Sir the Countess
1 1
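- Among the titles listed above, Miss, Mlle, Mme, Mrs, Mr, Ms, Lady, Major, Capt, Col, and Sir carry an obvious gender hint, while the gender behind Rev, Master, Jonkheer, Don, Dona, and Dr cannot be told from the title alone, so each is checked against Sex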
data.combined[which(data.combined$Title %in% "Master"), "Sex"]
[1] "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male"
[15] "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male"
[29] "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male"
[43] "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male" "male"
[57] "male" "male" "male" "male" "male"
> data.combined[which(data.combined$Title %in% "Rev"), "Sex"]
[1] "male" "male" "male" "male" "male" "male" "male" "male"
> data.combined[which(data.combined$Title %in% "Jonkheer"), "Sex"]
[1] "male"
> data.combined[which(data.combined$Title %in% "Don"), "Sex"]
[1] "male"
> data.combined[which(data.combined$Title %in% "Dona"), "Sex"]
[1] "female"
> data.combined[which(data.combined$Title %in% "Dr"), "Sex"]
[1] "male" "male" "male" "male" "male" "male" "female" "male"
- Title is strongly tied to gender: apart from Dr, every title checked belongs to a single sex. In other words, Title duplicates information in Sex and could potentially replace it
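The per-title lookups above can be condensed into a single cross-tabulation that flags every title occurring with more than one sex. This is a sketch on toy rows; the Title and Sex pairs are illustrative only and the variable names are hypothetical.

```r
# Toy rows, illustrative only.
toy <- data.frame(
  Title = c("Mr", "Mrs", "Miss", "Master", "Dr", "Dr", "Mr", "Miss"),
  Sex   = c("male", "female", "female", "male", "male", "female", "male", "female"),
  stringsAsFactors = FALSE
)

# Rows are titles, columns are sexes; a title is "mixed" if it has a
# non-zero count in more than one column.
title_by_sex <- table(toy$Title, toy$Sex)
mixed_titles <- rownames(title_by_sex)[rowSums(title_by_sex > 0) > 1]

print(title_by_sex)
print(mixed_titles)
```

On the real data the table would be table(data.combined$Title, data.combined$Sex).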
Effect of Sex
ggplot(data.combined[1:891,],aes(x = Sex, y = ..count.., fill=factor(Survived))) +
geom_bar(stat = "count", position='stack') +
facet_wrap(~Pclass) +
xlab('Sex') +
ylab('Count') +
ggtitle('How Sex impact survivor') +
scale_fill_discrete(name="Survived", breaks=c(0, 1), labels=c("Perish", "Survived")) +
geom_text(stat = "count", aes(label = ..count..), position=position_stack(vjust = 0.5)) +
theme(plot.title = element_text(hjust = 0.5), legend.position="bottom")
- The plot shows the same pattern in every passenger class: women have a higher survival rate
Effect of Age
> summary(data.combined[1:891,"Age"])
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
0.42 20.12 28.00 29.70 38.00 80.00 177
ggplot(data.combined[which(!is.na(data.combined[1:891,"Age"])),], aes(x = Age, fill=factor(Survived))) + facet_wrap(~Sex + Pclass) +
geom_histogram(binwidth = 10) +
xlab("Age") +
ylab("Total Count")
> summary(data.combined[which(data.combined$Title %in% "Master"), "Age"])
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
0.330 2.000 4.000 5.483 9.000 14.500 8
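- Age has 177 missing values, nearly 20% of the train dataset. After dropping them, no clear pattern emerges. A side finding is the age distribution for Master (maximum 14.5), which suggests that the title denotes underage males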
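Both observations rest on simple summaries: the share of missing ages and the age distribution within each title. The sketch below reproduces the arithmetic for the quoted figures and shows a grouped median on toy rows; the toy values are illustrative only.

```r
# Share of missing ages in train, using the figures quoted in the text.
train_age_na_share <- 177 / 891
print(round(train_age_na_share, 3))

# Toy rows, illustrative only: median age per title, ignoring missing ages.
toy <- data.frame(
  Title = c("Master", "Master", "Mr", "Mr", "Miss"),
  Age   = c(4, 9, 30, NA, 18),
  stringsAsFactors = FALSE
)
median_age_by_title <- tapply(toy$Age, toy$Title, median, na.rm = TRUE)
print(median_age_by_title)
```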
Effect of family composition
Effect of SibSp (number of siblings and spouses aboard)
data.combined$SibSp <- as.factor(data.combined$SibSp)
ggplot(data.combined[1:891,],aes(x = SibSp, y = ..count.., fill=factor(Survived))) +
geom_bar(stat = "count", position='stack') +
facet_wrap(~Pclass+Title) +
xlab('SibSp') +
ylab('Count') +
ggtitle('How Sibsp impact survivor') +
scale_fill_discrete(name="Survived", breaks=c(0, 1), labels=c("Perish", "Survived")) +
geom_text(stat = "count", aes(label = ..count..), position=position_stack(vjust = 0.5)) +
theme(plot.title = element_text(hjust = 0.5), legend.position="bottom")
Effect of Parch (number of parents or children aboard)
data.combined$Parch <- as.factor(data.combined$Parch)
ggplot(data.combined[1:891,],aes(x = Parch, y = ..count.., fill=factor(Survived))) +
geom_bar(stat = "count", position='stack') +
facet_wrap(~Pclass+Title) +
xlab('Parch') +
ylab('Count') +
ggtitle('How Parch impact survivor') +
scale_fill_discrete(name="Survived", breaks=c(0, 1), labels=c("Perish", "Survived")) +
geom_text(stat = "count", aes(label = ..count..), position=position_stack(vjust = 0.5)) +
theme(plot.title = element_text(hjust = 0.5), legend.position="bottom")
Effect of total family size (family.size)
Temp.SibSp <- c(train$SibSp, test$SibSp)
Temp.Parch <- c(train$Parch, test$Parch)
data.combined$family.size <- as.factor(Temp.SibSp + Temp.Parch + 1)
ggplot(data.combined[1:891,],aes(x = family.size, y = ..count.., fill=factor(Survived))) +
geom_bar(stat = "count", position='stack') +
facet_wrap(~Pclass+Title) +
xlab('family.size') +
ylab('Count') +
ggtitle('How family.size impact survivor') +
scale_fill_discrete(name="Survived", breaks=c(0, 1), labels=c("Perish", "Survived")) +
geom_text(stat = "count", aes(label = ..count..), position=position_stack(vjust = 0.5)) +
theme(plot.title = element_text(hjust = 0.5), legend.position="bottom")
- Overall, the family-related columns SibSp, Parch, and family.size are weak features; passengers travelling with family members have a better chance of survival
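The family.size code reads SibSp and Parch from train and test rather than from data.combined, presumably because those columns in data.combined had already been converted to factors, and arithmetic on factors does not work. The sketch below shows the pitfall and one way to recover the numeric values from a factor; the vectors are toy values.

```r
# Toy values, illustrative only.
sibsp <- c(1, 0, 3)
parch <- c(0, 0, 1)

# Family size counts the passenger plus siblings/spouses plus parents/children.
family_size <- sibsp + parch + 1

# Pitfall: as.integer() on a factor returns the level positions, not the values.
sibsp_factor <- as.factor(sibsp)                        # levels "0", "1", "3"
codes  <- as.integer(sibsp_factor)                      # level positions
values <- as.integer(as.character(sibsp_factor))        # original values

print(family_size)
print(codes)
print(values)
```

Going through as.character() first is what makes the conversion safe.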
Effect of Ticket
# Ticket is character data
> data.combined$Ticket[1:20]
[1] "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" "373450"
[6] "330877" "17463" "349909" "347742" "237736"
[11] "PP 9549" "113783" "A/5. 2151" "347082" "350406"
[16] "248706" "382652" "244373" "345763" "2649"
- The values are messy, with no obvious pattern
# Extract the first character of Ticket as a factor and tabulate it
Ticket.first.char <- ifelse(data.combined$Ticket == "", " ", substr(data.combined$Ticket, 1, 1))
> unique(Ticket.first.char)
[1] "A" "P" "S" "1" "3" "2" "C" "7" "W" "4" "F" "L" "9" "6" "5" "8"
data.combined$Ticket.first.char <- as.factor(Ticket.first.char)
# Survival counts by ticket first character
ggplot(data.combined[1:891,], aes(x = Ticket.first.char, fill=factor(Survived))) +
geom_bar() +
ggtitle("Survivability by ticket.first.char") +
xlab("ticket.first.char") +
ylab("Total Count") +
ylim(0,350) +
labs(fill = "Survived")
# Survival counts by ticket first character within each passenger class
ggplot(data.combined[1:891,], aes(x = Ticket.first.char, fill=factor(Survived))) +
geom_bar() +
facet_wrap(~Pclass) +
ggtitle("Pclass") +
xlab("Ticket.first.char") +
ylab("Total Count") +
ylim(0,300) +
labs(fill = "Survived")
- Overall, Ticket is a weak feature and shows no clear pattern
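For a quick look without plotting, the first-character step can be tabulated directly. The sketch below applies the same substr() logic to the first ten ticket values printed earlier.

```r
# First ten values from data.combined$Ticket[1:20] shown above.
tickets <- c("A/5 21171", "PC 17599", "STON/O2. 3101282", "113803", "373450",
             "330877", "17463", "349909", "347742", "237736")

# Same rule as in the text: a blank ticket maps to " ", otherwise take character 1.
first_char <- ifelse(tickets == "", " ", substr(tickets, 1, 1))
first_char_counts <- table(first_char)
print(first_char_counts)
```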
Effect of Fare
# Distribution of survival across fares
ggplot(data.combined[which(!is.na(data.combined[1:891,"Fare"])), ], aes(x = Fare,fill = Survived)) +
geom_histogram(binwidth = 5,position="identity") +
ggtitle("Combined Fare Distribution") +
xlab("Fare") +
ylab("Total Count") +
ylim(0,100)
# Distribution of survival across fares, split by passenger class and Title
ggplot(data.combined[which(!is.na(data.combined[1:891,"Fare"])), ], aes(x = Fare, fill = Survived)) +
geom_histogram(binwidth = 5,position="identity") +
facet_wrap(~Pclass + Title) +
ggtitle("Pclass, Title") +
xlab("fare") +
ylab("Total Count") +
ylim(0,50) +
labs(fill = "Survived")
- No pattern is apparent, so Fare is not used as a feature for now
Effect of Cabin
str(data.combined$Cabin)
chr [1:1309] "" "C85" "" "C123" "" "" "E46" "" "" "" "G6" "C103" "" "" "" "" "" "" "" "" "" "D56" "" ...
# Cabin is a character column
# Looking at the Cabin values, many are missing and the rest are fairly scattered
> head(data.combined$Cabin,20)
[1] "" "C85" "" "C123" "" "" "E46" "" "" "" "G6" "C103" "" ""
[15] "" "" "" "" "" ""
# Fill in the missing values with "U" (unknown)
data.combined[which(data.combined$Cabin == ""), "Cabin"] <- "U"
data.combined$Cabin[1:20]
[1] "U" "C85" "U" "C123" "U" "U" "E46" "U" "U" "U" "G6" "C103" "U" "U"
[15] "U" "U" "U" "U" "U" "U"
# Convert the first character to a factor to look for categories
cabin.first.char <- as.factor(substr(data.combined$Cabin, 1, 1))
str(cabin.first.char)
levels(cabin.first.char)
[1] "A" "B" "C" "D" "E" "F" "G" "T" "U"
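# Attach the first character to data.combined so the plot can subset it to the 891 training rows
data.combined$cabin.first.char <- cabin.first.char
ggplot(data.combined[1:891,],aes(x = cabin.first.char, y = ..count.., fill=factor(Survived))) +
geom_bar(stat = "count", position='stack') +
facet_wrap(~Pclass) +
xlab('cabin.first.char') +
ylab('Count') +
ggtitle('How Cabin impact survivor') +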
scale_fill_discrete(name="Survived", breaks=c(0, 1), labels=c("Perish", "Survived")) +
geom_text(stat = "count", aes(label = ..count..), position=position_stack(vjust = 0.5)) +
theme(plot.title = element_text(hjust = 0.5), legend.position="bottom")
- With so many missing values and no clear pattern, Cabin is provisionally judged unsuitable as a feature
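The scale of the problem is easy to quantify once the blanks are recoded. The sketch below repeats the fill-and-extract steps on the first seven Cabin values printed above and reports how many fall into the unknown group.

```r
# First seven values from head(data.combined$Cabin, 20) shown above.
cabins <- c("", "C85", "", "C123", "", "", "E46")

# Recode blanks as "U" (unknown), then keep the leading letter.
cabins[cabins == ""] <- "U"
cabin_letter <- factor(substr(cabins, 1, 1))

letter_counts <- table(cabin_letter)
unknown_share <- mean(cabin_letter == "U")

print(letter_counts)
print(unknown_share)
```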
Effect of Embarked (port of embarkation)
# Embarked takes three values, C = Cherbourg, Q = Queenstown, S = Southampton, so it suits being treated as a factor
str(data.combined$Embarked)
levels(as.factor(data.combined$Embarked))
[1] "" "C" "Q" "S"
# The train dataset has 2 missing values, negligible relative to the total
table(data.combined[1:891,"Embarked"])
C Q S
2 168 77 644
ggplot(data.combined[1:891,],aes(x = Embarked, y = ..count.., fill=factor(Survived))) +
geom_bar(stat = "count", position='stack') +
facet_wrap(~Pclass) +
xlab('Embarked') +
ylab('Count') +
ggtitle('How Embarked impact survivor') +
scale_fill_discrete(name="Survived", breaks=c(0, 1), labels=c("Perish", "Survived")) +
geom_text(stat = "count", aes(label = ..count..), position=position_stack(vjust = 0.5)) +
theme(plot.title = element_text(hjust = 0.5), legend.position="bottom")
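- No clear pattern at first glance, so Embarked is judged to have no feature value
After checking the effect of passenger class, name, sex, age, family size, ticket number, fare, cabin number, and port of embarkation, we conclude that passenger class, the Title within the name, sex, and family size have clear predictive value. The other variables show no clear pattern and are dropped to avoid overfitting. Note also that Title already encodes gender, so using both Title and Sex as independent variables may also lead to overfitting and calls for caution.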
Model design
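Next we build a model to predict the survival of the Titanic's passengers. Here we use the random forest classification algorithm; all of the preceding work was done in service of this step.
# Load the randomForest package
library(randomForest)
# Despite its name, test.subset holds the 891 labelled training rows
test.subset <- data.combined[1:891,]
test.subset$Title <- as.factor(test.subset$Title)
# Use Pclass and Title as the two independent variables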
set.seed(1234)
forest_Pclass_Title <- randomForest(factor(Survived)~Pclass+Title,
data=test.subset,
importance=TRUE,
ntree=1000)
varImpPlot(forest_Pclass_Title)
# Error rate summary
> forest_Pclass_Title
Call:
randomForest(formula = factor(Survived) ~ Pclass + Title, data = test.subset, importance = TRUE, ntree = 1000)
Type of random forest: classification
Number of trees: 1000
No. of variables tried at each split: 1
OOB estimate of error rate: 20.76%
Confusion matrix:
0 1 class.error
0 533 16 0.0291439
1 169 173 0.4941520
# Use Pclass, Title, and family.size as the three independent variables
set.seed(1234)
forest_Pclass_Title_family.size <- randomForest(factor(Survived)~Pclass+Title+family.size,
data=test.subset,
importance=TRUE,
ntree=1000)
varImpPlot(forest_Pclass_Title_family.size)
# With Pclass, Title, and family.size the OOB error rate is about 3.25 percentage points lower than with Pclass and Title alone (17.51% versus 20.76%)
> forest_Pclass_Title_family.size
Call:
randomForest(formula = factor(Survived) ~ Pclass + Title + family.size, data = test.subset, importance = TRUE, ntree = 1000)
Type of random forest: classification
Number of trees: 1000
No. of variables tried at each split: 1
OOB estimate of error rate: 17.51%
Confusion matrix:
0 1 class.error
0 485 64 0.1165756
1 92 250 0.2690058
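The comparison above shows that the best set of independent variables is Pclass, Title, and family.size.
During the experiments we also added back the variables judged above to have no feature value; doing so increased the overall error rate, so the details are omitted here.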
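The OOB error rates and per-class errors reported by randomForest follow directly from the two confusion matrices printed above. The sketch below redoes that arithmetic; oob_error is a hypothetical helper name, and the matrices are typed in from the output with rows as actual classes and columns as predicted classes.

```r
# Overall error: misclassified observations (off-diagonal) over all observations.
oob_error <- function(cm) (sum(cm) - sum(diag(cm))) / sum(cm)

labels <- list(actual = c("0", "1"), predicted = c("0", "1"))
cm_two   <- matrix(c(533, 16, 169, 173), nrow = 2, byrow = TRUE, dimnames = labels)
cm_three <- matrix(c(485, 64,  92, 250), nrow = 2, byrow = TRUE, dimnames = labels)

err_two   <- oob_error(cm_two)     # Pclass + Title
err_three <- oob_error(cm_three)   # Pclass + Title + family.size

# Per-class error: share of each actual class that was misclassified.
class_error_two <- 1 - diag(cm_two) / rowSums(cm_two)

print(round(100 * c(err_two, err_three, err_two - err_three), 2))
print(class_error_two)
```

The class errors make the trade-off visible: the two-variable model rarely misclassifies class 0 but misses almost half of the survivors, while the three-variable model balances the two classes better.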
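- MeanDecreaseAccuracy measures how much the random forest's prediction accuracy drops when a variable's values are randomly permuted. The larger the value, the more important the variable
- MeanDecreaseGini uses the Gini index to measure each variable's effect on the heterogeneity of the observations at each node of the classification trees, which lets the variables be compared. The larger the value, the more important the variable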
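The permutation idea behind MeanDecreaseAccuracy can be illustrated in a few lines of base R. This is a simplified sketch, not what randomForest does internally (the package permutes values in the out-of-bag samples of each tree and averages the results): the "model" here is a hypothetical fixed rule, and the data are synthetic.

```r
set.seed(42)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- as.integer(toy$x1 > 0)   # the outcome depends on x1 only

# Stand-in "model": a fixed rule that uses x1 and ignores x2.
predict_rule <- function(df) as.integer(df$x1 > 0)
accuracy <- function(df) mean(predict_rule(df) == df$y)

# Drop in accuracy after shuffling one column, which breaks its link to y.
permutation_drop <- function(df, column) {
  shuffled <- df
  shuffled[[column]] <- sample(shuffled[[column]])
  accuracy(df) - accuracy(shuffled)
}

drop_x1 <- permutation_drop(toy, "x1")
drop_x2 <- permutation_drop(toy, "x2")
print(c(x1 = drop_x1, x2 = drop_x2))
```

Shuffling x2 leaves accuracy unchanged because the rule never looks at it, while shuffling x1 lowers it; that gap is what the importance score summarises.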
Prediction
With the model and the independent variables settled, the last step is prediction: the model built above can be applied directly to the test set.
validate_subset <- data.combined[892:1309,]
# Predict on the test set
prediction <- predict(forest_Pclass_Title_family.size, validate_subset)
# Save the result as a data frame in the format Kaggle expects for submissions (columns PassengerId and Survived)
solution <- data.frame(PassengerId = validate_subset$PassengerId, Survived = prediction)
# Write the result to a file
write.csv(solution, file = 'rf_mod_Solution1.csv', row.names = FALSE)
Once you have the file, you can upload it to Kaggle and see where you rank!
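Before uploading, it can be worth reading the file back and checking its shape. The sketch below writes a toy submission to a temporary file and verifies it; the ids and predictions are illustrative only, and the expected header (PassengerId, Survived) is stated here as an assumption about the competition's submission format.

```r
# Toy predictions, illustrative only.
solution <- data.frame(
  PassengerId = c(892L, 893L, 894L),
  Survived    = factor(c(0, 1, 0), levels = c(0, 1))
)

# Write to a temporary file so nothing in the working directory is touched.
path <- tempfile(fileext = ".csv")
write.csv(solution, file = path, row.names = FALSE)

# Read it back and inspect the header, row count, and value range.
check <- read.csv(path)
header_ok <- identical(names(check), c("PassengerId", "Survived"))
values_ok <- all(check$Survived %in% c(0, 1))
print(c(header_ok = header_ok, values_ok = values_ok, rows = nrow(check)))
```

For the real submission the row count should equal the 418 test records.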
Competition page: Titanic: Machine Learning from Disaster
Here is the ranking from this experiment:
- The submission ranked in the top 26%, which is not ideal and leaves plenty of room for improvement
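Summary
This article follows the steps of the tutorial "Introduction to Data Science with R". The work done so far is only a first pass; the following improvements are planned:
- Handling missing values in each independent variable
- Cross-validation
- Building predictive models with other algorithms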