IIMT2641 Introduction to Business Analytics
Fall 2019
Assignment 4
Due November 7

In this problem, we will practice building CART models with a continuous outcome, using the dataset StateData.csv, which has data from the 1970s on all fifty US states. A description of the variables in the dataset is given in Table 1.

Variable          Description
Population        Population estimate of the state in 1975.
Income            Per capita income in the state in 1974.
Illiteracy        Illiteracy rates in 1970, as a percentage of the state’s population.
LifeExp           The life expectancy in years of residents of the state in 1970.
Murder            The murder and non-negligent manslaughter rate per 100,000 population in 1976.
HighSchoolGrad    The high-school graduation rate in the state in 1970.
Frost             The mean number of days with minimum temperature below freezing from 1931 to 1960 in the capital or a large city of the state.
Area              The land area (in square miles) of the state.
Longitude         The longitude of the center of the state.
Latitude          The latitude of the center of the state.
Region            The region (Northeast, South, North Central, or West) that the state belongs to.

Table 1: Variables in the dataset StateData.csv.

(a) Let us start by building a linear regression model. Randomly split the dataset into a training set (70%) and a test set (30%).

    (i) First, build a linear regression model to predict LifeExp using the following seven variables as the independent variables: Population, Murder, Frost, Income, Illiteracy, Area, and HighSchoolGrad. Use the training dataset to build the model. What is the R² of the model on the test set?

    (ii) Now, build a linear regression model to predict LifeExp using the following four variables as the independent variables: Population, Murder, Frost, and HighSchoolGrad. Again, use the training dataset to build the model. What is the R² of the model on the test set?

    (iii) Compare these two models. What are we achieving by removing independent variables? What is the equivalent procedure in a CART model?

(b) Now, build a CART model to predict LifeExp using the following seven variables as the independent variables: Population, Murder, Frost, Income, Illiteracy, Area, and HighSchoolGrad. Set the parameter minbucket to 5. Make sure that you are building a regression tree, and not a classification tree, by setting the argument method to “anova” instead of “class”.

    (i) Plot the tree. Which of the independent variables appear in the tree? Do you find the linear regression model or the CART model easier to interpret?

    (ii) Compute the predicted life expectancies for the test dataset using the CART model, and calculate the R² of the predictions.

(c) Now, build a random forest model to predict LifeExp using the same seven variables as the independent variables. Set the parameter nodesize to 5. Compute the predicted life expectancies for the test dataset using the random forest model, and calculate the R² of the predictions.

(d) Which of the four models you built do you think is the best model if out-of-sample accuracy is the most important? How about if interpretability is the most important?
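
The workflow in parts (a) to (c) can be sketched in R roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution. It assumes the rpart and randomForest packages are installed. Because StateData.csv is not available here, it uses R's built-in state.x77 data as a stand-in, with two columns renamed to match Table 1. It also assumes that test-set R² means 1 - SSE/SST, with the training-set mean of LifeExp as the baseline prediction. The seed, the helper test_r2, and all object names are hypothetical choices made for the example.

```r
# Sketch only: object names, the seed, and the R^2 convention are assumptions.
library(rpart)          # regression trees (CART)
library(randomForest)   # random forests

# Stand-in data: R's built-in state.x77 matrix, renamed to match Table 1.
# With the assignment file, use instead: states <- read.csv("StateData.csv")
states <- as.data.frame(state.x77)
names(states)[names(states) == "Life Exp"] <- "LifeExp"
names(states)[names(states) == "HS Grad"]  <- "HighSchoolGrad"

# Random 70/30 split; the seed is arbitrary and only makes the split repeatable.
set.seed(1)
train_rows <- sample(nrow(states), size = round(0.7 * nrow(states)))
train <- states[train_rows, ]
test  <- states[-train_rows, ]

# Out-of-sample R^2 = 1 - SSE/SST, using a fixed baseline prediction for SST.
test_r2 <- function(predicted, actual, baseline) {
  sse <- sum((actual - predicted)^2)
  sst <- sum((actual - baseline)^2)
  1 - sse / sst
}

full_formula  <- LifeExp ~ Population + Murder + Frost + Income +
  Illiteracy + Area + HighSchoolGrad
small_formula <- LifeExp ~ Population + Murder + Frost + HighSchoolGrad

# Fit all four models on the training set only.
lm_full  <- lm(full_formula, data = train)
lm_small <- lm(small_formula, data = train)
cart     <- rpart(full_formula, data = train, method = "anova", minbucket = 5)
forest   <- randomForest(full_formula, data = train, nodesize = 5)

# Score each model on the held-out test set.
models <- list(lm_full = lm_full, lm_small = lm_small,
               cart = cart, forest = forest)
r2 <- sapply(models, function(m) {
  test_r2(predict(m, newdata = test), test$LifeExp, mean(train$LifeExp))
})
print(round(r2, 3))
```

With only 50 states, the test set has 15 rows, so the R² values can change noticeably from one random split to another. Comparing the models across several seeds gives a steadier picture than a single split.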