Explanation: Stat5401, R, data, Matlab | Java

Stat5401 Midterm Exam II - Due April 8th, 2020

Exam Instructions:
  • There are 3 questions (25 points in total). Please check that you have answered all of them.
  • Please attach all R code in your solution. You can use an R notebook to organize your results.
  • Please organize your answers in a single PDF file and submit it through Canvas.
  • The exam is due at 11:59 pm on April 8, 2020 (CDT). Please try to submit your work at least a few minutes before the deadline to avoid delays due to technical issues.
  • This is an open-book exam. You are allowed to use your notes, book, R help files, academic papers, online tutorials, etc.
  • You are NOT allowed to discuss the exam with anyone else.
  • General questions about the exam should be asked on the Canvas discussion board ‘Midterm 2 related questions and clarifications’. Please ask questions as early as possible. I may not be able to answer last-minute questions.
  • Other questions should be directed to me at lixx1766@umn.edu. If you email me, please use your university email address, i.e., one ending with @umn.edu. You can also send email directly through Canvas.
  • All answers must be written in your own words.
  • Updates are colored in red.

Question 1 (5 points)

Consider the following underlying linear regression model
$$y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i,$$
with the standard assumptions that $E(\varepsilon_i) = 0$, $\mathrm{Var}(\varepsilon_i) = \sigma^2$, and $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$. Note that we don't include the intercept in this question.

Suppose that you observe [data table not preserved in the source].

Answer the following questions using math, i.e., by hand:

(a) (1 point) Suppose that we fit the model
$$y_i = \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i.$$
Write down the design matrix $X$ with the given $X_1$ and $X_2$.

(b) (2 points) Fit the model $y_i = \beta_1 X_{i1} + \varepsilon_i$, and let $\hat{\beta}_1$ be the least squares estimator for $\beta_1$.
  • Derive the mean and variance of $\hat{\beta}_1$.
  • Is $\hat{\beta}_1$ an unbiased estimator for $\beta_1$?

(c) (2 points) Suppose now that the true underlying model is
$$y_i = \beta_1 X_{i1} + \beta_2 X_{i3} + \varepsilon_i,$$
and you observe [data table not preserved in the source]. Suppose that you fit the model
$$y_i = \beta_1 X_{i1} + \varepsilon_i$$
to estimate $\beta_1$.
  • Derive the mean and variance of the least squares estimator for $\beta_1$.
  • Is it an unbiased estimator?

Question 2 (Total 8 points)

Download the data Q2.csv from Canvas. This is a simulated dataset motivated by an example in the book Machine Learning with R by Brett Lantz.

The response variable in the dataset is
  • charges = medical cost (in dollars) billed by the insurance company

The 6 covariates are
  • gender = Gender of the primary beneficiary, ‘f’ if female, ‘m’ if male.
  • age = Age of the primary beneficiary.
  • bmi = Body mass index.
  • smoker = Smoker or not, ‘yes’ for smoker, ‘no’ for non-smoker.
  • children = Number of dependents, treated as a continuous/numeric covariate in this problem.
  • region = Residential area in the US.

(a) Build a linear model by regressing charges on all 6 covariates. Answer the following questions.
(i) Which effects are significant at $\alpha = 0.05$, and what is the direction of the effects? Is there a relationship between age and charges?
(ii) Find a 95% confidence interval for the linear coefficient for bmi.
(iii) What are the $R^2$ and adjusted $R^2$?

(b) Build a reduced model by regressing charges on age, bmi and smoker. Compare this model with the full model fitted in part (a) using an F test. According to the F test, does the model in part (a) fit the data significantly better than that in part (b)?

Question 3 (Total 12 points)

In class, we mentioned that there are many variable selection methods available. In this example, we study additional performance metrics, and use simulation to verify their effectiveness. We will also study and try stepwise variable selection for multiple linear regression.

(a) We have learned the adjusted $R^2$ as a metric for the model fit.
In this question, we compare different models using the adjusted $R^2$.

Suppose the predictor is generated by
    set.seed(2020)
    n=200
    x=rnorm(n)
Remark: To make our results comparable, please use set.seed(2020) when generating x.

(i) Let the response be generated by
    eps=rnorm(n)
    y=x+x^2+x^3+eps
What is the underlying model? How many covariates are there in the underlying model? Please specify the covariates and the true linear coefficients.

(ii) Fit 6 different models: $y_i = \beta_0 + \sum_{j=1}^{p} \beta_j X_{ji} + \varepsilon_i$ for $p = 1, 2, 3, 4, 5, 6$. These models are polynomials of different orders. Calculate the adjusted $R^2$ for each of them, and draw a plot showing the adjusted $R^2$ (x-axis: $p$, y-axis: adjusted $R^2$). Does the correct model have the largest adjusted $R^2$?
(Hint: You can first create a data matrix
    X = cbind(x,x^2,x^3,x^4,x^5,x^6)
and then use a for loop to run the 6 regression models, in order to simplify the code. Also, try summary(model)$adj.r.squared to extract the adjusted $R^2$ for a fitted model.)

(iii) Instead of using the adjusted $R^2$, there are other performance criteria. Here, we consider the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Read the document at the link https://daviddalpiaz.github.io/appliedstats/variable-selection-and-model-building.html. Alternatively, you can also read pages 385-386 of the textbook on ‘selecting predictor variables from a large set’ and page 705 for the definition of AIC and BIC.
For this question, write down the definition of AIC and BIC in terms of the Residual Sum of Squares (RSS), $n$ and $p$.

(iv) In R, AIC and BIC can be computed using the functions AIC and BIC, respectively. Replace the adjusted $R^2$ by AIC and BIC in part (ii) and plot the results. Does the correct model have the smallest AIC and BIC?

(v) Repeat the simulation in (i), (ii) and (iv) 100 times. You will need to keep the same x while generating a new random eps each time. Each time, use the adjusted $R^2$ (the largest one), AIC and BIC (the smallest one) to select the model.
Record the model selected in each simulation (i.e., record the selected $p$). Report the frequency with which the adjusted $R^2$, AIC and BIC correctly select the true model among the 100 simulations. For this problem, which metric selects the model best? For the other metrics, do they tend to select more covariates or fewer covariates than the correct model?

(b) For multiple linear regression, stepwise selection (including forward search, backward search, and both directions) is usually used for model selection. Read Section 16.2 of the document https://daviddalpiaz.github.io/appliedstats/variable-selection-and-model-building.html and answer the following questions.
(i) Use about two or three sentences to describe stepwise variable selection methods.
(ii) Download the data Q3.csv and regress Y on X1 - X20. (Hint: try lm(Y~.,data=Q3)). Use the function step to select variables (use the default arguments, without changing arguments such as k or direction). Report the selected variables.

Reposted from: http://www.6daixie.com/contents/18/5068.html
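The hint in Question 3(a)(ii) and the AIC/BIC comparison in (iv) can be sketched in R roughly as follows. This is a minimal illustration, not a reference solution: the variable names `orders`, `Xp`, `results`, `selected` and `aic_by_hand` are introduced here for illustration only, and drawing `eps` immediately after `x` under the same seed is an assumption about the intended order of the random draws. The `aic_by_hand` line reproduces what R's `AIC` returns for a Gaussian linear model (it counts the error variance as a parameter); the textbook definition in terms of RSS, n and p may differ from it by an additive constant.

```r
# Sketch for Question 3(a): adjusted R^2, AIC and BIC for polynomial fits.
set.seed(2020)
n <- 200
x <- rnorm(n)
eps <- rnorm(n)              # assumption: eps is drawn right after x
y <- x + x^2 + x^3 + eps

# Data matrix from the hint: column j holds x^j.
X <- cbind(x, x^2, x^3, x^4, x^5, x^6)

orders <- 1:6
adj_r2 <- numeric(length(orders))
aic <- numeric(length(orders))
bic <- numeric(length(orders))
aic_by_hand <- numeric(length(orders))

for (p in orders) {
  # Polynomial of order p: intercept plus the first p columns of X.
  Xp <- X[, 1:p, drop = FALSE]
  fit <- lm(y ~ Xp)

  adj_r2[p] <- summary(fit)$adj.r.squared
  aic[p] <- AIC(fit)
  bic[p] <- BIC(fit)

  # R's AIC for lm: -2 * logLik + 2 * k, where k counts the intercept,
  # the p slopes and the error variance.
  rss <- sum(residuals(fit)^2)
  k <- p + 2
  aic_by_hand[p] <- n * (log(2 * pi) + log(rss / n) + 1) + 2 * k
}

results <- data.frame(p = orders, adj_r2 = adj_r2, aic = aic, bic = bic)
print(results)

# Order picked by each criterion (largest adjusted R^2, smallest AIC/BIC).
selected <- c(adj_r2 = which.max(adj_r2),
              aic = which.min(aic),
              bic = which.min(bic))
print(selected)
```

A call such as `plot(orders, adj_r2, type = "b", xlab = "p", ylab = "adjusted R^2")` would draw the plot requested in (ii). For part (v), one option is to wrap everything from the `eps` draw onward in a function and call it 100 times while keeping `x` fixed.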
