Lab 6: Parameter EstimationEpid 633Scenario & SetupA flu-like illness has been spreading through Stardew Valley County. You have been asked to develop a predictive model for the disease, which the Stardew Valley health authorities will use to estimate several important parameters that can help in designing an intervention:the infectious period of the diseasethe reporting fraction (i.e. what fraction of cases are reported)the transmission rate and/or basic reproduction number (R0)In addition, they are hoping that your model will be able to successfully forecast the number of cases there will be in the upcoming weeks.Data setHere’s the data set that we’ll be working with–weekly data on the number of currently ill individuals:times = c(0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98)data = c(97, 271, 860, 1995, 4419, 6549, 6321, 4763, 2571, 1385, 615, 302, 159, 72, 34)1) SIR Model SetupStart by setting up an SIR model, using the equations:In this case, we’ll take S, I, and R to be the fraction of the population that’s susceptible, infectious, and recovered. The measurement equation we’ll be using is:y=kNIwhere N is the total population size and k is the reporting fraction. To break down this equation, note that I is the fraction of the population currently infected, so NI is the total number of actual people currently infected. Then y=kNI represents the total number of observed cases currently infected.Code up the model above, noting that:Stardew Valley County has a total population of N=200000. We will treat this as a fixed (i.e. known, not estimated) parameter, so you can just code this as a number (i.e. it’s not part of the parameter vector that you’ll use for estimation).At t=0, approximiately 200 people were infected, and no one had recovered yet. This means that the initial conditions were that 99.9% of the population was susceptible, and 0.1% was infected (i.e. x0 = c(0.999,0.001,0) for S, I, and R respectively).For starting parameters, we can take β=0.4. So far, the infectious period appears to be roughly 4 days, so we can start with a guess of γ=1/4=0.25. For the reporting fraction, let’s start by supposing we have perfect reporting (i.e. every case is reported), so that k=1.Now, plot your model output (y) from 0 to 100 days, and plot the data on the same plot (use a line for your model and points/dots for the data). How well does the model match the data using our initial guesses for the parameters? Do you notice any issues with the model fit currently?2) Setting up the likelihoodNext, we need to make a function that will evaluate the goodness-of-fit for the model—in this case, the negative log likelihood. We will pass this function to the optimizer, optim, which will use our function to find the best-fit parameter estimates (i.e. the parameter values that minimize our function).Code up a likelihood function for the SIR model, using supposing that your data is taken from a Poisson distribution with rate parameter y (I’ve included the Poisson likelihood form in the template code below, but you can try deriving it yourself if you want! It’s just like the Gaussian likelihood we did on Monday, just use a Poisson distribution instead of a normal distribution). Here’s the overall structure that the likelihood function will take (note that this is just a template! You need to adapt this for your own code, depending on how you name things, etc.):CalcNLL = function(params, ODEfunction, x0, times, data){ # The function will calculate the negative log likelihood---to do that, we need: # - the parameters (this is what optim will fiddle with to find the best fit) # # - the ODE function for the SIR model you wrote in part 1). Optim will use # to simulate the model so it can compare the model to the data # # - to simulate the model with SIRode, we also need the initial conditions (x0) # and the time points (times) # # - lastly, we need to pass the data into NLL, so we can compare the model and # data # Since we dont want to include negative values for the parameters, start by taking the absolute value of whatever the current parameter values are (note this is a bit of a silly trick, ask me more if youre confused) params = abs(params) # Simulate model xcurr = ode(x0, times, ODEfunction, params) # replace this with whatever ODE code youre using to run your model! # Measurement equati代寫Epid 633、代做R編程語言矩父、Data set代寫锉桑、on y = k*N*I # replace k, N, and I with whatever they are in your code, e.g. params[1], xcurr[,3], etc. # Negative Log Likelihood (NLL) NLL = sum(y) - sum(data*log(y)) # Poisson ML # note this is a slightly shortened version--theres an additive constant term missing but it shouldnt alter the optimal estimates. Alternatively, can do: # NLL = -sum(log(dpois(round(data),round(y)))) # the round is b/c Poisson is for (integer) count data # return(NLL) }Evaluate your likelihood function for a couple of values—try your default parameters, and then try taking β=0.42 and β=0.38 (leaving the other parameters equal to the defaults). Based on this, if you were optim, which direction would you move β to improve the fit? Does this match with what you saw in your initial model fit in Part 1?3) Parameter EstimationOkay, time to run the optimizer! Estimate your model parameters from the data set provided. While you’re at it, also pull the final, best-fit negative log likelihood value. The function optim takes the following inputs:starting parameters the first input for optim will be the starting values for the model parameterscost function - in our case, this is our likelihood-calculator we made abovearguments for the cost function - all the extra arguments that the cost function takes (i.e. everything else the cost function needs, like data,times,your ODE function, etc.)So the structure is: optim(starting parameters, cost function, arguments for cost function). Here’s a template for how to set up your optimizer:### First, run optim ###results = optim(initialparams, fn = CalcNLL, ODEfunction = SIRode, times = times, data = data, x0 = x0) # Ask me if youre confused about why we have times = times, data = data, etc.!### Pull out the parameter estimates ###paramests = results$par### Pull out the final cost function value (-log likelihood) ###negLL = results$valuePlot your model and data on the same plot (like you did in Part 1). How does your model fit look now? How did your parameters change from your initial guess? What are the values you estimated for each of your parameters (β, γ, and k)? What implications might your estimate of k have for intervention efforts?4) ForecastingLet’s consider the case where you are attempting to fit and forecast an ongoing epidemic (i.e. with incomplete data). Truncate your data to only include the first five data points, then re-fit the model parameters.How do your parameter estimates change?Simulate your model for the full time (0 to 100 days) and plot your model predictions along with the complete data set. How well did you predict the dynamics of the epidemic?Now re-do the forecast, but suppose that we only have the first four data points of the epidemic—how do your estimates and predictions change?5) Model Comparison and the Akaike Information Criterion (AIC)Lastly, let’s compare the SIR model we’ve been using to a slightly more complex model, the SEIR model:Here, E represents the fraction of the population who are infected but not yet transmissible (i.e. latently infected).Estimate the parameters using this model and the complete data set, supposing that:The initial conditions for S, E, I, and R are: x0 = c(0.9985,0.001,0.0015,0)The initial parameter guesses are params = c(beta=0.4,alpha = 1, gamma=0.25, kappa=1)You can use the same measurement equation as before (y=kNI). Does the fit look better/different than what you saw with the SIR model in Part 3? How are the parameter estimates different with this model compared to the SIR model?Calculate the AIC for both models and compare. Which model has a lower (better) AIC? How different are the two AICs? Recall that the AIC is given by:2(?LL+nP)where ?LL is the negative log likelihood and nP is the number of parameters to be estimated (i.e. 3 for the SIR model and 4 for the SEIR model).2π) Extra problems (just for fun—these are not required and don’t need to be turned in!)Try out alternative cost functions, e.g. using dpois for the Poisson likelihood, or using least squares. Do you notice differences in fit? What about in how long it takes to generate the parameter estimates?Try making a heat map of the likelihood as a function of β and γ (you could try out a fancier plotting package like ggplot2 or plotly for this). How does the shape of the likelihood change as you reduce the amount of data you have?轉(zhuǎn)自:http://www.daixie0.com/contents/18/4347.html
講解:Epid 633、R迟螺、Data set冲秽、RPython|Prolog
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
- 文/潘曉璐 我一進(jìn)店門涌韩,熙熙樓的掌柜王于貴愁眉苦臉地迎上來瘪匿,“玉大人莹菱,你說我怎么就攤上這事鞭呕。” “怎么了篇恒?”我有些...
- 文/不壞的土叔 我叫張陵扶檐,是天一觀的道長。 經(jīng)常有香客問我胁艰,道長款筑,這世上最難降的妖魔是什么? 我笑而不...
- 正文 為了忘掉前任蝗茁,我火速辦了婚禮醋虏,結(jié)果婚禮上,老公的妹妹穿的比我還像新娘哮翘。我一直安慰自己,他們只是感情好毛秘,可當(dāng)我...
- 文/花漫 我一把揭開白布饭寺。 她就那樣靜靜地躺著阻课,像睡著了一般。 火紅的嫁衣襯著肌膚如雪艰匙。 梳的紋絲不亂的頭發(fā)上限煞,一...
- 文/蒼蘭香墨 我猛地睜開眼糖埋,長吁一口氣:“原來是場噩夢啊……” “哼宣吱!你這毒婦竟也來了?” 一聲冷哼從身側(cè)響起瞳别,我...
- 序言:老撾萬榮一對情侶失蹤征候,失蹤者是張志新(化名)和其女友劉穎,沒想到半個月后祟敛,有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體疤坝,經(jīng)...
- 正文 獨(dú)居荒郊野嶺守林人離奇死亡,尸身上長有42處帶血的膿包…… 初始之章·張勛 以下內(nèi)容為張勛視角 年9月15日...
- 正文 我和宋清朗相戀三年馆铁,在試婚紗的時候發(fā)現(xiàn)自己被綠了跑揉。 大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片。...
- 正文 年R本政府宣布,位于F島的核電站乍构,受9級特大地震影響甜无,放射性物質(zhì)發(fā)生泄漏。R本人自食惡果不足惜哥遮,卻給世界環(huán)境...
- 文/蒙蒙 一岂丘、第九天 我趴在偏房一處隱蔽的房頂上張望。 院中可真熱鬧眠饮,春花似錦奥帘、人聲如沸。這莊子的主人今日做“春日...
- 文/蒼蘭香墨 我抬頭看了看天上的太陽松蒜。三九已至,卻和暖如春已旧,著一層夾襖步出監(jiān)牢的瞬間秸苗,已是汗流浹背。 一陣腳步聲響...
- 正文 我出身青樓秸讹,卻偏偏與公主長得像檀咙,于是被迫代替她去往敵國和親。 傳聞我的和親對象是個殘疾皇子嗦枢,可洞房花燭夜當(dāng)晚...
推薦閱讀更多精彩內(nèi)容
- 當(dāng)今社會弛作,人們普遍越來越重視孩子的教育涕蜂,從懷孕時的胎教,到哺乳期的熏陶映琳,再到學(xué)齡前的啟蒙机隙,可謂處處用心,絲絲入扣萨西,...