Regression Discontinuity

Basic setup

x_{i} is the running variable; it may or may not be related to the outcome variable.

c is the cutoff.

D_{i} is the treatment variable, which is fully determined by the running variable:

D_{i}=\left\{\begin{array}{ll}{1} & {\text { if } x_{i} \geq c} \\ {0} & {\text { if } x_{i}<c}\end{array}\right.

In a neighborhood of the cutoff, whether a unit is treated is as if it were controlled by the "hand of God", so the design forms a quasi-experiment.

Local average treatment effect (LATE)

\begin{aligned} \mathrm{LATE} & \equiv \mathrm{E}\left(y_{1 i}-y_{0 i} | x=c\right) \\ &=\mathrm{E}\left(y_{1 i} | x=c\right)-\mathrm{E}\left(y_{0 i} | x=c\right) \\ &=\lim _{x \rightarrow c+} \mathrm{E}\left(y_{1 i} | x\right)-\lim _{x \rightarrow c-} \mathrm{E}\left(y_{0 i} | x\right) \end{aligned}

Let the treated and untreated groups have different intercepts (\delta D_{i}) and different slopes [\gamma\left(x_{i}-c\right) D_{i}]; the intercept shift \delta is the LATE:

y_{i}=\alpha+\beta\left(x_{i}-c\right)+\delta D_{i}+\gamma\left(x_{i}-c\right) D_{i}+\varepsilon_{i}
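As a rough illustration of this regression, the sketch below simulates data and recovers \delta. It assumes treatment is assigned at or above the cutoff, as in the one-sided limits of the LATE definition. Because x is centered at c and fully interacted with D, fitting the equation is equivalent to fitting one line on each side and taking the gap between the two intercepts. The function names and the simulated parameter values (alpha=1, beta=0.5, delta=2, gamma=0.8) are illustrative assumptions, not part of the original notes.

```python
import random

def ols_intercept(pts):
    """Simple OLS of y on x over (x, y) pairs; returns the intercept."""
    n = len(pts)
    xbar = sum(x for x, _ in pts) / n
    ybar = sum(y for _, y in pts) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pts)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pts)
    return ybar - (sxy / sxx) * xbar

def sharp_rd_linear(x, y, c):
    """Estimate delta as the gap between the two fitted lines at x = c."""
    left = [(xi - c, yi) for xi, yi in zip(x, y) if xi < c]    # D_i = 0
    right = [(xi - c, yi) for xi, yi in zip(x, y) if xi >= c]  # D_i = 1
    return ols_intercept(right) - ols_intercept(left)

# Simulated data with assumed parameters; the true delta is 2.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(2000)]
y = [1 + 0.5 * xi + (2 + 0.8 * xi) * (xi >= 0) + rng.gauss(0, 0.5) for xi in x]

delta_hat = sharp_rd_linear(x, y, c=0.0)
print(round(delta_hat, 2))
```

With a correctly specified linear model the estimate should land close to the assumed jump of 2.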

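Basic assumptions
  1. Discontinuity assumption: the probability that a unit is assigned to treatment jumps at the cutoff.
  2. Continuity assumption: the relationship between the outcome variable and the running variable is continuous at every point, so that any jump at the cutoff can be attributed to the treatment.
  3. Local randomization assumption: within a small neighborhood \delta\left(x_{0}\right) of the cutoff x_{0} (written c above), treatment is independent of the potential outcomes:
    \left(Y_{1 i}, Y_{0 i}\right) \perp D_{i} | X_{i} \in \delta\left(x_{0}\right)
  4. Independence assumption: near the cutoff, the potential outcomes and the potential treatment statuses are independent of the running variable X:
    \left(Y_{1 i}, Y_{0 i}\right), D_{1 i}(x), D_{0 i}(x) \perp X_{i}, X_{i} \in \delta\left(x_{0}\right)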

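Sharp regression discontinuity

Definition

At the cutoff, the probability of receiving treatment jumps from 0 to 1.

Two problems arise
  • If the true regression function contains higher-order terms, omitting them causes omitted-variable bias.
  • Regression discontinuity is only a local randomized experiment, so observations far from the cutoff say little about the jump at x=c.
Two remedies
  1. Add higher-order terms.
  2. Restrict x to the interval (c-h, c+h), where h is the bandwidth.

This gives: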
\begin{aligned} y_{i}=& \alpha+\beta_{1}\left(x_{i}-c\right)+\delta D_{i}+\gamma_{1}\left(x_{i}-c\right) D_{i} \\ &+\beta_{2}\left(x_{i}-c\right)^{2}+\gamma_{2}\left(x_{i}-c\right)^{2} D_{i}+\varepsilon_{i} \quad(c-h<x<c+h) \end{aligned}

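Here the OLS estimate \hat{\delta} of the coefficient \delta is the estimator of the LATE; heteroskedasticity-robust standard errors can be used to guard against heteroskedasticity.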

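The vexing optimal bandwidth: how to choose it

A small h may reduce bias, but with few observations the variance may be large.
A large h may reduce the variance, but it pulls in many points far from x=c, which increases the bias.

Nonparametric methods are now commonly used to choose the optimal bandwidth, by minimizing the mean squared error of the two regression-function estimates at the cutoff: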

\min _{h} \mathrm{E}\left\{\left[\hat{m}_{1}(c)-m_{1}(c)\right]^{2}+\left[\hat{m}_{0}(c)-m_{0}(c)\right]^{2}\right\}

where m_{1}(x) \equiv \mathrm{E}\left(y_{1} | x\right), m_{0}(x) \equiv \mathrm{E}\left(y_{0} | x\right), \delta=m_{1}(c)-m_{0}(c), and \hat{\delta}=\hat{m}_{1}(c)-\hat{m}_{0}(c).
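The link between this objective and the bias-variance trade-off in h can be made explicit with the standard decomposition of mean squared error, applied to each side of the cutoff separately (j = 0, 1 indexes the untreated and treated regression functions):

```latex
\mathrm{E}\left\{\left[\hat{m}_{j}(c)-m_{j}(c)\right]^{2}\right\}
=\underbrace{\left[\mathrm{E}\,\hat{m}_{j}(c)-m_{j}(c)\right]^{2}}_{\text{bias}^{2}}
+\underbrace{\operatorname{Var}\left[\hat{m}_{j}(c)\right]}_{\text{variance}},
\qquad j=0,1
```

A larger h typically raises the first term and lowers the second, so the minimizing h balances the two.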

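Estimation with a kernel function

\min _{\{\alpha, \beta, \delta, \gamma\}} \sum_{i=1}^{n} K\left[\left(x_{i}-c\right) / h\right]\left[y_{i}-\alpha-\beta\left(x_{i}-c\right)-\delta D_{i}-\gamma\left(x_{i}-c\right) D_{i}\right]^{2}

where K(\cdot) is the kernel function (for example, the triangular kernel). The squared bracket on the right is the squared residual, and the factor K[\cdot] in front of it is the weight. This is local linear regression: weighted least squares within a small neighborhood of c, with weights set by the kernel. With the triangular kernel, the closer an observation is to c, the larger its weight.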
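A minimal sketch of this kernel-weighted estimator follows, assuming a triangular kernel and treatment at x >= c. The objective separates into the D=0 terms and the D=1 terms, so the code fits one weighted line on each side and reports the difference of the intercepts at c. The data-generating process (a sine trend with an assumed jump of 2) and the bandwidth h=0.3 are illustrative choices only; h is not an optimal bandwidth.

```python
import math
import random

def triangular(u):
    """Triangular kernel: weight 1 at u = 0, falling linearly to 0 at |u| = 1."""
    return max(0.0, 1.0 - abs(u))

def wls_intercept(pts):
    """Weighted simple regression over (x, y, w) triples; returns the intercept."""
    sw = sum(w for _, _, w in pts)
    xbar = sum(w * x for x, _, w in pts) / sw
    ybar = sum(w * y for _, y, w in pts) / sw
    sxx = sum(w * (x - xbar) ** 2 for x, _, w in pts)
    sxy = sum(w * (x - xbar) * (y - ybar) for x, y, w in pts)
    return ybar - (sxy / sxx) * xbar

def local_linear_rd(x, y, c, h, kernel=triangular):
    """Kernel-weighted local linear estimate of the jump delta at the cutoff c."""
    left, right = [], []
    for xi, yi in zip(x, y):
        w = kernel((xi - c) / h)
        if w > 0:  # observations outside (c-h, c+h) get zero weight
            (right if xi >= c else left).append((xi - c, yi, w))
    return wls_intercept(right) - wls_intercept(left)

# Simulated data with a nonlinear trend; the assumed true jump is 2.
rng = random.Random(1)
x = [rng.uniform(-1, 1) for _ in range(4000)]
y = [math.sin(2 * xi) + 2 * (xi >= 0) + rng.gauss(0, 0.3) for xi in x]

delta_hat = local_linear_rd(x, y, c=0.0, h=0.3)
print(round(delta_hat, 2))
```

Passing a different `kernel` function (for example one that returns 1 inside the window) gives the rectangular-kernel version of the same estimator.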

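Choice of covariates

Other covariates that affect y can be included. The benefit is a smaller error variance. The drawbacks:

  • If a covariate is endogenous, it contaminates the estimate.
  • If a covariate also jumps at the cutoff, the estimate is biased outright, so one usually checks whether the conditional density of each covariate jumps at the cutoff.
Results to report

(1) Local linear regression results with the triangular kernel and with the rectangular kernel.
(2) Results for different bandwidths: the optimal bandwidth, half of it, and twice it.
(3) Results with and without covariates.
(4) Model specification tests: check whether the conditional densities of the assignment (running) variable and of the covariates are continuous at the cutoff.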
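Items (1) and (2) of this checklist could be produced with a loop like the one below. This is only a sketch on simulated data: the baseline bandwidth h0 is an assumed value standing in for an optimal bandwidth, which the code does not compute, and the helper names are hypothetical.

```python
import random

def triangular(u):
    return max(0.0, 1.0 - abs(u))

def rectangular(u):
    return 1.0 if abs(u) <= 1.0 else 0.0

def wls_intercept(pts):
    """Weighted simple regression over (x, y, w) triples; returns the intercept."""
    sw = sum(w for _, _, w in pts)
    xbar = sum(w * x for x, _, w in pts) / sw
    ybar = sum(w * y for _, y, w in pts) / sw
    sxx = sum(w * (x - xbar) ** 2 for x, _, w in pts)
    sxy = sum(w * (x - xbar) * (y - ybar) for x, y, w in pts)
    return ybar - (sxy / sxx) * xbar

def rd_estimate(x, y, c, h, kernel):
    """Local linear jump estimate at c with bandwidth h and the given kernel."""
    left, right = [], []
    for xi, yi in zip(x, y):
        w = kernel((xi - c) / h)
        if w > 0:
            (right if xi >= c else left).append((xi - c, yi, w))
    return wls_intercept(right) - wls_intercept(left)

# Simulated sharp RD data; the assumed true jump is 2.
rng = random.Random(42)
x = [rng.uniform(-1, 1) for _ in range(4000)]
y = [1 + 0.5 * xi + 2 * (xi >= 0) + rng.gauss(0, 0.3) for xi in x]

h0 = 0.3  # assumed baseline bandwidth, NOT a computed optimal bandwidth
results = {}
for name, kernel in [("triangular", triangular), ("rectangular", rectangular)]:
    for label, h in [("h/2", h0 / 2), ("h", h0), ("2h", 2 * h0)]:
        results[(name, label)] = rd_estimate(x, y, 0.0, h, kernel)
        print(f"{name:12s} {label:4s} h={h:.2f} delta_hat={results[(name, label)]:.3f}")
```

If the estimates move a lot across rows, the result is sensitive to the kernel or bandwidth choice and deserves a closer look.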

Fuzzy regression discontinuity

\begin{array}{l}{E(y | x)=E\left(y_{0} | x\right)+E\left[D\left(y_{1}-y_{0}\right) | x\right]} \\ {\quad=E\left(y_{0} | x\right)+E(D | x) \cdot E\left[\left(y_{1}-y_{0}\right) | x\right]}\end{array}

Local average treatment effect
\mathrm{LATE} \equiv E\left[\left(y_{1}-y_{0}\right) | x=c\right]=\frac{\lim _{x \downarrow c} E(y | x)-\lim _{x \uparrow c} E(y | x)}{\lim _{x \downarrow c} E(D | x)-\lim _{x \uparrow c} E(D | x)}
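The ratio above can be sketched as two jump estimates, one for y and one for D, each obtained by local linear regression on either side of the cutoff. The example below uses a rectangular kernel and simulated data in which the treatment probability rises from 0.2 to 0.8 at the cutoff and the treatment effect is 2. All of these numbers, the bandwidth, and the function names are assumptions made for illustration.

```python
import random

def ols_intercept(pts):
    """Simple OLS of v on x over (x, v) pairs; returns the intercept."""
    n = len(pts)
    xbar = sum(x for x, _ in pts) / n
    vbar = sum(v for _, v in pts) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pts)
    sxv = sum((x - xbar) * (v - vbar) for x, v in pts)
    return vbar - (sxv / sxx) * xbar

def jump_at_cutoff(x, v, c, h):
    """Estimate lim_{x down to c} E(v|x) - lim_{x up to c} E(v|x) within (c-h, c+h)."""
    left = [(xi - c, vi) for xi, vi in zip(x, v) if c - h < xi < c]
    right = [(xi - c, vi) for xi, vi in zip(x, v) if c <= xi < c + h]
    return ols_intercept(right) - ols_intercept(left)

def fuzzy_rd(x, y, d, c, h):
    """Ratio of the jump in the outcome to the jump in the treatment probability."""
    return jump_at_cutoff(x, y, c, h) / jump_at_cutoff(x, d, c, h)

# Simulated fuzzy design (assumed values): P(D=1) is 0.2 below and 0.8 above c = 0.
rng = random.Random(1)
x = [rng.uniform(-1, 1) for _ in range(5000)]
d = [1 if rng.random() < (0.8 if xi >= 0 else 0.2) else 0 for xi in x]
y = [1 + 0.5 * xi + 2 * di + rng.gauss(0, 0.5) for xi, di in zip(x, d)]

jump_d = jump_at_cutoff(x, d, 0.0, 0.5)
late_hat = fuzzy_rd(x, y, d, 0.0, 0.5)
print(round(jump_d, 2), round(late_hat, 2))
```

In a sharp design the denominator equals 1 and the ratio reduces to the jump in E(y|x) alone.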

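General steps of an RDD analysis

(1) Graphical analysis

  • Plot Y against X and check whether Y has a discontinuity in X (rdplot); typically the outcome variable is averaged within each bin.
  • Plot each covariate against X and check whether the covariate has a discontinuity in X.
  • Plot the distribution of the running variable X and check whether it shows a clear jump on either side of the cutoff.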
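The binned averages behind the first plot can be sketched as follows. The code uses fixed-width bins with one bin edge placed exactly at the cutoff, so no bin mixes treated and untreated units. This is a simplifying assumption and not necessarily the bin-selection rule that rdplot applies. The function name and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def binned_means(x, y, c, width):
    """Average y within bins of the running variable, with a bin edge at c."""
    sums = defaultdict(lambda: [0.0, 0])
    for xi, yi in zip(x, y):
        k = math.floor((xi - c) / width)  # bin index; k >= 0 means x >= c
        sums[k][0] += yi
        sums[k][1] += 1
    # Return (bin midpoint, mean of y) pairs ordered from left to right.
    return [(c + (k + 0.5) * width, s / n) for k, (s, n) in sorted(sums.items())]

# Toy data: the outcome is visibly higher to the right of c = 0.
x = [-0.9, -0.6, -0.1, 0.1, 0.4, 0.9]
y = [1, 2, 3, 10, 11, 12]
points = binned_means(x, y, c=0.0, width=0.5)
print(points)
```

Plotting these points, with separate fitted curves on each side of c, gives the usual RD plot. Applying the same function with a covariate in place of y gives the second plot in the list.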

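(2) Estimating the causal effect

  • Boundary nonparametric regression (rarely used)
  • Local linear regression
  • Local polynomial regression

(3) Robustness checks: covariate continuity tests (run the RD regression on each covariate), a continuity test for the density of the running variable (McCrary), placebo cutoffs (at the midpoints of the left and right bandwidths), and bandwidth sensitivity tests (re-estimate with different bandwidths).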
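One way to read the placebo-cutoff check is sketched below: move the cutoff to c - h/2 and to c + h/2 and re-estimate the jump with bandwidth h/2, so that each placebo regression uses observations from only one side of the true cutoff. The estimated jumps at the placebo cutoffs should then be close to zero. The data, the bandwidth h, and the function names are assumptions made for illustration.

```python
import random

def ols_intercept(pts):
    """Simple OLS of y on x over (x, y) pairs; returns the intercept."""
    n = len(pts)
    xbar = sum(x for x, _ in pts) / n
    ybar = sum(y for _, y in pts) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pts)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pts)
    return ybar - (sxy / sxx) * xbar

def jump(x, y, cutoff, h):
    """Local linear (rectangular kernel) jump estimate at an arbitrary cutoff."""
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff - h <= xi < cutoff]
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff <= xi < cutoff + h]
    return ols_intercept(right) - ols_intercept(left)

# Simulated sharp RD data; the assumed true jump of 2 sits at c = 0.
rng = random.Random(7)
x = [rng.uniform(-1, 1) for _ in range(4000)]
y = [1 + 0.5 * xi + 2 * (xi >= 0) + rng.gauss(0, 0.3) for xi in x]

c, h = 0.0, 0.5  # h is an assumed bandwidth, not a computed optimal one
true_jump = jump(x, y, c, h)
placebo_left = jump(x, y, c - h / 2, h / 2)   # uses only x in [c-h, c)
placebo_right = jump(x, y, c + h / 2, h / 2)  # uses only x in [c, c+h)
print(round(true_jump, 2), round(placebo_left, 2), round(placebo_right, 2))
```

A placebo estimate far from zero would suggest that the jump found at c may reflect curvature or noise and not the treatment.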

Software implementation (Stata, rdrobust package)

* RD plot
rdplot depvar runvar [if] [in] [, c(cutoff) p(pvalue) kernel(kernelfn)]
* Optimal bandwidth selection
rdbwselect depvar runvar [if] [in] [, c(cutoff) p(pvalue) q(qvalue) deriv(dvalue) fuzzy(fuzzyvar [sharpbw]) covs(covars) kernel(kernelfn) weights(weightsvar)
                   bwselect(bwmethod) scaleregul(scaleregulvalue) vce(vcemethod) all]
/*
 c(cutoff) specifies the RD cutoff.  The default is c(0).

    p(pvalue) specifies the order of the local polynomial used to construct the point estimator.  The default is p(1) (local linear regression).

    q(qvalue) specifies the order of the local polynomial used to construct the bias correction.  The default is q(2) (local quadratic regression).

    deriv(dvalue) specifies the order of the derivative of the regression functions to be estimated.  The default is deriv(0) (sharp RD, or fuzzy RD if fuzzy() is also
        specified).  Setting deriv(1) results in estimation of a kink RD design (up to scale) or a fuzzy kink RD if fuzzy() is also specified.

    fuzzy(fuzzyvar [sharpbw]) specifies the treatment status variable used to implement fuzzy RD estimation (or fuzzy kink RD if deriv(1) is also specified).  The
        default is sharp RD design.  If the sharpbw option is set, the fuzzy RD estimation is performed using a bandwidth selection procedure for the sharp RD model.
        This option is automatically selected if there is perfect compliance at either side of the threshold.

    covs(covars) specifies additional covariates to be used for estimation and inference.

    kernel(kernelfn) specifies the kernel function used to construct the local polynomial estimators.  kernelfn may be triangular, epanechnikov, or uniform.  The
        default is kernel(triangular).

*/
* RD estimation
rdrobust depvar runvar [if] [in] [, c(cutoff) p(pvalue) q(qvalue) deriv(dvalue) fuzzy(fuzzyvar [sharpbw]) covs(covars) kernel(kernelfn) weights(weightsvar)
                h(hvalueL hvalueR) b(bvalueL bvalueR) rho(rhovalue) scalepar(scaleparvalue) bwselect(bwmethod) scaleregul(scaleregulvalue) vce(vcemethod) level(level)
                all]

/*
c(cutoff) specifies the RD cutoff.  The default is c(0).

    p(pvalue) specifies the order of the local polynomial used to construct the point estimator.  The default is p(1) (local linear regression).

    q(qvalue) specifies the order of the local polynomial used to construct the bias correction.  The default is q(2) (local quadratic regression).

    deriv(dvalue) specifies the order of the derivative of the regression functions to be estimated.  The default is deriv(0) (sharp RD, or fuzzy RD if fuzzy() is also
        specified).  Setting deriv(1) results in estimation of a kink RD design (up to scale), or fuzzy kink RD if fuzzy() is also specified.

    fuzzy(fuzzyvar [sharpbw]) specifies the treatment status variable used to implement fuzzy RD estimation (or fuzzy kink RD if deriv(1) is also specified).  The
        default is sharp RD design.  If the sharpbw option is set, the fuzzy RD estimation is performed using a bandwidth selection procedure for the sharp RD model.
        This option is automatically selected if there is perfect compliance at either side of the threshold.

    covs(covars) specifies additional covariates to be used for estimation and inference.

    kernel(kernelfn) specifies the kernel function used to construct the local polynomial estimators.  kernelfn may be triangular, epanechnikov, or uniform.  The
        default is kernel(triangular).

    weights(weightsvar) specifies the variable used for optional weighting of the estimation procedure.  The unit-specific weights multiply the kernel function.

    h(hvalueL hvalueR) specifies the main bandwidth, h, to be used on the left and on the right of the cutoff, respectively.  If only one value is specified, then this
        value is used on both sides.  If not specified, the bandwidth(s) h is computed by the companion command rdbwselect.

    b(bvalueL bvalueR) specifies the bias bandwidth, b, to be used on the left and on the right of the cutoff, respectively.  If only one value is specified, then this
        value is used on both sides.  If not specified, bandwidth(s) b is computed by the companion command rdbwselect.

    rho(rhovalue) specifies the value of rho so that the bias bandwidth, b, equals b=h/rho.  The default is rho(1) if h is specified but b is not.

    scalepar(scaleparvalue) specifies the scaling factor for the RD parameter of interest.  This option is useful when the population parameter of interest involves a
        known multiplicative factor (for example, sharp kink RD).  The default is scalepar(1) (no scaling).

    bwselect(bwmethod) specifies the bandwidth selection procedure to be used.  By default, it computes both h and b, unless rho is specified, in which case it
        computes only the h and sets b=h/rho.  For details on implementation, see Calonico, Cattaneo, and Titiunik (2014b); Calonico, Cattaneo, and Farrell
        (forthcoming); and Calonico et al. (2016), and the companion software articles.  bwmethod may be one of the following:

*/