class: center, middle, inverse, title-slide # A Difference-in-Differences Approach
using Mixed-Integer Programming Matching ## Magdalena Bennett
McCombs School of Business, UT Austin ### ASA Austin Chapter Meeting
June 09, 2021 --- <style type="text/css"> .small .remark-code { /*Change made here*/ font-size: 80% !important; } .tiny .remark-code { /*Change made here*/ font-size: 90% !important; } </style> # Diff-in-Diff as an identification strategy <img src="mbennett_did_files/figure-html/dd-1.svg" style="display: block; margin: auto;" /> --- # Diff-in-Diff as an identification strategy <img src="mbennett_did_files/figure-html/dd2-1.svg" style="display: block; margin: auto;" /> --- # Diff-in-Diff as an identification strategy <img src="mbennett_did_files/figure-html/dd3-1.svg" style="display: block; margin: auto;" /> --- # Very popular for policy evaluation <img src="mbennett_did_files/figure-html/gg-1.svg" style="display: block; margin: auto;" /> .source[Source: Google Scholar] --- # What about parallel trends? .pull-left[ .center[ ![:scale 80%](https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/images/data_comic.jpg)] ] .pull-right[] --- # What about parallel trends? .pull-left[ .center[ ![:scale 80%](https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/images/data_comic.jpg)] ] .pull-right[ - Bounds on treatment effects (Rambachan & Roth, 2020)] --- # What about parallel trends? .pull-left[ .center[ ![:scale 80%](https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/images/data_comic.jpg)] ] .pull-right[ - Bounds on treatment effects (Rambachan & Roth, 2020) - Find sub-groups that potentially follow PTA (e.g. similar units in treatment and control) - Similar to synthetic control intuition.] --- # What about parallel trends? .pull-left[ .center[ ![:scale 80%](https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/images/data_comic.jpg)] ] .pull-right[ - Bounds on treatment effects (Rambachan & Roth, 2020) - Find sub-groups that potentially follow PTA (e.g. similar units in treatment and control) - Similar to synthetic control intuition. 
- Can matching help? - It's complicated (?) .small[(Zeldow & Hatfield, 2019; Lindner & McConnell, 2018; Daw & Hatfield, 2018 (x2); Ryan, 2018; Ryan et al., 2018)] ] --- # This paper - Identify contexts when matching can recover causal estimates under **.darkorange[violations in the parallel trend assumption]**. - Partial identification in some cases. - Use **.darkorange[mixed-integer programming matching (MIP)]** to balance covariates directly. -- <br/> .pull-left[ .box-3.medium.sp-after-half[Simulations:<br/>Different DGP scenarios] ] .pull-right[ .box-7.medium.sp-after-half[Application:<br/>School segregation & vouchers] ] --- background-position: 50% 50% class: left, bottom, inverse .big[ Let's get started ] --- # DD Setup - Let `\(Y_{it}(z)\)` be the potential outcome for unit `\(i\)` in period `\(t\)` under treatment `\(z\)`. - Intervention implemented in `\(T_0\)` `\(\rightarrow\)` No units are treated in `\(t\leq T_0\)` -- - Difference-in-Differences (DD) focuses on ATT for `\(t>T_0\)`: `$$ATT = E[Y_{it}(1) - Y_{it}(0)|Z=1]$$` -- - **.darkorange[Assumptions for DD]**: - Parallel-trend assumption (PTA) - Common shocks `$$E[Y_{i1}(0) - Y_{i0}(0) | Z=1] = E[Y_{i1}(0) - Y_{i0}(0) | Z=0]$$` --- # DD Setup (cont.) 
- Under these assumptions: $$ `\begin{align} \hat{\tau}^{DD} = &\color{#900DA4}{\overbrace{\color{black}{E[Y(1)|Z=1] - E[Y(1)|Z=0]}}^{\color{#900DA4}{\Delta_{post}}}} - \\ &\color{#F89441}{\underbrace{\color{black}{(E[Y(0)|Z=1] - E[Y(0)|Z=0])}}_{\color{#F89441}{\Delta_{pre}}}} \end{align}` $$ - Where `\(t=0\)` and `\(t=1\)` are the pre- and post-intervention periods, respectively. - `\(Y(t) = Y^1(t)\cdot Z + (1-Z)\cdot Y^0(t)\)` is the observed outcome. --- # Violations of the PTA .pull-left[ - Under PTA, `\(g_1(t) = g_0(t) + h(t)\)`, where: - `\(g_z(t) = E[Y_{it}(0) | Z=z, T=t]\)` - `\(h(t) = \alpha\)` ] .pull-right[ ![](https://media.giphy.com/media/L8yQ0RQBItqso/giphy.gif) ] --- # Violations of the PTA .pull-left[ - Under PTA, `\(g_1(t) = g_0(t) + h(t)\)`, where: - `\(g_z(t) = E[Y_{it}(0) | Z=z, T=t]\)` - `\(h(t) = \alpha\)` - Bias in a DD setting depends on the structure of `\(h(t)\)`. - Confounding in DD affects **.darkorange[trends]** and not **.darkorange[levels]**. ] .pull-right[ ![](https://media.giphy.com/media/L8yQ0RQBItqso/giphy.gif) ] --- # Violations of the PTA .pull-left[ - Under PTA, `\(g_1(t) = g_0(t) + h(t)\)`, where: - `\(g_z(t) = E[Y_{it}(0) | Z=z, T=t]\)` - `\(h(t) = \alpha\)` - Bias in a DD setting depends on the structure of `\(h(t)\)`. - Confounding in DD affects **.darkorange[trends]** and not **.darkorange[levels]**. - Contextual knowledge is important! ] .pull-right[ ![](https://media.giphy.com/media/L8yQ0RQBItqso/giphy.gif) ] --- # Two distinct problems when combining matching + DD .pull-left[ - **.darkorange[Bias when matching on time-varying covariates]**: - Depends on the structure of time variation - **.darkorange[Regression to the mean]**: - Both groups come from different populations - Particularly salient when matching on previous outcomes with a small number of pre-periods. 
] .pull-right[ <img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/DD/bb_20201202/images/reg_to_the_mean.svg" alt="diagram" width="500"/> ] --- # How do we match? - Match covariates or outcomes? Levels or trends? - Propensity score matching? Optimal matching? etc. -- This paper: - **.darkorange[Match on covariates]** that could make groups behave differently. - Use of **.darkorange[Mixed-Integer Programming (MIP) Matching]** .small[(Zubizarreta, 2015; Bennett, Zubizarreta, & Vielma, 2020)]: .small[ - Balances covariates directly - Yields the largest matched sample under balancing constraints (cardinality matching) - Works with large samples ] --- background-position: 50% 50% class: left, bottom, inverse .big[ Simulations ] --- # Different scenarios .pull-left[ **Time-invariant covariates:** .box-1.medium.sp-after-half[S1: Time-invariant covariate effect] .box-2.medium.sp-after-half[S2: Time-varying covariate effect] .box-3.medium.sp-after-half[S3: Treatment-independent covariate] ] -- .pull-right[ **Time-varying covariates:** .box-4.medium.sp-after-half[S4: Parallel evolution] .box-6.medium.sp-after-half[S5: Evolution differs by group] .box-7.medium.sp-after-half[S6: Evolution diverges in post] ] <br> <br> .source[Following Zeldow & Hatfield (2019)] --- # Time-invariant covariates `$$X_i \stackrel{ind}{\sim} N(m(z_i),v(z_i))$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$` -- <br> <br> .box-1b[S1) Time-invariant covariate effect: g(x<sub>i</sub>,t) = 0] .box-2b[S2) Time-varying covariate effect: g(x<sub>i</sub>,t) ≠ 0] .box-3b[S3) Treatment-independent covariate: m(z<sub>i</sub>) = μ and v(z<sub>i</sub>) = σ] --- # Time-varying covariates `$$X_{it} = x_{(t-1)i} + h(z_i,t)\cdot r_i + m(z_i,t)$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$` -- <br> <br> .box-4b[S4) Parallel evolution: h(z<sub>i</sub>,t) = h(t) and m(z<sub>i</sub>,t) = 0] .box-6b[S5) Evolution differs 
by group: m(z<sub>i</sub>,t) = 0] .box-7b[S6) Evolution diverges in post: h(z<sub>i</sub>,t) = h(t) and m(z<sub>i</sub>,t) = Post*m(z<sub>i</sub>,t)] --- # Different ways to control <div class="center"><table> <thead> <tr> <th>Model</th> <th>Pseudo <code class="remark-inline-code">R</code> code</th> </tr> </thead> <tbody> <tr> <td>Simple</td> <td><code class="remark-inline-code">lm(y ~ a*p + t)</code> </td> </tr> <tr> <td>Covariate Adjusted (CA)</td> <td><code class="remark-inline-code">lm(y ~ a*p + t + x)</code> </td> </tr> <tr> <td>Time-Varying Adjusted (TVA)</td> <td><code class="remark-inline-code">lm(y ~ a*p + t*x)</code> </td> </tr> <tr> <td>Match on pre-treat outcomes</td> <td><code class="remark-inline-code">lm(y ~ a*p + t, data=out.match)</code> </td> </tr> <tr> <td>Match on pre-treat 1st diff</td> <td><code class="remark-inline-code">lm(y ~ a*p + t, data=out.lag.match)</code> </td> </tr> <tr> <td>Match on pre-treat cov (PS)</td> <td><code class="remark-inline-code">lm(y ~ a*p + t, data=cov.match)</code> </td> </tr> <tr> <td id="highlight">Match on pre-treat cov (MIP)</td> <td id="highlight"><code class="remark-inline-code">Event study (data=cov.match.mip)</code></td> </tr> <tr> <td id="highlight">Match on all cov (MIP)</td> <td id="highlight"><code class="remark-inline-code">Event study (data=cov.match.mip.all)</code></td> </tr> </tbody> </table> </div> .bottom[ .source[Following Zeldow & Hatfield (2019)]] ---
# Parameters

.center[
Parameter                            | Value
-------------------------------------|----------------------------------------------
Number of obs (N)                    | 1,000
`Pr(Z=1)`                            | 0.5
Time periods (T)                     | 10
Last pre-intervention period (T_0)   | 5
PS matching                          | Nearest neighbor
MIP matching tolerance               | 0.05 SD
Number of simulations                | 1,000
]

- Estimates are compared to the sample ATT (_which differs for matched samples_)
- When matching with post-treat covariates `\(\rightarrow\)` compared with direct effect `\(\tau\)`

---
# Results: Time-invariant covariates
<img
src="mbennett_did_files/figure-html/res1-1.svg" style="display: block; margin: auto;" /> --- # Results: Time-varying covariates <img src="mbennett_did_files/figure-html/res2-1.svg" style="display: block; margin: auto;" /> --- # Results: Time-varying covariates - In these simulations, for time-varying covariates: - Matching on treatment covariates returns an unbiased ATT estimate **.darkorange[if covariates evolve differently over time and treatment does not affect them]**. -- - Matching on treatment covariates returns a biased ATT estimate **.darkorange[if covariates evolve differently over time and are affected by treatment]**. -- .box-5[We don't know which scenario we are in] -- - Matching on pre- and post-intervention covariates returns the **.darkorange[direct effect of the treatment on the outcome]** - Depending on the context, this could be an **.darkorange[upper or lower bound]** for the true effect. --- # Other simulations - Test **.darkorange[regression to the mean]** under no effect: - Vary autocorrelation of `\(X_i(t)\)` (low vs. high) - `\(X_0(t)\)` and `\(X_1(t)\)` come from the same or different distributions. <img src="mbennett_did_files/figure-html/res3-1.svg" style="display: block; margin: auto;" /> --- background-position: 50% 50% class: left, bottom, inverse .big[ Application ] --- # Preferential Voucher Scheme in Chile - Universal **.darkorange[flat voucher]** scheme `\(\stackrel{\mathbf{2008}}{\mathbf{\longrightarrow}}\)` Universal + **.darkorange[preferential voucher]** scheme - Preferential voucher scheme: - Targeted to the 40% most vulnerable students - Additional 50% of voucher per student - Additional money for concentration of SEP students. 
-- <br/> .pull-left[ .box-3b.medium.sp-after-half[Students:<br/>- Verify SEP status<br/>- Attend a SEP school] ] .pull-right[ .box-6b.medium.sp-after-half[Schools:<br/>- Opt into the policy<br/>- No selection, no fees<br/>- Resources ~ performance] ] --- # Impact of the SEP policy - **.darkorange[Positive impact on test scores]** for lower-income students (Aguirre, 2019; Neilson, 2016) - Design could have **.darkorange[increased]** socioeconomic segregation - Incentives for concentration of SEP students - Key decision variables for schools: Performance, current SEP students, competition, add-on fees. - **.darkorange[Diff-in-diff (w.r.t. 2007) for SEP and non-SEP schools]**: - Only for **.darkorange[private-subsidized schools]** - Matching between 2005-2007 --> Effect estimated for 2008-2011 - Outcome: Average students' household income --- # Before Matching .pull-left[ <img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/DD/bb_20201202/images/pta_all.svg" alt="diagram" width="800"/> ] .pull-right[ <img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/DD/bb_20201202/images/dd_all.svg" alt="diagram" width="800"/> ] --- # Matching + DD - **.darkorange[Prior to matching]**: No parallel pre-trend, covariates evolve differently for both groups. - **.darkorange[Different types of schools]**: - Schools that charge high co-payment fees. - Schools with a low number of SEP students enrolled. - **.darkorange[MIP Matching]** using constant or "sticky" covariates: - Mean balance (0.05 SD): Rural, enrollment, number of schools in county, charges add-on fees - Fine balance: Test scores, monthly average voucher. 
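---
# Checking mean balance (sketch)

A minimal sketch of how the 0.05 SD mean-balance tolerance on the previous slide could be checked after matching. The data frame `schools`, its simulated columns, and the helper `std_mean_diff()` are hypothetical, and dividing by the pooled SD is an assumption (the slides do not say which SD the tolerance refers to). This only diagnoses balance; it is not the MIP matching step itself.

```r
# Standardized mean difference (SMD) for each covariate:
# difference in group means divided by the pooled SD (assumed denominator).
std_mean_diff <- function(data, treat, covs) {
  sapply(covs, function(v) {
    x1 <- data[[v]][data[[treat]] == 1]
    x0 <- data[[v]][data[[treat]] == 0]
    pooled_sd <- sqrt((var(x1) + var(x0)) / 2)
    (mean(x1) - mean(x0)) / pooled_sd
  })
}

# Hypothetical school-level data; `sep` marks SEP (treated) schools.
set.seed(1)
n <- 500
schools <- data.frame(sep = rbinom(n, 1, 0.5))
schools$enrollment <- rnorm(n, mean = 300 + 40 * schools$sep, sd = 100)
schools$rural <- rbinom(n, 1, 0.2 + 0.1 * schools$sep)

# A covariate meets the mean-balance tolerance if |SMD| <= 0.05.
smd <- std_mean_diff(schools, "sep", c("enrollment", "rural"))
balanced <- abs(smd) <= 0.05
print(round(smd, 3))
print(balanced)
```

Running the same check on the matched sample would show whether each mean-balance covariate falls within the tolerance.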
--- # After matching .pull-left[ <img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/DD/bb_20201202/images/pta_match.svg" alt="diagram" width="800"/> ] .pull-right[ <img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/DD/bb_20201202/images/dd_match.svg" alt="diagram" width="800"/> ] --- # Results - **.darkorange[Matched schools]**: - More vulnerable and with lower test scores than the population mean. - **.darkorange[6% increase in the income gap]** between SEP and non-SEP schools in matched DD: - SEP schools attracted even more vulnerable students. - Non-SEP schools increased their average family income. -- - There is a need to **.darkorange[evaluate the policy as a whole]**. - Unintended consequences also matter. --- background-position: 50% 50% class: left, bottom, inverse .big[ Let's wrap it up ] --- # Conclusions .pull-left[ - **.darkorange[Matching can be an important tool to address violations in PTA]**. - It is relevant to consider whether groups come from the **.darkorange[same]** or **.darkorange[different]** populations. - **.darkorange[Serial correlation]** also plays an important role: Don't match on random noise.] .pull-right[ ![](https://media.giphy.com/media/drwxYI2fxqQGqRZ9Pe/giphy.gif) ] --- # Conclusions .pull-left[ - **.darkorange[Matching can be an important tool to address violations in PTA]**. - It is relevant to consider whether groups come from the **.darkorange[same]** or **.darkorange[different]** populations. - **.darkorange[Serial correlation]** also plays an important role: Don't match on random noise. .box-7.medium.sp-after-half[Match well and match smart! 
]] .pull-right[ ![](https://media.giphy.com/media/drwxYI2fxqQGqRZ9Pe/giphy.gif) ] --- background-position: 50% 50% class: center,middle, inverse #A Difference-in-Differences Approach<br/>using Mixed-Integer Programming Matching ##Magdalena Bennett --- # Time-invariant Covariates .box-1a.medium.sp-after-half[S1: Time-invariant covariate effect] .small[ `$$X_i \stackrel{ind}{\sim} N(m(z_i),v(z_i))$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t),1)$$`] -- .box-2a.medium.sp-after-half[S2: Time-varying covariate effect] .small[ `$$X_i \stackrel{ind}{\sim} N(m(z_i),v(z_i))$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`] -- .box-3a.medium.sp-after-half[S3: Treatment-independent covariate] .small[ `$$X_i \stackrel{ind}{\sim} N(1,1)$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`] --- # Time-varying Covariates .box-4a.medium.sp-after-half[S4: Parallel evolution] .small[ `$$X_{it} = x_{(t-1)i} + m_1(t)\cdot z$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`] -- .box-6a.medium.sp-after-half[S5: Evolution differs by group] .small[ `$$X_{it} = x_{(t-1)i} + m_2(z_i,t)\cdot z$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`] -- .box-7a.medium.sp-after-half[S6: Evolution diverges in post] .small[ `$$X_{it} = x_{(t-1)i} + m_1(t)\cdot z - m_3(z_i,t)$$` `$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`] --- # Covariate evolution: Time-invariant .center[ <img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/DD/bb_20201202/images/cov_1.svg" alt="diagram" width="900"/>] --- # Covariate evolution: Time-varying .center[ <img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/DD/bb_20201202/images/cov_2.svg" alt="diagram" width="900"/>]
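---
# Simulating S1 (sketch)

A minimal sketch of scenario S1 estimated with the simple model `lm(y ~ a*p + t)`, using N = 1,000, T = 10, and T_0 = 5 from the parameter table. The functional forms `m(z) = 1 + 0.5z`, `v(z) = 1`, `f(t) = 0.2t`, and `u_i ~ N(0, 1)` are assumptions made for illustration, not the values used in the paper. Time `t` enters as a linear term here, which is also an assumption.

```r
set.seed(2021)
n_units <- 1000   # N
n_periods <- 10   # T
t0 <- 5           # last pre-intervention period

# Unit-level draws: treatment group, time-invariant covariate, unit effect.
z <- rbinom(n_units, 1, 0.5)
x <- rnorm(n_units, mean = 1 + 0.5 * z, sd = 1)  # assumed m(z), v(z)
u <- rnorm(n_units)                              # assumed u_i ~ N(0, 1)

# Long panel: one row per unit and period.
panel <- expand.grid(id = seq_len(n_units), t = seq_len(n_periods))
panel$a <- z[panel$id]                  # treatment-group indicator
panel$p <- as.integer(panel$t > t0)     # post-intervention indicator
panel$treat <- panel$a * panel$p        # treat_it

# S1 outcome: g(x, t) = 0, so the covariate effect is constant over time.
mu <- 1 + panel$a + panel$treat + u[panel$id] + x[panel$id] + 0.2 * panel$t
panel$y <- rnorm(nrow(panel), mean = mu, sd = 1)

# Simple DD: the coefficient on a:p is the DD estimate.
fit <- lm(y ~ a * p + t, data = panel)
dd_hat <- unname(coef(fit)["a:p"])
print(dd_hat)
```

Because the coefficient on `treat_it` in the outcome equation is 1, `dd_hat` should land close to 1 under S1, even though `x` differs by group.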