class: center, middle, inverse, title-slide

# A Difference-in-Differences Approach<br/>using Mixed-Integer Programming Matching

## Magdalena Bennett

### SDS Seminar Series, UT Austin
Oct 16, 2020

---

# Diff-in-Diff as an identification strategy

<img src="mbennett_did_files/figure-html/dd-1.svg" style="display: block; margin: auto;" />

---

# Diff-in-Diff as an identification strategy

<img src="mbennett_did_files/figure-html/dd2-1.svg" style="display: block; margin: auto;" />

---

# Diff-in-Diff as an identification strategy

<img src="mbennett_did_files/figure-html/dd3-1.svg" style="display: block; margin: auto;" />

---

# Very popular for policy evaluation

<img src="mbennett_did_files/figure-html/gg-1.svg" style="display: block; margin: auto;" />

.source[Source: Google Scholar]

---

# What about parallel trends?

.pull-left[
![](https://raw.githubusercontent.com/maibennett/website_github/master/exampleSite/content/images/data_comic.jpg)]

.pull-right[
- Can matching solve this?
- It's complicated (?) .small[(Zeldow & Hatfield, 2019; Lindner & McConnell, 2018; Daw & Hatfield, 2018 (x2); Ryan, 2018; Ryan et al., 2018)]
- Most work has focused on **.purple[matching outcomes]**]

---

# This paper

- Identify contexts in which matching can recover causal estimates under **violations of the parallel-trend assumption**.
- Use **mixed-integer programming (MIP) matching** to balance covariates directly.
- Matching for **panel** and **repeated cross-sectional** data.

--

<br/>

.pull-left[
.box-3.medium.sp-after-half[Simulations:<br/>Different DGP scenarios]
]

.pull-right[
.box-6.medium.sp-after-half[Application:<br/>School segregation & vouchers]
]

---
background-position: 50% 50%
class: left, bottom, inverse

.big[
Let's get started
]

---

# DD Setup

- Let `\(Y^z_i(t)\)` be the potential outcome for unit `\(i\)` in period `\(t\)` under treatment `\(z\)`.
- Intervention implemented in `\(T_0\)` `\(\rightarrow\)` No units are treated in `\(t\leq T_0\)`
- Difference-in-Differences (DD) focuses on ATT for `\(t>T_0\)`:
`$$ATT = E[Y_i^1(t) - Y_i^0(t)|Z=1]$$`
- **.purple[Assumptions for DD]**:
  - Parallel-trend assumption (PTA)
  - Common shocks

`$$E[Y_i^0(1) - Y_i^0(0) | Z=1] = E[Y_i^0(1) - Y_i^0(0) | Z=0]$$`

---

# DD Setup (cont.)

- Under these assumptions:

$$
`\begin{align}
\hat{\tau}^{DD} = &\color{#900DA4}{\overbrace{\color{black}{E[Y(1)|Z=1] - E[Y(1)|Z=0]}}^{\color{#900DA4}{\Delta_{post}}}} - \\
&\color{#F89441}{\underbrace{\color{black}{(E[Y(0)|Z=1] - E[Y(0)|Z=0])}}_{\color{#F89441}{\Delta_{pre}}}}
\end{align}`
$$

- Where `\(t=0\)` and `\(t=1\)` are the pre- and post-intervention periods, respectively.
- `\(Y(t) = Y^1(t)\cdot Z + (1-Z)\cdot Y^0(t)\)` is the observed outcome.

---

# Violations of the PTA

.pull-left[
- Under PTA, `\(g_1(t) = g_0(t) + h(t)\)`, where:
  - `\(g_z(t) = E[Y^0_i(t) | Z=z, T=t]\)`
  - `\(h(t) = \alpha\)`
- Bias in a DD setting depends on the structure of `\(h(t)\)`.
- Confounding in DD affects **.purple[trends]**, not **.purple[levels]**.
- Contextual knowledge is important!
  - Do groups come from different populations?
]

.pull-right[
![](https://media.giphy.com/media/L8yQ0RQBItqso/giphy.gif)
]

---

# How do we match?

- Match covariates or outcomes? Levels or trends?
- Use of **.purple[MIP Matching]** .small[(Zubizarreta, 2015; Bennett, Zubizarreta, & Vielma, 2020)]:
  - Balances covariates directly
  - Yields the largest matched sample under the balancing constraints
  - Uses template matching to match multiple groups
  - Works with large samples

---

# Panel or repeated cross-sections?
- **Panel data:** Straightforward
- **Repeated cross-section data:** Representative template matching

.center[
<img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/sds_20201016/images/diagram3v2.svg" alt="diagram" width="600"/>]

---
background-position: 50% 50%
class: left, bottom, inverse

.big[
Simulations
]

---

# Different scenarios

.box-1.medium.sp-after-half[S1: Time-invariant covariate effect]

.box-2.medium.sp-after-half[S2: Time-varying covariate effect]

.box-3.medium.sp-after-half[S3: Treatment-independent covariate]

.box-4.medium.sp-after-half[S4: Parallel evolution]

.box-6.medium.sp-after-half[S5: Evolution differs by group]

.box-7.medium.sp-after-half[S6: Evolution diverges in post]

.source[Following Zeldow & Hatfield (2019)]

---

# Different ways to control

<div class="center"><table>
<thead>
<tr>
<th>Model</th>
<th>Pseudo <code class="remark-inline-code">R</code> code</th>
</tr>
</thead>
<tbody>
<tr>
<td>Simple</td>
<td><code class="remark-inline-code">lm(y ~ a*p + t)</code> </td>
</tr>
<tr>
<td>Covariate Adjusted (CA)</td>
<td><code class="remark-inline-code">lm(y ~ a*p + t + x)</code> </td>
</tr>
<tr>
<td>Time-Varying Adjusted (TVA)</td>
<td><code class="remark-inline-code">lm(y ~ a*p + t*x)</code> </td>
</tr>
<tr>
<td>Match on pre-treat outcomes</td>
<td><code class="remark-inline-code">lm(y ~ a*p + t, data=out.match)</code> </td>
</tr>
<tr>
<td>Match on pre-treat 1st diff</td>
<td><code class="remark-inline-code">lm(y ~ a*p + t, data=out.lag.match)</code> </td>
</tr>
<tr>
<td>Match on pre-treat cov (PS)</td>
<td><code class="remark-inline-code">lm(y ~ a*p + t, data=cov.match)</code> </td>
</tr>
<tr>
<td id="highlight">Match on pre-treat cov (MIP)</td>
<td id="highlight"><code class="remark-inline-code">Event study (data=cov.match.mip)</code></td>
</tr>
<tr>
<td id="highlight">Match on all cov (MIP)</td>
<td id="highlight"><code class="remark-inline-code">Event study (data=cov.match.mip.all)</code></td>
</tr>
</tbody>
</table>
</div>

.bottom[
.source[Following Zeldow & Hatfield (2019)]]

---

# Time-invariant Covariates

.box-1a.medium.sp-after-half[S1: Time-invariant covariate effect]

.small[
`$$X_i \stackrel{ind}{\sim} N(m(z_i),v(z_i))$$`
`$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t),1)$$`]

--

.box-2a.medium.sp-after-half[S2: Time-varying covariate effect]

.small[
`$$X_i \stackrel{ind}{\sim} N(m(z_i),v(z_i))$$`
`$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`]

--

.box-3a.medium.sp-after-half[S3: Treatment-independent covariate]

.small[
`$$X_i \stackrel{ind}{\sim} N(1,1)$$`
`$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`]

---

# Time-varying Covariates

.box-4a.medium.sp-after-half[S4: Parallel evolution]

.small[
`$$X_{it} = x_{(t-1)i} + m_1(t)\cdot z$$`
`$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`]

--

.box-6a.medium.sp-after-half[S5: Evolution differs by group]

.small[
`$$X_{it} = x_{(t-1)i} + m_2(z_i,t)\cdot z$$`
`$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`]

--

.box-7a.medium.sp-after-half[S6: Evolution diverges in post]

.small[
`$$X_{it} = x_{(t-1)i} + m_1(t)\cdot z - m_3(z_i,t)$$`
`$$Y_i(t) \stackrel{ind}{\sim} N(1+z_i+treat_{it}+u_i+x_i+f(t)+g(x_i,t),1)$$`]

---

# Covariate evolution: Time-invariant

.center[
<img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/sds_20201016/images/cov_1.svg" alt="diagram" width="900"/>]

---

# Covariate evolution: Time-varying

.center[
<img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/sds_20201016/images/cov_2.svg" alt="diagram" width="900"/>]

---

# Parameters:

.center[
Parameter | Value
-------------------------------------|----------------------------------------------
Number of obs (N) | 1,000
`Pr(Z=1)` | 0.5
Time periods (T) | 10
Last pre-intervention period (T_0) | 5
Matching PS | Nearest neighbor
MIP Matching tolerance | 0.05 SD
Number of simulations | 1,000
]

- Estimates are compared to the sample ATT (_which differs for matched samples_)
- When matching on post-treatment covariates `\(\rightarrow\)` estimates are compared with the direct effect `\(\tau\)`

---

# Results: Time-constant effects

<img src="mbennett_did_files/figure-html/res1-1.svg" style="display: block; margin: auto;" />

---

# Results: Time-varying effects

<img src="mbennett_did_files/figure-html/res2-1.svg" style="display: block; margin: auto;" />

---

# Other simulations

- Test **.purple[regression to the mean]** under no effect:
  - Vary autocorrelation of `\(X_i(t)\)` (low vs. high)
  - `\(X_0(t)\)` and `\(X_1(t)\)` come from the same or different distributions.

<img src="mbennett_did_files/figure-html/res3-1.svg" style="display: block; margin: auto;" />

---
background-position: 50% 50%
class: left, bottom, inverse

.big[
Application
]

---

# Preferential Voucher Scheme in Chile

- Universal **flat voucher** scheme `\(\stackrel{\mathbf{2008}}{\mathbf{\longrightarrow}}\)` Universal + **preferential voucher** scheme
- Preferential voucher scheme (SEP):
  - Targeted to bottom 40% of vulnerable students
  - Additional 50% of voucher per student
  - Additional money for concentration of SEP students.

--

<br/>

.pull-left[
.box-3b.medium.sp-after-half[Students:<br/>- Verify SEP status<br/>- Attend a SEP school]
]

.pull-right[
.box-6b.medium.sp-after-half[Schools:<br/>- Opt into the policy<br/>- No selection, no fees<br/>- Resources ~ performance]
]

---

# Impact of the SEP policy

- **Positive impact on test scores** for lower-income students (Aguirre, 2019; Nielson, 2016)
- Design could have **increased** socioeconomic segregation
  - Incentives for concentration of SEP students
- Key decision variables: Performance, current SEP students, competition, add-on fees.
- **Diff-in-diff (w.r.t.
2007) for SEP and non-SEP schools**:
  - Only for **.purple[private-subsidized schools]**
  - Matching between 2005-2007 --> Effect estimated for 2008-2011
  - Outcome: Average household income of students

---

# Before Matching

.pull-left[
<img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/sds_20201016/images/dd_all.svg" alt="diagram" width="800"/>
]

.pull-right[
- No (pre) parallel trend
- Covariates evolve differently in the pre-intervention period
]

---

# [Pre] parallel trends

.pull-left[
<img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/sds_20201016/images/pta_all.svg" alt="diagram" width="800"/>
]

.pull-right[
<img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/sds_20201016/images/pta_match.svg" alt="diagram" width="800"/>
]

---

# After Matching

.pull-left[
<img src="https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/sds_20201016/images/dd_match.svg" alt="diagram" width="800"/>
]

.pull-right[
- **MIP Matching**:
  - Mean balance (0.05 SD): Rural, enrollment, number of schools in county, charges add-on fees
  - Fine balance: Test scores, monthly average voucher.
- **6% increase in the income gap** between SEP and non-SEP schools
]

---
background-position: 50% 50%
class: left, bottom, inverse

.big[
Let's wrap it up
]

---

# Conclusions

.pull-left[
- **Matching can be an important tool to address violations of the PTA**.
- It is important to consider whether groups come from the **.purple[same]** or **.purple[different]** populations.
- **Serial correlation** also plays an important role: Don't match on random noise.
- Be **flexible** when estimating effects (event study)

.box-7.medium.sp-after-half[Match well and match smart!
]]

.pull-right[
![](https://media.giphy.com/media/drwxYI2fxqQGqRZ9Pe/giphy.gif)
]

---
background-position: 50% 50%
class: center,middle, inverse

# A Difference-in-Differences Approach<br/>using Mixed-Integer Programming Matching

## Magdalena Bennett
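
---

# Appendix: DD bias when the PTA fails

A minimal worked sketch of why the bias depends on the structure of `\(h(t)\)`, using the observed-outcome definition from the DD setup and the `\(g_z(t)\)` notation, with two periods (`\(t=0\)` pre, `\(t=1\)` post). Assumptions made here for illustration only: no unit is treated (or anticipates treatment) at `\(t=0\)`, `\(g_z(t) = E[Y^0(t)|Z=z]\)`, and `\(h(t) = g_1(t) - g_0(t)\)` is allowed to vary with `\(t\)`.

$$
`\begin{align}
\Delta_{post} &= E[Y^1(1)|Z=1] - E[Y^0(1)|Z=0] = ATT + g_1(1) - g_0(1) \\
\Delta_{pre} &= E[Y^0(0)|Z=1] - E[Y^0(0)|Z=0] = g_1(0) - g_0(0) \\
\hat{\tau}^{DD} &= \Delta_{post} - \Delta_{pre} = ATT + h(1) - h(0)
\end{align}`
$$

- Under the PTA, `\(h(t) = \alpha\)`, so `\(h(1) - h(0) = 0\)` and DD recovers the ATT.
- If, for example, `\(h(t) = \alpha + \beta t\)`, the bias is `\(\beta\)`: a gap in levels (`\(\alpha\)`) cancels, a gap in trends does not.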