Source: Google Scholar
Can matching work to solve this?
Most work has focused on matching outcomes
Identify contexts where matching can recover causal estimates under violations of the parallel-trend assumption.
Use mixed-integer programming (MIP) matching to balance covariates directly.
Matching for panel and repeated cross-sectional data.
Simulations:
Different DGP scenarios
Application:
School segregation & vouchers
Let's get started
Let $Y_i^z(t)$ be the potential outcome for unit $i$ in period $t$ under treatment $z$.
Intervention implemented in $T_0$ → No units are treated in $t \leq T_0$
Difference-in-Differences (DD) focuses on the ATT for $t > T_0$: $ATT = E[Y_i^1(t) - Y_i^0(t) \mid Z = 1]$
Assumptions for DD:
Parallel-trend assumption (PTA)
Common shocks
$$E[Y_i^0(1) - Y_i^0(0) \mid Z = 1] = E[Y_i^0(1) - Y_i^0(0) \mid Z = 0]$$
Under these assumptions:
$$\hat{\tau}^{DD} = \underbrace{E[Y(1) \mid Z = 1] - E[Y(1) \mid Z = 0]}_{\Delta_{post}} - \underbrace{\left(E[Y(0) \mid Z = 1] - E[Y(0) \mid Z = 0]\right)}_{\Delta_{pre}}$$
Where $t = 0$ and $t = 1$ are the pre- and post-intervention periods, respectively.
$Y(t) = Y^1(t) \cdot Z + (1 - Z) \cdot Y^0(t)$ is the observed outcome.
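As a minimal sketch of the estimator above, the two-period DD can be computed from four group-by-period means. The data frame `d` and its values are hypothetical and only illustrate the arithmetic; `z` is the treatment-group indicator and `t` the period.

```r
# Toy 2x2 data: hypothetical outcomes (not from the slides).
# z = 1 for treated units, t = 1 for the post-intervention period.
d <- data.frame(
  z = c(0, 0, 0, 0, 1, 1, 1, 1),
  t = c(0, 0, 1, 1, 0, 0, 1, 1),
  y = c(1, 3, 2, 4, 4, 6, 8, 10)
)

# Group-by-period means of the observed outcome Y(t).
m <- tapply(d$y, list(z = d$z, t = d$t), mean)

delta_post <- m["1", "1"] - m["0", "1"]  # E[Y(1)|Z=1] - E[Y(1)|Z=0]
delta_pre  <- m["1", "0"] - m["0", "0"]  # E[Y(0)|Z=1] - E[Y(0)|Z=0]
tau_dd     <- delta_post - delta_pre

# The same number is the interaction coefficient in a saturated regression.
fit    <- lm(y ~ z * t, data = d)
tau_lm <- unname(coef(fit)["z:t"])
```

In this toy example both `tau_dd` and `tau_lm` equal 3: the post-period gap (6) minus the pre-period gap (3).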
Under PTA, $g_1(t) = g_0(t) + h(t)$, where:
Bias in a DD setting depends on the structure of $h(t)$.
Confounding in DD affects trends, not levels.
Contextual knowledge is important!
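To see why the structure of $h(t)$ matters, here is a short derivation for the two-period case. It assumes that $g_z(t)$ denotes the mean untreated potential outcome in group $z$, i.e. $g_z(t) = E[Y_i^0(t) \mid Z = z]$; that definition is not stated in the text above, so treat it as an assumption of this sketch.

$$
\begin{aligned}
E[\hat{\tau}^{DD}] &= \left(ATT + g_1(1) - g_0(1)\right) - \left(g_1(0) - g_0(0)\right) \\
E[\hat{\tau}^{DD}] - ATT &= \left(g_1(1) - g_0(1)\right) - \left(g_1(0) - g_0(0)\right) = h(1) - h(0)
\end{aligned}
$$

Under this reading, a constant $h(t)$ (a pure level difference between groups) leaves DD unbiased, while any change in $h(t)$ between the pre and post periods carries over one-for-one into the estimate.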
Match covariates or outcomes? Levels or trends?
Use of MIP Matching (Zubizarreta, 2015; Bennett, Zubizarreta, & Vielma, 2020):
Balance covariates directly
Yield largest matched sample under balancing constraints
Use of template matching to match multiple groups
Works with large samples
Panel data: Straightforward
Repeated cross-section data: Representative template matching
Simulations
S1: Time-invariant covariate effect
S2: Time-varying covariate effect
S3: Treatment-independent covariate
S4: Parallel evolution
S5: Evolution differs by group
S6: Evolution diverges in post
Following Zeldow & Hatfield (2019)
Model | Pseudo R code |
---|---|
Simple | lm(y ~ a*p + t) |
Covariate Adjusted (CA) | lm(y ~ a*p + t + x) |
Time-Varying Adjusted (TVA) | lm(y ~ a*p + t*x) |
Match on pre-treat outcomes | lm(y ~ a*p + t, data=out.match) |
Match on pre-treat 1st diff | lm(y ~ a*p + t, data=out.lag.match) |
Match on pre-treat cov (PS) | lm(y ~ a*p + t, data=cov.match) |
Match on pre-treat cov (MIP) | Event study (data=cov.match.mip) |
Match on all cov (MIP) | Event study (data=cov.match.mip.all) |
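The difference between the pooled specifications and the event-study rows can be sketched on a noise-free toy panel. This assumes `a` is the treated-group indicator, `p` the post-period indicator, and `t` the period, which is how the pseudo-code reads but is not spelled out in the table; the data values are hypothetical.

```r
# Noise-free toy panel (hypothetical values) so coefficients are easy to read.
T0 <- 5                                      # last pre-intervention period
d  <- expand.grid(id = 1:4, t = 1:10)
d$a <- as.integer(d$id > 2)                  # assumed: treated-group indicator
d$p <- as.integer(d$t > T0)                  # assumed: post-period indicator
d$y <- 1 + d$a + 0.5 * d$t + 2 * d$a * d$p   # effect of 2 in every post period

# Simple DD from the table: one pooled post-period effect (a:p coefficient).
fit_dd <- lm(y ~ a * p + t, data = d)
tau_dd <- unname(coef(fit_dd)["a:p"])

# Event study: one group-by-period contrast per period, relative to period T0.
d$tf   <- relevel(factor(d$t), ref = as.character(T0))
fit_es <- lm(y ~ a * tf, data = d)
es     <- coef(fit_es)[grep("^a:tf", names(coef(fit_es)))]
```

Here `tau_dd` is 2, and `es` holds nine contrasts: roughly 0 for periods 1 to 4 and 2 for periods 6 to 10. Pre-period contrasts away from zero would signal diverging pre-trends, which the pooled `a:p` coefficient cannot show.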
S1: Time-invariant covariate effect
$$X_i \overset{ind}{\sim} N(m(z_i), v(z_i)) \qquad Y_i(t) \overset{ind}{\sim} N(1 + z_i + treat_{it} + u_i + x_i + f(t),\ 1)$$
S2: Time-varying covariate effect
$$X_i \overset{ind}{\sim} N(m(z_i), v(z_i)) \qquad Y_i(t) \overset{ind}{\sim} N(1 + z_i + treat_{it} + u_i + x_i + f(t) + g(x_i, t),\ 1)$$
S3: Treatment-independent covariate
$$X_i \overset{ind}{\sim} N(1, 1) \qquad Y_i(t) \overset{ind}{\sim} N(1 + z_i + treat_{it} + u_i + x_i + f(t) + g(x_i, t),\ 1)$$
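A minimal simulation sketch of scenario S1 follows. The sample sizes come from the parameter table; the functional forms are assumptions made only for illustration, since the slides do not give them: $m(z) = 1 + 0.5z$, $v(z) = 1$, $u_i \sim N(0, 1)$, $f(t) = 0.2t$, and $treat_{it} = z_i \cdot 1(t > T_0)$.

```r
set.seed(2020)
N   <- 1000   # number of units
n_t <- 10     # time periods
T0  <- 5      # last pre-intervention period

z <- rbinom(N, 1, 0.5)                      # Pr(Z = 1) = 0.5
x <- rnorm(N, mean = 1 + 0.5 * z, sd = 1)   # assumed m(z) = 1 + 0.5z, v(z) = 1
u <- rnorm(N)                               # assumed unit effect u_i ~ N(0, 1)

# Long panel: one row per unit and period.
d   <- expand.grid(id = 1:N, t = 1:n_t)
d$a <- z[d$id]                              # treated-group indicator
d$p <- as.integer(d$t > T0)                 # post-period indicator
d$x <- x[d$id]

treat <- d$a * d$p                          # assumed treat_it = z_i * 1(t > T0)
mu    <- 1 + d$a + treat + u[d$id] + d$x + 0.2 * d$t   # assumed f(t) = 0.2t
d$y   <- rnorm(nrow(d), mean = mu, sd = 1)

# "Simple" model from the table of estimators.
fit     <- lm(y ~ a * p + t, data = d)
tau_hat <- unname(coef(fit)["a:p"])
```

Because the covariate effect is time-invariant in S1, the group difference in $x_i$ cancels out of the DD contrast, so `tau_hat` should land close to the true effect of 1 even though `x` is not in the model.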
S4: Parallel evolution
$$X_{it} = x_{(t-1)i} + m_1(t) \cdot z \qquad Y_i(t) \overset{ind}{\sim} N(1 + z_i + treat_{it} + u_i + x_i + f(t) + g(x_i, t),\ 1)$$
S5: Evolution differs by group
$$X_{it} = x_{(t-1)i} + m_2(z_i, t) \cdot z \qquad Y_i(t) \overset{ind}{\sim} N(1 + z_i + treat_{it} + u_i + x_i + f(t) + g(x_i, t),\ 1)$$
S6: Evolution diverges in post
$$X_{it} = x_{(t-1)i} + m_1(t) \cdot z - m_3(z_i, t) \qquad Y_i(t) \overset{ind}{\sim} N(1 + z_i + treat_{it} + u_i + x_i + f(t) + g(x_i, t),\ 1)$$
Parameter | Value |
---|---|
Number of obs (N) | 1,000 |
Pr(Z=1) | 0.5 |
Time periods (T) | 10 |
Last pre-intervention period (T_0) | 5 |
Matching PS | Nearest neighbor |
MIP Matching tolerance | 0.05 SD |
Number of simulations | 1,000 |
Test regression to the mean under no effect:
Application
2008: Universal flat voucher scheme ⟶ Universal + preferential voucher scheme
Preferential voucher scheme (SEP):
Targeted to bottom 40% of vulnerable students
Additional 50% of voucher per student
Additional money for concentration of SEP students.
Students:
- Verify SEP status
- Attend a SEP school
Schools:
- Opt into the policy
- No selection, no fees
- Resources tied to performance
Positive impact on test scores for lower-income students (Aguirre, 2019; Neilson, 2016)
Design could have increased socioeconomic segregation
Key decision variables: Performance, current SEP students, competition, add-on fees.
Diff-in-diff (w.r.t. 2007) for SEP and non-SEP schools:
Only for private-subsidized schools
Matching on 2005-2007 → Effect estimated for 2008-2011
Outcome: Average students' household income
No (pre) parallel trend
Covariates evolve differently in the pre-intervention period
MIP Matching:
Mean balance (0.05 SD): Rural, enrollment, number of schools in county, charges add-on fees
Fine balance: Test scores, monthly average voucher.
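The 0.05 SD mean-balance tolerance can be checked after matching with a standardized difference in means. This is a post-hoc diagnostic sketch in base R, not the MIP matching step itself; the helper `std_diff` and the enrollment values are hypothetical.

```r
# Standardized difference in means between two groups for one covariate.
std_diff <- function(x, z) {
  pooled_sd <- sqrt((var(x[z == 1]) + var(x[z == 0])) / 2)
  (mean(x[z == 1]) - mean(x[z == 0])) / pooled_sd
}

# Hypothetical data: enrollment for 4 SEP schools (z = 1) and 4 non-SEP schools.
z <- c(1, 1, 1, 1, 0, 0, 0, 0)
enroll_unmatched <- c(100, 200, 300, 400, 300, 400, 500, 600)
enroll_matched   <- c(100, 200, 300, 400, 110, 190, 310, 390)

sd_before <- std_diff(enroll_unmatched, z)
sd_after  <- std_diff(enroll_matched, z)

tol <- 0.05                                  # mean-balance tolerance, in SD units
balanced_before <- abs(sd_before) <= tol
balanced_after  <- abs(sd_after)  <= tol
```

In this toy data the unmatched groups differ by about 1.5 SD, while the matched groups have identical means, so only the matched sample satisfies the tolerance.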
6% increase in the income gap between SEP and non-SEP schools
Let's wrap it up
Matching can be an important tool to address violations of the PTA.
It matters whether groups come from the same or different populations.
Serial correlation also plays an important role: Don't match on random noise.
Estimate effects flexibly (event study).
Match well and match smart!