How far is too far? Generalization of a regression discontinuity design away from the cutoff

class: center, middle, inverse, title-slide

# How far is too far? <br/>Generalization of a regression discontinuity design away from the cutoff
## Magdalena Bennett<br/>The University of Texas at Austin
### UC San Diego Econometrics Seminar<br/>Apr 20, 2021

---

# Regression discontinuity design

---
# Very popular design for causal inference

---
# Strong internal validity

---
# ... But limited external validity

---
# Missing data and no overlap

---
# Can we find a generalization interval?

---
# This paper

**Identification of a generalization interval and estimation of the ATT for the population within that interval:**
--

- The pre-intervention period informs the generalization bandwidth <br>
  .tiny[(Wing & Cook, 2013; Keele, Small, Hsu, & Fogarty, 2020)]

- Predictive covariates are used to break the link between running variable and outcome <br>
  .tiny[(Angrist & Rokkanen, 2015; Rokkanen, 2015; Keele, Titiunik, & Zubizarreta, 2015)]

- Builds on the idea of local randomization near the cutoff <br>
  .tiny[(Lee, 2008; Cattaneo, Frandsen, & Titiunik, 2015)]
  
---
# This paper

**Main advantages:**
--

- Gradual approach
  - No need for "All or Nothing"
  - Interval informed by the data .tinylist[(Cattaneo et al., 2015)]
--

- No extrapolation of population characteristics
  - Compare like-to-like .tinylist[(Rosenbaum, 1987)]
  - Makes overlap region explicit
--

- Generalization to population of interest
  - Use of representative template matching .tinylist[(Silber et al., 2014; Bennett, Vielma, & Zubizarreta, 2020)]
--

- Sensitivity analysis to hidden bias .tiny[(Rosenbaum, 2010; Keele et al., 2020)]

---
# Outline

1. Motivation
<br>
<br>
2. Generalized Regression Discontinuity Design (GRD)

  2.1 Framework

  2.2 GRD in practice
<br>
<br>
3. Application: Free Higher Education in Chile
<br>
<br>
4. Conclusions

---

background-position: 50% 50%
class: left,middle, inverse, nonum
.big[
Generalized Regression Discontinuity Design
]

---
# Generalized Regression Discontinuity Design (GRD)

--
Two-part problem with pre- and post-intervention periods:
<br>

--
<br>
<span class="box-5">1) Identification of generalization interval H<sup>\*</sup></span>

.box-5t[(using pre-intervention period)]

--
<br>
<span class="box-5">2) Estimation of ATT for population within H<sup>\*</sup></span>

.box-5t[(using post-intervention period)]

---
# The setup

- **Two periods**: pre- and post-intervention, `$t=0$` and `$t=1$`.

- Running variable `$R$` determines assignment `$Z$` in `$t=1$`. E.g.:
`$$Z_{it} = \mathrm{I}(R_{it}<c)$$`

- Potential outcomes under treatment status `$z \in \{0,1\}$`:
`$$Y^{(z)}_{it} = g_z(\mathbf{X}_{it},\mathbf{u}_{it},r_{it}) + z_{it}\cdot \underbrace{\tau(\mathbf{X}_{it},\mathbf{u}_{it},r_{it})}_{\color{#E16462}{\style{font-family:inherit}{\text{Treat. Effect}}}} + \underbrace{\alpha_t}_{\color{#E16462}{\style{font-family:inherit}{\text{Period FE}}}}$$`
  - `$\mathbf{X}_{it}$`: Predictive covariates
  
  - `$\mathbf{u}_{it}$`: Unobserved confounders
  
  - `$\tau(\cdot)$`: Causal effect
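
A minimal Python sketch of this setup, for illustration only. The linear outcome function, the constant treatment effect, the way `r` is generated from `x`, and the names `simulate_grd`, `tau`, and `alpha` are all assumptions made here, not part of the talk:

```python
import numpy as np

def simulate_grd(n=2000, cutoff=0.0, tau=1.0, alpha=(0.0, 0.5), seed=0):
    """Simulate a two-period panel in the spirit of the setup above.

    Hypothetical choices: linear g, constant tau, and a running
    variable that depends on X plus noise.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                        # predictive covariate X
    u = rng.normal(size=n)                        # unobserved component u
    r = 0.8 * x + rng.normal(scale=0.6, size=n)   # running variable R
    data = {}
    for t in (0, 1):
        # Assignment Z = I(R < c) only switches on in the post period t = 1
        z = (r < cutoff).astype(int) if t == 1 else np.zeros(n, dtype=int)
        # Outcome: g(X, u) + z * tau + period effect alpha_t
        y = 2.0 * x + u + z * tau + alpha[t]
        data[t] = {"x": x, "r": r, "z": z, "y": y}
    return data
```

Calling `simulate_grd()` returns a dictionary keyed by period, where nobody is treated in `t = 0` and units below the cutoff are treated in `t = 1`.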
  
---
# Two periods for GRD

<img src="mbennett_grd_files/figure-html/grd_setup-1.svg" style="display: block; margin: auto;" />
---
# A gradual approach

- Conditional expectations of potential outcomes, `$Y^{(z)}_t(R)$`:
`$$Y^{(0)}_0(R) = \mathbb{E}[Y^{(0)}_{i0}|R] = \mu_0(R)$$`
`$$Y^{(1)}_0(R) = \mathbb{E}[Y^{(1)}_{i0}|R] = \underbrace{\mu_0(R)}_{\color{#E16462}{\style{font-family:inherit}{\text{Avg. Outcome by R}}}} + \underbrace{\tau_0(R)}_{\color{#E16462}{\style{font-family:inherit}{\text{Treat. Effect by R}}}}$$`

- Identify generalization interval `$H = [H_{-},H_{+}]$` for `$t=0$`:
`$$R_{i} = h(\mathbf{X}_{i}) + \eta_i \ \ \forall \ R_i \in H$$`
--

- If `$H^* = \arg\max_{H} |H|$` exists, then for a set of covariates `$\mathbf{X} = \mathbf{X}_T$`:
`$$Y^{(0)}_0(R')|\mathbf{X}_T = Y^{(0)}_0(R'')|\mathbf{X}_T \ \ \ \ \ \ \style{font-family:inherit}{\text{for any}} \ R', R'' \in H^*$$`
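
One rough way to picture the gradual search for `$H^*$` in the pre-intervention period is sketched below. It is a simplified regression-based stand-in and not necessarily the procedure used in the paper: the window is widened step by step and the search stops once `$R$` shows a detectable linear association with the outcome given `$\mathbf{X}$`. The function names, the symmetric window, the linear test, and the 1.96 threshold are all assumptions:

```python
import numpy as np

def r_tstat(y, x, r):
    """t-statistic on R in an OLS regression of Y on (1, X, R)."""
    design = np.column_stack([np.ones_like(r), x, r])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)   # classical OLS covariance
    return beta[2] / np.sqrt(cov[2, 2])

def widest_interval(y, x, r, cutoff, widths, crit=1.96):
    """Largest half-width h such that, inside [cutoff - h, cutoff + h],
    R has no detectable linear association with Y given X.
    Returns None if no candidate width passes."""
    best = None
    for h in sorted(widths):
        inside = np.abs(r - cutoff) <= h
        if inside.sum() < 30:      # too few units to run the check
            continue
        if abs(r_tstat(y[inside], x[inside], r[inside])) < crit:
            best = h               # window still looks "as good as flat" in R
        else:
            break                  # stop at the first rejection
    return best
```

Here `y`, `x`, and `r` would be pre-intervention arrays, and `widths` a grid of candidate half-widths around the cutoff.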

---
# Conditional outcome within the generalization interval

<img src="mbennett_grd_files/figure-html/grd_setup_pre-1.svg" style="display: block; margin: auto;" />
---
# Main assumption for generalization to t=1

<br>
<br>
.box-0[<b>Assumption: Conditional time-invariance under control</b>
`$$Y^{(0)}_0(R|\mathbf{X}) = Y_1^{(0)}(R|\mathbf{X}) + \alpha, \ \ \ \forall R \in H^*$$`]

- No changes in unobserved confounders between `$t=0$` and `$t=1$` for units within `$H^\ast$`

- Partially testable for `$Z=0$` in `$t=1$`
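
A sketch of what the assumption delivers, reading `$\alpha$` as common to all units in `$H^\ast$` (an interpretation; `$\tau_1$` is notation introduced here for the effect in `$t=1$`):

- For units with `$Z=0$` in `$t=1$`, both sides are observed, so `$\alpha$` can be recovered and its constancy checked:
`$$\alpha = Y^{(0)}_0(R|\mathbf{X}) - Y^{(0)}_1(R|\mathbf{X}), \ \ \ R \in H^*$$`

- For units with `$Z=1$` in `$t=1$`, the missing control outcome is replaced by its pre-intervention counterpart:
`$$\tau_1(R|\mathbf{X}) = Y^{(1)}_1(R|\mathbf{X}) - Y^{(0)}_1(R|\mathbf{X}) = Y^{(1)}_1(R|\mathbf{X}) - Y^{(0)}_0(R|\mathbf{X}) + \alpha$$`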

---
# Estimating an effect away from the cutoff

---

background-position: 50% 50%
class: left,middle, inverse, nonum
.big[
GRD in practice
]

---
# Context: Traditional matching

.center[
![:scale 60%](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/RD/UCSD_20210420/images/diagram1_v2.svg)
]

---
# Context: Representative template matching with two samples

.center[
![:scale 60%](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/RD/UCSD_20210420/images/diagram2_v3.svg)
]

---
# Context: Representative template matching

.center[
![:scale 60%](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/RD/IC_20210312/images/diagram3v2.svg)
]

---
# Diagram for GRD

.center[
![:scale 80%](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/RD/IC_20210312/images/diagram_grd.png)
]
---
# Step 0: Identification of narrow interval

.pull-left[
.center[
![](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/RD/IC_20210312/images/diagram_grd1.png)
]
]

.pull-right[
<img src="mbennett_grd_files/figure-html/grd1-1.svg" style="display: block; margin: auto;" />
]

---
# Step 1: Template selection

.pull-left[
.center[
![](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/RD/IC_20210312/images/diagram_grd2.png)
]
]

.pull-right[
<img src="mbennett_grd_files/figure-html/grd2-1.svg" style="display: block; margin: auto;" />
]