class: inverse, center, middle

<br>

.left[
<h1 class="title-own">Beyond Exclusion:<br/>The Role of High-Stake Testing on Attendance the Day of the Test</h1>]

.pull-left-little_l[
<br>
<br>
<br>
<br>
<br>
<br>
.left[.small[Causal Inference Seminar, Salem Center<br>April 5th, 2022]]]

.pull-right-little_l[
<br>
.small[.right[Magdalena Bennett <br>*The University of Texas at Austin* ]]
<br>
.small[.right[Christopher Neilson <br>*Princeton University* ]]
<br>
.small[.right[Nicolás Rojas <br>*Columbia University* ]]
]

<br>
<br>
<br>

---

# Motivation

- Results from **.coolblue[high-stakes tests]** are widely used in education policy
  - E.g. funding, promotions, school closures, school choice, etc.

--

<br>
<br>

- **.coolblue[Assumption]**: Standardized tests used as a proxy of school quality

--

<br>

.box-3Trans[Is it so?]

---

# Motivation

.pull-left[
.center[
![:scale 75%](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/Attendance/Salem_20220405/images/test_scandal1.png)
]
]

--

.pull-right[
.center[
![:scale 75%](https://raw.githubusercontent.com/maibennett/presentations/main/content/presentations/Attendance/Salem_20220405/images/test_discrimination.png)
]
]

---

# Motivation

- Beyond explicit cheating and socioeconomic sorting: **.coolblue[Students' exclusion]**
  - E.g.: Reclassification of low-performers as students with disabilities .small[(Figlio & Loeb, 2011)]
  - Use of disciplinary measures to exclude low-performers .small[(Figlio, 2006)]

--

<br>
<br>

- Less attention on **.coolblue[non-representative attendance patterns]**
  - Differences between scores before and after imputation .small[(Cuesta et al., 2020)]

--

<br>
<br>

- Schools have **.coolblue[incentives]** to game the system
  - Especially in **.coolblue[high-accountability settings]**

---

# This paper

**.coolblue[Attendance Patterns]**

- Event study approach:
  - *What do these exclusion patterns look like?
Are these the same for every (type of) school and every grade?*
  - Focus beyond bottom performers
  - Robustness checks for alternative mechanisms

--

**.coolblue[Imputation Policies]**

- Machine learning prediction:
  - Consequences of blanket policies in imputation of scores
  - Identification of schools that are most likely gaming the system

---

# Outline

1. Motivation

2. Chilean educational context

3. Attendance patterns:
   - .small[Event study for different years, grades, and performance]
   - .small[Potential mechanisms]

4. Prediction approach:
   - .small[Difference between predicted and observed distributions]
   - .small[Potential consequences of imputation]

5. Conclusions and next steps

---
background-position: 50% 50%
class: left, middle, inverse

<br>
<br>
<br>
<br>
<br>
<br>
<br>

.big[
The Chilean Educational Context
]

---

# The Chilean context: Standardized testing

- Universal standardized testing since the 1980s (SIMCE)

--

<br>
<br>

- SIMCE as **.coolblue[high-stakes]** testing:
  - Results widely available in a universal voucher system
  - Tied to teachers' bonuses
  - Tied to budget restrictions and school closures

---

# SIMCE and absenteeism

- Use of **.coolblue[pre-filled communication]** for parents to be sent out by schools
  - Evidence that parents of lower-income students are less likely to receive information

--

<br>
<br>

- **.coolblue[No real consequences for low attendance]**:
  - Between 2005 and 2007, non-representative results were marked with symbols
  - No imputation strategy so far

--

<br>
<br>

- Improvement of regulation for **.coolblue[justifying student exclusion]**
  - E.g. specific disabilities (blindness) or non-Spanish speakers.

---
background-position: 50% 50%
class: left, middle, inverse

<br>
<br>
<br>
<br>
<br>
<br>
<br>

.big[
Attendance Patterns for the Day of the Test
]

---

# How to evaluate the effect of "day of the test" on absenteeism?
- Some studies assess the **.coolblue[effect of attendance manipulation]**:
  - Focus on distortions (difference between imputed and observed scores) .small[(Cuesta et al., 2020)]
  - Manipulation for specific vulnerable schools (SEP) to raise scores .small[(Feigenberg et al., 2019; Quezada & Hippel, 2017)]

--

<br>
<br>

- **.coolblue[This paper]**: Event study between 2011 and 2018 for all tested grades.
  - Focus on attendance by within-school performance
  - Use of an alternative non-high-stakes test to analyze potential mechanisms
  - Use of an unpublished survey on communication and incentives around SIMCE

---

# Data Available

- **.coolblue[Standardized tests 2011-2018 (SIMCE)]**
  - Scores at student and school level for different subjects (Math, Language, History, and Science)
  - Students' socioeconomic characterization (parental questionnaire)

--

<br>
<br>

- **.coolblue[Daily attendance data 2011-2018 (SIGE)]**
  - Used for voucher payments (each day has ~2.5 million records)

--

<br>
<br>

- **.coolblue[GPA Performance 2011-2018 (Rendimiento)]**
  - Use GPA performance deciles within school-grade

---

# Observations from our data

<table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'>
<caption>Data description</caption>
<thead>
<tr> <th style="text-align:left;"> Grade </th> <th style="text-align:center;"> Years tested </th> <th style="text-align:center;"> Num Schools </th> <th style="text-align:center;"> Num Students </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> 2 </td> <td style="text-align:center;"> 2013, 2014, 2015 </td> <td style="text-align:center;"> 5,266 </td> <td style="text-align:center;"> 628,073 </td> </tr>
<tr> <td style="text-align:left;"> 4 </td> <td style="text-align:center;"> 2011, 2013-2018 </td> <td style="text-align:center;"> 5,673 </td> <td style="text-align:center;"> 1,461,289 </td> </tr>
<tr> <td style="text-align:left;"> 6 </td> <td
style="text-align:center;"> 2013-2016, 2018 </td> <td style="text-align:center;"> 5,516 </td> <td style="text-align:center;"> 1,056,243 </td> </tr>
<tr> <td style="text-align:left;"> 8 </td> <td style="text-align:center;"> 2011, 2013-2015, 2017 </td> <td style="text-align:center;"> 5,545 </td> <td style="text-align:center;"> 1,078,140 </td> </tr>
<tr> <td style="text-align:left;"> 10 </td> <td style="text-align:center;"> 2013-2018 </td> <td style="text-align:center;"> 2,623 </td> <td style="text-align:center;"> 1,213,067 </td> </tr>
</tbody>
</table>

---

# Empirical approach for difference in attendance

- Event study centered around the day of the test:

`$$Y_{ipsgt} = \sum_{P=1}^5\sum_{T=-4}^5 \tau^{PT}D^{PTG^*}_{ipsgt} + \gamma_{pt} +\alpha_i + \epsilon_{ipsgt}$$`

where

- `\(Y_{ipsgt}\)`: Binary attendance for student `\(i\)`, from GPA group `\(p\)`, in school `\(s\)` and grade `\(g\)`, for day `\(t\)` (centered around the day of the test).

- `\(D^{PTG^*}_{ipsgt}\)`: Indicator variable `\(\mathrm{I(p = P, t = T, g = G^*)}\)`, where `\(G^*\)` is the tested grade. The coefficient of interest is `\(\tau^{P0}\)`, the coefficient on `\(D^{P0G^*}\)`.

---

# Clear difference in attendance by performance for 2nd grade

<img src="mbennett_attendance_files/figure-html/event_study_plot2nd-1.svg" style="display: block; margin: auto;" />

---

# No effect on lower performers for 10th grade

<img src="mbennett_attendance_files/figure-html/event_study_plot10th-1.svg" style="display: block; margin: auto;" />

---

# Attendance patterns differ by grade

<img src="mbennett_attendance_files/figure-html/event_study_plot-1.svg" style="display: block; margin: auto;" />

---

# Potential mechanisms that explain these patterns

- Schools directly **.coolblue[(dis)incentivize attendance of (lower) higher performers]**

<br>

- Students are **.coolblue[excluded due to other reasons]** (justified).
<br>

- Students experience a **.coolblue[disutility from testing]**

---

# Differences in communication and incentives between high and low performers

- Schools directly **.coolblue[(dis)incentivize attendance of (lower) higher performers]**
  - 2017 survey for students in test-taking grades.

<table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'>
<caption>Results for 4th Grade</caption>
<thead>
<tr> <th style="text-align:left;"> GPA Decile </th> <th style="text-align:center;"> Told </th> <th style="text-align:center;"> Notification </th> <th style="text-align:center;"> Preparation </th> <th style="text-align:center;"> Grades </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> D1 </td> <td style="text-align:center;"> -0.06*** </td> <td style="text-align:center;"> -0.11*** </td> <td style="text-align:center;"> -0.08*** </td> <td style="text-align:center;"> 0.14*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> </tr>
<tr> <td style="text-align:left;"> D10 </td> <td style="text-align:center;"> 0.06*** </td> <td style="text-align:center;"> 0.05*** </td> <td style="text-align:center;"> 0.05*** </td> <td style="text-align:center;"> -0.20*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> </tr>
<tr> <td style="text-align:left;"> Baseline </td> <td style="text-align:center;"> 0.89*** </td> <td style="text-align:center;"> 0.87*** </td> <td style="text-align:center;"> 0.89*** </td> <td style="text-align:center;"> 0.39*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td
style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> </tr>
</tbody>
</table>

---

# Differences in communication and incentives between high and low performers

- Schools directly **.coolblue[(dis)incentivize attendance of (lower) higher performers]**
  - 2017 survey for students in test-taking grades.

<table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'>
<caption>Results for 10th Grade</caption>
<thead>
<tr> <th style="text-align:left;"> GPA Decile </th> <th style="text-align:center;"> Told </th> <th style="text-align:center;"> Notification </th> <th style="text-align:center;"> Preparation </th> <th style="text-align:center;"> Grades </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> D1 </td> <td style="text-align:center;"> -0.02*** </td> <td style="text-align:center;"> -0.01*** </td> <td style="text-align:center;"> -0.02*** </td> <td style="text-align:center;"> 0.05*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> </tr>
<tr> <td style="text-align:left;"> D10 </td> <td style="text-align:center;"> 0.01*** </td> <td style="text-align:center;"> 0.00 </td> <td style="text-align:center;"> 0.00 </td> <td style="text-align:center;"> -0.03*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> </tr>
<tr> <td style="text-align:left;"> Baseline </td> <td style="text-align:center;"> 0.95*** </td> <td style="text-align:center;"> 0.78*** </td> <td style="text-align:center;">
0.82*** </td> <td style="text-align:center;"> 0.33*** </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.00) </td> </tr>
</tbody>
</table>

---

# Use of exemptions to exclude students doesn't tell the whole story

- Students are **.coolblue[excluded due to other reasons]** (justified).
  - Change in exemption policy in 2012 `\(\rightarrow\)` reduction in exempted students (flattened)
  - Results remain similar after 2012

---

# No self-selection from students because of testing

.pull-left[
- Students experience a **.coolblue[disutility from testing]**
  - Use of **.coolblue[no-stakes test]** applied to schools `\(\rightarrow\)` No effect on attendance
]

.pull-right[
.small[
<table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'>
<caption>Results for No-Stakes Test</caption>
<thead>
<tr> <th style="text-align:left;"> Grade - Year </th> <th style="text-align:center;"> D1 </th> <th style="text-align:center;"> D2 </th> <th style="text-align:center;"> D3-D8 </th> <th style="text-align:center;"> D9 </th> <th style="text-align:center;"> D10 </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> 2nd 2011 </td> <td style="text-align:center;"> -0.01 </td> <td style="text-align:center;"> 0.01 </td> <td style="text-align:center;"> 0.01* </td> <td style="text-align:center;"> 0.02 </td> <td style="text-align:center;"> 0.00 </td> </tr>
<tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> </tr>
<tr> <td style="text-align:left;"> 5th 2012 </td> <td style="text-align:center;"> 0.00 </td> <td
style="text-align:center;"> -0.01 </td> <td style="text-align:center;"> 0.00 </td> <td style="text-align:center;"> 0.01 </td> <td style="text-align:center;"> 0.01 </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> </tr> <tr> <td style="text-align:left;"> 6th 2011 </td> <td style="text-align:center;"> 0.02* </td> <td style="text-align:center;"> 0.01 </td> <td style="text-align:center;"> 0.01** </td> <td style="text-align:center;"> 0.01 </td> <td style="text-align:center;"> 0.00 </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.00) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> </tr> <tr> <td style="text-align:left;"> 6th 2017 </td> <td style="text-align:center;"> 0.00 </td> <td style="text-align:center;"> 0.03 </td> <td style="text-align:center;"> 0.01 </td> <td style="text-align:center;"> 0.01 </td> <td style="text-align:center;"> 0.00 </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.02) </td> <td style="text-align:center;"> (0.02) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.02) </td> <td style="text-align:center;"> (0.01) </td> </tr> <tr> <td style="text-align:left;"> 11th 2012 </td> <td style="text-align:center;"> 0.00 </td> <td style="text-align:center;"> 0.00 </td> <td style="text-align:center;"> 0.00 </td> <td style="text-align:center;"> -0.02** </td> <td style="text-align:center;"> 0.00 </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.00) </td> <td 
style="text-align:center;"> (0.01) </td> <td style="text-align:center;"> (0.01) </td> </tr>
</tbody>
</table>
]
]

---
background-position: 50% 50%
class: left, middle, inverse

<br>
<br>
<br>
<br>
<br>
<br>
<br>

.big[
Predicting the Counterfactual
]

---

# How do these results compare to the predicted counterfactual?

- Can we use **.coolblue[this existing rich panel data]** to predict attendance on the day of the test *as if it were a regular day*?

--

<br>
<br>

- Use **.coolblue[XGBoost]** with panel data for **.coolblue[attendance prediction]**
  - Model includes fixed effects by day of the week, school, grade, and student.
  - Also includes sibling attendance and attendance lag.

--

<br>
<br>

- Use data for 4th grade (2017):
  - Data **.coolblue[before]** the test to **.coolblue[predict attendance on the day of the test]**.

---

# Overall predictions over performance distribution

<img src="mbennett_attendance_files/figure-html/prediction_all-1.svg" style="display: block; margin: auto;" />

---

# Example: Comparisons between schools?

<img src="mbennett_attendance_files/figure-html/prediction_example-1.svg" style="display: block; margin: auto;" />

---

# Example: Comparisons between schools?

<img src="mbennett_attendance_files/figure-html/prediction_example2-1.svg" style="display: block; margin: auto;" />

---

# Can we characterize these schools?
<img src="mbennett_attendance_files/figure-html/cluster_plot-1.svg" style="display: block; margin: auto;" />

---

# Schools that appear to exclude lower-performing students are also more vulnerable

.smaller[
<table style='NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto; font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;' class="table lightable-paper">
<thead>
<tr> <th style="empty-cells: hide;border-bottom:hidden;" colspan="1"></th> <th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Cluster 1 (N=1092)</div></th> <th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Cluster 2 (N=266)</div></th> <th style="empty-cells: hide;border-bottom:hidden;" colspan="2"></th> </tr>
<tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Mean </th> <th style="text-align:center;"> Std. Dev. </th> <th style="text-align:center;"> Mean </th> <th style="text-align:center;"> Std. Dev. </th> <th style="text-align:center;"> Diff. in Means </th> <th style="text-align:center;"> p </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Avg. SIMCE Lang </td> <td style="text-align:center;"> 258.72 </td> <td style="text-align:center;"> 22.67 </td> <td style="text-align:center;"> 251.76 </td> <td style="text-align:center;"> 22.00 </td> <td style="text-align:center;"> -6.96 </td> <td style="text-align:center;"> 0.00 </td> </tr>
<tr> <td style="text-align:left;"> Avg.
SIMCE Math </td> <td style="text-align:center;"> 254.69 </td> <td style="text-align:center;"> 25.79 </td> <td style="text-align:center;"> 245.80 </td> <td style="text-align:center;"> 24.39 </td> <td style="text-align:center;"> -8.89 </td> <td style="text-align:center;"> 0.00 </td> </tr> <tr> <td style="text-align:left;"> Public </td> <td style="text-align:center;"> 0.34 </td> <td style="text-align:center;"> 0.47 </td> <td style="text-align:center;"> 0.42 </td> <td style="text-align:center;"> 0.50 </td> <td style="text-align:center;"> 0.09 </td> <td style="text-align:center;"> 0.01 </td> </tr> <tr> <td style="text-align:left;"> SEP status </td> <td style="text-align:center;"> 0.84 </td> <td style="text-align:center;"> 0.37 </td> <td style="text-align:center;"> 0.88 </td> <td style="text-align:center;"> 0.32 </td> <td style="text-align:center;"> 0.04 </td> <td style="text-align:center;"> 0.06 </td> </tr> <tr> <td style="text-align:left;"> % Priority Students </td> <td style="text-align:center;"> 0.48 </td> <td style="text-align:center;"> 0.19 </td> <td style="text-align:center;"> 0.55 </td> <td style="text-align:center;"> 0.19 </td> <td style="text-align:center;"> 0.06 </td> <td style="text-align:center;"> 0.00 </td> </tr> <tr> <td style="text-align:left;"> Diff D1 GPA </td> <td style="text-align:center;"> 0.02 </td> <td style="text-align:center;"> 0.14 </td> <td style="text-align:center;"> -0.29 </td> <td style="text-align:center;"> 0.32 </td> <td style="text-align:center;"> -0.31 </td> <td style="text-align:center;"> 0.00 </td> </tr> <tr> <td style="text-align:left;"> Diff D2 GPA </td> <td style="text-align:center;"> 0.04 </td> <td style="text-align:center;"> 0.10 </td> <td style="text-align:center;"> -0.20 </td> <td style="text-align:center;"> 0.24 </td> <td style="text-align:center;"> -0.24 </td> <td style="text-align:center;"> 0.00 </td> </tr> <tr> <td style="text-align:left;"> Diff D9 GPA </td> <td style="text-align:center;"> 0.04 </td> <td 
style="text-align:center;"> 0.07 </td> <td style="text-align:center;"> -0.01 </td> <td style="text-align:center;"> 0.14 </td> <td style="text-align:center;"> -0.05 </td> <td style="text-align:center;"> 0.00 </td> </tr>
<tr> <td style="text-align:left;"> Diff D10 GPA </td> <td style="text-align:center;"> 0.04 </td> <td style="text-align:center;"> 0.06 </td> <td style="text-align:center;"> -0.01 </td> <td style="text-align:center;"> 0.13 </td> <td style="text-align:center;"> -0.05 </td> <td style="text-align:center;"> 0.00 </td> </tr>
</tbody>
<tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> Note: Diff DX GPA represents the difference between observed attendance and predicted attendance for decile X</td></tr></tfoot>
</table>
]

---

# Implications for imputation policies

- **.coolblue[How to handle this absenteeism problem?]**
  - Scenario 1: Observed attendance (no imputation)
  - Scenario 2: Attendance as if the test hadn't happened (impute "typical day")
  - Scenario 3: Everybody is present

--

<br>
<br>

- Proposals to impute **.coolblue[lowest scores for absent students]** to disincentivize arbitrary exclusion
  - Most vulnerable schools have higher absenteeism rates `\(\rightarrow\)` Increased inequality and non-representativeness

---
background-position: 50% 50%
class: left, middle, inverse

<br>
<br>
<br>
<br>
<br>
<br>
<br>

.big[
Let's Wrap Up...
]

---

# Conclusions and next steps

- Non-representative patterns of absenteeism **.coolblue[beyond exclusion of low-performers]**
  - High heterogeneity between schools

--

<br>
<br>

- **.coolblue[Communication strategies]** play an important role for **.coolblue[lower-performing students]**

--

<br>
<br>

- Impact of **.coolblue[imputation policies]**?
  - Work in progress: How do non-representativeness and different imputation strategies impact policies and information provision?
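---

# Sketch: observed vs. predicted attendance by decile

The "Diff DX GPA" rows in the cluster table are described as observed attendance minus predicted attendance for decile X. A minimal sketch of that calculation is shown here. The record layout, the function name `attendance_gap`, and the numbers are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
from collections import defaultdict


def attendance_gap(records):
    """Mean observed minus mean predicted attendance per (school, decile).

    Each record is a hypothetical tuple:
    (school id, GPA decile, observed attendance 0/1, predicted probability).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # [observed sum, predicted sum, count]
    for school, decile, observed, predicted in records:
        cell = sums[(school, decile)]
        cell[0] += observed
        cell[1] += predicted
        cell[2] += 1
    return {key: (obs - pred) / n for key, (obs, pred, n) in sums.items()}


# Made-up records for a single school on the day of the test.
records = [
    ("A", 1, 0, 0.90),   # bottom-decile student, absent
    ("A", 1, 1, 0.90),   # bottom-decile student, present
    ("A", 10, 1, 0.95),  # top-decile student, present
    ("A", 10, 1, 0.95),  # top-decile student, present
]
gaps = attendance_gap(records)
```

In this toy input the bottom decile attends less than predicted (a negative gap) while the top decile attends slightly more, which is the kind of contrast the cluster comparison summarizes across schools.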
--- class: inverse, center, middle <br> .left[ <h1 class="title-own">Beyond Exclusion:<br/>The Role of High-Stake Testing on Attendance the Day of the Test</h1>] .pull-left-little_l[ <br> <br> <br> <br> <br> <br> .left[.small[Causal Inference Seminar, Salem Center<br>April 5th, 2022]]] .pull-right-little_l[ <br> .small[.right[Magdalena Bennett <br>*The University of Texas at Austin* ]] <br> .small[.right[Christopher Neilson <br>*Princeton University* ]] <br> .small[.right[Nicolás Rojas <br>*Columbia University* ]] ] <br> <br> <br>