

Beyond Exclusion:
The Role of High-Stake Testing on Attendance the Day of the Test







Causal Inference Seminar, Salem Center
April 5th, 2022


Magdalena Bennett   
The University of Texas at Austin   


Christopher Neilson   
Princeton University   


Nicolás Rojas   
Columbia University   




Motivation

  • Results from high-stakes tests widely used in education policy

    • E.g. funding, promotions, school closures, school choice, etc.

  • Assumption: Standardized tests used as a proxy for school quality


Is it so?

Motivation

  • Beyond explicit cheating and socioeconomic sorting: Students' exclusion

    • E.g.: Reclassification of low-performers as students with disabilities (Figlio & Loeb, 2011)
    • Use of disciplinary measures to exclude low-performers (Figlio, 2006)

  • Less attention on non-representative attendance patterns

    • Differences between scores before and after imputation (Cuesta et al., 2020)

  • Schools have incentives to game the system

    • Especially in high-accountability settings

This paper

Attendance Patterns

  • Event study approach:

    • What do these exclusion patterns look like? Are they the same for every (type of) school and every grade?

    • Focus beyond bottom performers

    • Robustness checks for alternative mechanisms

Imputation Policies

  • Machine learning prediction:

    • Consequences of blanket policies in imputation of scores

    • Identification of schools that are most likely gaming the system

Outline

  1. Motivation

  2. Chilean educational context

  3. Attendance patterns:

    • Event study for different years, grades, and performance
    • Potential mechanisms
  4. Prediction approach:

    • Difference between predicted and observed distributions
    • Potential consequences of imputation
  5. Conclusions and next steps








The Chilean Educational Context

The Chilean context: Standardized testing

  • Universal standardized testing since the 1980s (SIMCE)

  • SIMCE as high-stakes testing:

    • Results widely available in a universal voucher system

    • Tied to teachers' bonuses

    • Tied to budget restrictions and school closures

SIMCE and absenteeism

  • Use of pre-filled communication for parents to be sent out by schools

    • Evidence that parents of lower-income students are less likely to receive information

  • No real consequences for low attendance:

    • Between 2005 and 2007, non-representative results were marked with symbols

    • No imputation strategy so far

  • Improvement of regulation for justifying student exclusion

    • E.g. specific disabilities (blindness) or non-Spanish speakers.








Attendance Patterns for the Day of the Test

How to evaluate the effect of "day of the test" on absenteeism?

  • Some studies assessing the effect of attendance manipulation:

    • Focus on distortions (difference between imputed and observed scores) (Cuesta et al., 2020)

    • Manipulation for specific vulnerable schools (SEP) to raise scores (Feigenberg et al., 2019; Quezada & Hippel, 2017)

  • This paper: Event study between 2011 and 2018 for all tested grades.

    • Focus on attendance by within-school performance

    • Use of an alternative non-high-stakes test to analyze potential mechanisms

    • Use of unpublished survey for communication and incentives around SIMCE

Data Available

  • Standardized tests 2011-2018 (SIMCE)

    • Scores at student and school level for different subjects (Math, Language, History, and Science)

    • Student's socioeconomic characterization (parental questionnaire)

  • Daily attendance data 2011-2018 (SIGE)

    • Used for voucher payments (each day has ~ 2.5 million records)

  • GPA Performance 2011-2018 (Rendimiento)

    • Use GPA performance deciles within school-grade
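
A minimal sketch of how GPA deciles within a school-grade cell could be assigned. The record layout, the tie-breaking rule, and the function names are hypothetical, not the authors' procedure. The collapse into five groups (D1, D2, D3-D8, D9, D10) mirrors the decile groups that appear in the results tables of this deck.

```python
from collections import defaultdict

def within_cell_deciles(records):
    """Assign each student a GPA decile (1-10) within their school-grade cell.

    records: iterable of (student_id, school_id, grade, gpa) tuples
             (hypothetical layout, not the actual Rendimiento schema).
    """
    cells = defaultdict(list)
    for student_id, school_id, grade, gpa in records:
        cells[(school_id, grade)].append((gpa, student_id))

    deciles = {}
    for members in cells.values():
        members.sort()  # ascending GPA; ties are broken by student_id
        n = len(members)
        for rank, (_, student_id) in enumerate(members):
            deciles[student_id] = rank * 10 // n + 1
    return deciles

def performance_group(decile):
    """Collapse deciles into five groups: D1, D2, D3-D8, D9, D10."""
    return f"D{decile}" if decile in (1, 2, 9, 10) else "D3-D8"

# Toy example: 20 students in one school-grade cell, GPAs 40..59 (scaled by 10)
records = [(i, "school_A", 4, 40 + i) for i in range(20)]
deciles = within_cell_deciles(records)
groups = {sid: performance_group(d) for sid, d in deciles.items()}
```

Cells with fewer than ten students cannot populate every decile, so in practice very small school-grade cells would likely need separate handling.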

Observations from our data

Data description

Grade   Years tested             Num. Schools   Num. Students
2       2013, 2014, 2015         5,266          628,073
4       2011, 2013-2018          5,673          1,461,289
6       2013-2016, 2018          5,516          1,056,243
8       2011, 2013-2015, 2017    5,545          1,078,140
10      2013-2018                2,623          1,213,067

Empirical approach for difference in attendance

  • Event study centered around the day of the test:

$$Y_{ipsgt} = \sum_{P=1}^{5} \sum_{T=-4}^{5} \tau_{PT} D_{ipsgt}^{PTG} + \gamma_{pt} + \alpha_i + \epsilon_{ipsgt}$$

Where

  • $Y_{ipsgt}$: Binary attendance indicator for student $i$, from GPA group $p$, in school $s$ and grade $g$, on day $t$ (centered around the day of the test).

  • $D_{ipsgt}^{PTG}$: Indicator variable $I(p=P, t=T, g=G)$, where $G$ is the tested grade. The coefficient of interest is the one on $D^{P0G}$ (the day of the test).
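
A simplified sketch of the event-study logic on simulated data. It drops the student fixed effect, in which case the coefficient for each (GPA group, day) cell reduces to the difference in mean attendance between tested-grade and non-tested-grade students in that cell. All numbers below (baseline attendance of 0.90, a 10-point drop for group 1 and a 5-point rise for group 5 on the test day) are made up for illustration and are not the paper's estimates.

```python
import random
from collections import defaultdict

random.seed(0)

GROUPS = [1, 2, 3, 4, 5]   # five GPA groups (P = 1..5)
DAYS = range(-4, 6)        # event time T, with 0 = day of the test

def simulate(n_students=20000):
    """Simulate student-day attendance rows: (student, group, tested, day, present)."""
    rows = []
    for i in range(n_students):
        p = random.choice(GROUPS)
        tested = random.random() < 0.5   # is the student in the tested grade?
        for t in DAYS:
            prob = 0.90                  # hypothetical baseline attendance
            if tested and t == 0:
                # hypothetical test-day shifts for the bottom and top groups
                prob += {1: -0.10, 5: 0.05}.get(p, 0.0)
            rows.append((i, p, tested, t, 1 if random.random() < prob else 0))
    return rows

def cell_mean_differences(rows):
    """Tested minus non-tested mean attendance for every (group, day) cell.

    This equals the OLS coefficient on the indicator when group-by-day
    effects are saturated and the student fixed effect is omitted.
    """
    sums = defaultdict(lambda: [0, 0])   # (group, day, tested) -> [sum, count]
    for _, p, tested, t, y in rows:
        cell = sums[(p, t, tested)]
        cell[0] += y
        cell[1] += 1
    tau = {}
    for p in GROUPS:
        for t in DAYS:
            y1, n1 = sums[(p, t, True)]
            y0, n0 = sums[(p, t, False)]
            tau[(p, t)] = y1 / n1 - y0 / n0
    return tau

tau = cell_mean_differences(simulate())
```

In this simulation `tau[(1, 0)]` should come out clearly negative and `tau[(5, 0)]` clearly positive, while cells away from day 0 stay close to zero. A full specification would add the student fixed effect and clustered standard errors.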

Clear difference in attendance by performance for 2nd grade

No effect on lower performers for 10th grade

Attendance patterns differ by grade

Potential mechanisms that explain these patterns

  • Schools directly (dis)incentivize attendance of (lower) higher performers


  • Students are excluded for other (justified) reasons.


  • Students experience a disutility from testing

Differences in communication and incentives between high and low performers

  • Schools directly (dis)incentivize attendance of (lower) higher performers

    • 2017 survey for students in test-taking grades.

Results for 4th Grade

GPA Decile   Told       Notification   Preparation   Grades
D1           -0.06***   -0.11***       -0.08***      0.14***
             (0.00)     (0.00)         (0.00)        (0.00)
D10          0.06***    0.05***        0.05***       -0.2***
             (0.00)     (0.00)         (0.00)        (0.00)
Baseline     0.89***    0.87***        0.89***       0.39***
             (0.00)     (0.00)         (0.00)        (0.00)

Differences in communication and incentives between high and low performers

  • Schools directly (dis)incentivize attendance of (lower) higher performers

    • 2017 survey for students in test-taking grades.

Results for 10th Grade

GPA Decile   Told       Notification   Preparation   Grades
D1           -0.02***   -0.01***       -0.02***      0.05***
             (0.00)     (0.00)         (0.00)        (0.00)
D10          0.01***    0.00           0.00          -0.03***
             (0.00)     (0.00)         (0.00)        (0.00)
Baseline     0.95***    0.78***        0.82***       0.33***
             (0.00)     (0.00)         (0.00)        (0.00)

Use of exemptions to exclude students doesn't tell the whole story

  • Students are excluded for other (justified) reasons.

    • Change in exemption policy in 2012 → reduction in exempted students (flattened)

    • Results remain similar after 2012

No self-selection from students because of testing

  • Students experience a disutility from testing

    • Use of no-stakes test applied to schools → No effect on attendance

Results for No-Stakes Test

Grade - Year   D1       D2       D3-D8    D9        D10
2nd 2011       -0.01    0.01     0.01*    0.02      0.00
               (0.01)   (0.01)   (0.01)   (0.01)    (0.01)
5th 2012       0.00     -0.01    0.00     0.01      0.01
               (0.01)   (0.01)   (0.01)   (0.01)    (0.01)
6th 2011       0.02*    0.01     0.01**   0.01      0.00
               (0.01)   (0.01)   (0.00)   (0.01)    (0.01)
6th 2017       0.00     0.03     0.01     0.01      0.00
               (0.02)   (0.02)   (0.01)   (0.02)    (0.01)
11th 2012      0.00     0.00     0.00     -0.02**   0.00
               (0.01)   (0.01)   (0.00)   (0.01)    (0.01)








Predicting the Counterfactual

How do these results compare to predicted counterfactual?

  • Can we use the existing rich panel data to predict attendance on the day of the test as if it were a regular day?

  • Use XGBoost with panel data for attendance prediction

    • Model includes fixed effects by day of the week, school, grade, and student.

    • Also includes sibling attendance and attendance lag.

  • Use data for 4th grade (2017):

    • Data before the test to predict attendance on the day of the test.
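
A rough sketch of how the prediction inputs listed above could be assembled: day of the week, lagged attendance, and sibling attendance, with days before the test used for training and the test day held out. The data layout, the `-1` code for "no sibling observed", and all names are hypothetical. Fitting the gradient-boosted model itself is left out here, so this only covers feature construction.

```python
def build_features(attendance, siblings, weekday_of, test_day):
    """Split student-day rows into training rows (before the test) and test-day rows.

    attendance: dict {(student_id, day_index): 0 or 1}, day_index = school-day counter
    siblings:   dict {student_id: sibling_id} (students without siblings are absent)
    weekday_of: function mapping day_index -> weekday (0 = Monday)
    """
    train, target = [], []
    for (student, day), present in sorted(attendance.items()):
        if day > test_day:
            continue                      # only data up to the test day is used
        lag1 = attendance.get((student, day - 1))
        if lag1 is None:
            continue                      # first observed day has no lag
        sibling = siblings.get(student)
        sibling_present = attendance.get((sibling, day), -1) if sibling is not None else -1
        row = {
            "student": student,
            "day": day,
            "weekday": weekday_of(day),
            "lag1": lag1,
            "sibling_present": sibling_present,
            "present": present,           # label (observed attendance)
        }
        (target if day == test_day else train).append(row)
    return train, target

# Toy example: students 1 and 2 are siblings, student 3 has none; the test is on day 3
attendance = {
    (1, 0): 1, (1, 1): 1, (1, 2): 0, (1, 3): 0,
    (2, 0): 1, (2, 1): 1, (2, 2): 1, (2, 3): 1,
    (3, 0): 1, (3, 1): 0, (3, 2): 1, (3, 3): 1,
}
siblings = {1: 2, 2: 1}
train, target = build_features(attendance, siblings, lambda d: d % 5, test_day=3)
```

The `train` rows would be used to fit the classifier, and its predicted probabilities on the `target` rows give attendance "as if it were a regular day", to be compared against the observed `present` values.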

Overall predictions over performance distribution

Example: Comparisons between schools?

Can we characterize these schools?

Schools that appear to exclude lower-performing students are also more vulnerable

                      Cluster 1 (N=1092)    Cluster 2 (N=266)
                      Mean     Std. Dev.    Mean     Std. Dev.    Diff. in Means   p
Avg. SIMCE Lang       258.72   22.67        251.76   22.00        -6.96            0.00
Avg. SIMCE Math       254.69   25.79        245.80   24.39        -8.89            0.00
Public                0.34     0.47         0.42     0.50         0.09             0.01
SEP status            0.84     0.37         0.88     0.32         0.04             0.06
% Priority Students   0.48     0.19         0.55     0.19         0.06             0.00
Diff D1 GPA           0.02     0.14         -0.29    0.32         -0.31            0.00
Diff D2 GPA           0.04     0.10         -0.20    0.24         -0.24            0.00
Diff D9 GPA           0.04     0.07         -0.01    0.14         -0.05            0.00
Diff D10 GPA          0.04     0.06         -0.01    0.13         -0.05            0.00

Note: Diff DX GPA represents the difference between observed attendance and predicted attendance for decile X
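
The "Diff DX GPA" measure described in the note can be illustrated with a small sketch: mean observed attendance minus mean predicted attendance on the test day, computed per school and decile. The row layout and the toy values are hypothetical.

```python
from collections import defaultdict

def attendance_gap_by_decile(rows):
    """Mean observed minus mean predicted test-day attendance per (school, decile).

    rows: iterable of (school_id, decile, observed 0/1, predicted probability)
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])   # [sum observed, sum predicted, count]
    for school, decile, observed, predicted in rows:
        cell = totals[(school, decile)]
        cell[0] += observed
        cell[1] += predicted
        cell[2] += 1
    return {key: (obs - pred) / n for key, (obs, pred, n) in totals.items()}

# Toy example: in school_A, half of the decile-1 students are absent on the
# test day although the model predicted 90% attendance for each of them.
rows = [
    ("school_A", 1, 0, 0.9), ("school_A", 1, 0, 0.9),
    ("school_A", 1, 1, 0.9), ("school_A", 1, 1, 0.9),
    ("school_A", 10, 1, 0.95), ("school_A", 10, 1, 0.95),
]
gaps = attendance_gap_by_decile(rows)
```

A strongly negative gap for the bottom deciles combined with a gap near zero for the top deciles is the kind of pattern that separates Cluster 2 from Cluster 1 in the table above.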

Implications for imputation policies

  • How to handle this absenteeism problem?

    • Scenario 1: Observed attendance (no imputation)

    • Scenario 2: Attendance as if the test hadn't happened (impute "typical day")

    • Scenario 3: Everybody is present

  • Proposals to impute lowest scores for absent students to disincentivize arbitrary exclusion

    • Most vulnerable schools have higher absenteeism rates → Increases inequality and non-representativeness
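
To make the scenarios concrete, here is a toy sketch comparing a school's mean score under each of them, plus the lowest-score imputation proposal. It assumes a latent score is available for every student, which is not the case in practice (absent students' scores are unobserved and would have to be imputed). The field names, the `floor_score` parameter, and all values are hypothetical.

```python
def school_mean_scenarios(students, floor_score):
    """Compare a school's mean score under different attendance scenarios.

    students: list of dicts with
        'score'     latent test score (hypothetical; unobserved for absentees)
        'present'   observed test-day attendance (0 or 1)
        'p_typical' predicted attendance probability on a regular day
    """
    takers = [s["score"] for s in students if s["present"]]
    weight = sum(s["p_typical"] for s in students)
    n = len(students)
    return {
        # Scenario 1: only students observed on the test day
        "observed": sum(takers) / len(takers),
        # Scenario 2: students weighted by their typical-day attendance probability
        "typical_day": sum(s["p_typical"] * s["score"] for s in students) / weight,
        # Scenario 3: everybody is present
        "all_present": sum(s["score"] for s in students) / n,
        # Proposal: absent students receive the lowest score
        "lowest_score": sum(s["score"] if s["present"] else floor_score
                            for s in students) / n,
    }

students = [
    {"score": 200, "present": 0, "p_typical": 0.9},   # low performer, absent
    {"score": 250, "present": 1, "p_typical": 0.9},
    {"score": 300, "present": 1, "p_typical": 0.9},
]
means = school_mean_scenarios(students, floor_score=100)
```

In this toy case the observed mean (275) overstates the all-present mean (250), while lowest-score imputation pushes the mean well below it. This is why a blanket lowest-score rule would weigh most heavily on schools with high absenteeism for reasons unrelated to gaming.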








Let's Wrap Up...

Conclusions and next steps

  • Non-representative patterns of absenteeism beyond exclusion of low-performers

    • High heterogeneity between schools

  • Communication strategies play an important role for lower-performing students

  • Impact of imputation policies?

    • Work in progress: How do non-representativeness and different imputation strategies impact policies and information provision?

