Not only do low performers attend less on the day of the test in lower grades, but high performers also attend more
Using machine learning methods, we can also identify schools that are more likely to incentivize low attendance among bottom performers
1 McCombs School of Business, The University of Texas at Austin
2 Economics Department, Princeton University
3 Teachers College, Columbia University
Non-representative patterns of attendance can skew how useful test score measures are for accomplishing their goal. The main objectives of this paper are the following:
Understand the average effect of testing on school attendance across grades and performance
Identify schools that incentivize non-representative patterns of attendance by combining causal inference methods and machine learning
Help improve current imputation methods
\[Y_{ipsgt} = \sum_{P=1}^5\sum_{T=-4}^5 \tau^{PT}D^{PTG^*}_{ipsgt} + \gamma_{pt} +\alpha_i + \epsilon_{ipsgt}\]
\(Y_{ipsgt}\): Attendance indicator (1 or 0) for student \(i\), from GPA group \(p\), in school \(s\) and grade \(g\), on day \(t\).
\(D^{PTG^*}_{ipsgt}\): Indicator variable where \(G^*\) is the tested grade.
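To make the indicator concrete, here is a minimal Python sketch of how the 50 dummies (5 GPA groups by 10 event days) could be built for a single student-day observation. It rests on assumptions not stated above: that \(T\) counts school days relative to the test day, and that the indicator is switched on only for students enrolled in the tested grade \(G^*\). The function name `event_dummies` and its arguments are hypothetical.

```python
from itertools import product

GPA_GROUPS = range(1, 6)   # P = 1..5, matching the outer sum
EVENT_DAYS = range(-4, 6)  # T = -4..5, matching the inner sum


def event_dummies(gpa_group, grade, day, tested_grade, test_day):
    """Return {(P, T): 0 or 1} for one student-day observation.

    Assumption: T is the day index relative to the test day, and the
    dummy is 1 only when the student's grade is the tested grade.
    """
    rel_day = day - test_day
    return {
        (P, T): int(gpa_group == P and rel_day == T and grade == tested_grade)
        for P, T in product(GPA_GROUPS, EVENT_DAYS)
    }


# A student in GPA group 2 and the tested grade, observed on the test day.
row = event_dummies(gpa_group=2, grade=4, day=10, tested_grade=4, test_day=10)
active = [key for key, value in row.items() if value == 1]
```

Under these assumptions at most one dummy is active per observation, so each \(\tau^{PT}\) is read as the attendance shift for GPA group \(P\) on event day \(T\), net of the fixed effects.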
Students skip school on the day of the test. In lower grades, lower performers attend less and higher performers attend more than on a regular day. In higher grades, we observe an effect only at the top of the performance distribution.
There is important heterogeneity between schools.
We use K-means clustering to group schools according to the difference between their predicted and observed attendance distributions. We find two main clusters, one of which incentivizes the exclusion of lower performers. Schools in that cluster are more vulnerable and have lower overall performance.
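The clustering step can be sketched as follows. This is a plain-Python implementation of Lloyd's algorithm for K-means, not the authors' code. It assumes each school is summarized by a vector of observed-minus-predicted attendance rates, one entry per GPA group ordered from lowest to highest; that feature construction and the toy numbers in `gaps` are made up purely for illustration.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: returns (labels, centers) for a list of vectors."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Illustrative, made-up gaps (observed minus predicted attendance on test day),
# one row per school, columns ordered from lowest to highest GPA group.
gaps = [
    [0.00, 0.01, 0.00, 0.01, 0.00],
    [0.01, 0.00, -0.01, 0.00, 0.01],
    [-0.01, 0.00, 0.01, 0.00, 0.00],
    [-0.10, -0.06, 0.00, 0.01, 0.02],
    [-0.12, -0.05, -0.01, 0.02, 0.01],
    [-0.09, -0.07, 0.00, 0.00, 0.02],
]

labels, centers = kmeans(gaps, k=2)
# Flag the cluster whose center has the most negative gap for the lowest group.
exclusion_cluster = min(range(2), key=lambda c: centers[c][0])
flagged = [i for i, lab in enumerate(labels) if lab == exclusion_cluster]
```

In this toy example, `flagged` holds the indices of the schools whose lowest-performing students attend far less than predicted on the test day. With real data, the number of clusters and the feature scaling would need to be checked rather than fixed in advance.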
In terms of imputation: