U.S. state education agencies mark schools displaying achievement gaps between demographic subgroups as needing improvement. Some schools may have so few students in these subgroups that average end-of-year test scores are only noisy measurements of the subgroup's average "true" score, that is, the score one would expect if students took the test many times. This measurement error, together with the masking of small-subgroup averages in publicly available assessment data, poses challenges for evaluating interventions aimed at closing achievement gaps. We introduce propensity score estimates designed to achieve balance on subgroup average true scores. These estimates are available even when the noisy measurements themselves are not, and they improve overlap relative to estimates that ignore measurement error, leading to greater bias reduction in matching estimators. We demonstrate our methods through simulation and an application to a statewide initiative in Texas for curbing summer learning loss.