In an effort to quantify and combat sexual assault, US colleges and universities are required to disclose the number of sexual assaults reported on their campuses each year. However, many sexual assaults are never reported to authorities, so the number of reported assaults understates the true number that occurred. Moreover, the same reported count can arise from many combinations of reporting rate and true incidence. In this paper we estimate these underlying quantities via a hierarchical Bayesian model of the reported number of assaults. We use informative priors, based on national crime statistics, as a tiebreaker that helps distinguish reporting rates from incidence. We outline a Hamiltonian Monte Carlo (HMC) sampling scheme for posterior inference on reporting rates and assault incidence at each school, and apply this method to campus sexual assault data from 2014-2019. The results suggest that reporting rates rose across the overall college population during this period. However, the extent of underreporting varies widely across schools, which has implications for how individual schools should interpret their reported crime statistics.
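The identifiability problem described above can be illustrated with a small sketch. It is not the paper's model: it assumes, purely for illustration, that true assaults at a school are Poisson with mean enrollment times incidence and that each assault is reported independently with a fixed probability, so that reported counts are Poisson with mean enrollment times incidence times reporting rate. The enrollment, count, parameter values, and the Beta(2, 8) prior on the reporting rate are all hypothetical and are not taken from national crime statistics.

```python
import math


def poisson_logpmf(k, mu):
    """Log of the Poisson probability mass function at count k with mean mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def reported_loglik(reported, enrollment, incidence, reporting_rate):
    """Log-likelihood of the reported count under the assumed thinned-Poisson model.

    Assumption (not from the paper): reported ~ Poisson(enrollment * incidence * reporting_rate).
    """
    return poisson_logpmf(reported, enrollment * incidence * reporting_rate)


def beta_logpdf(x, a, b):
    """Log density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm


# Hypothetical school: 20,000 students, 12 reported assaults in a year.
enrollment = 20000
reported = 12

# Two very different explanations with the same product incidence * reporting_rate.
high_incidence_low_reporting = {"incidence": 0.006, "reporting_rate": 0.10}
low_incidence_high_reporting = {"incidence": 0.0012, "reporting_rate": 0.50}

ll_a = reported_loglik(reported, enrollment, **high_incidence_low_reporting)
ll_b = reported_loglik(reported, enrollment, **low_incidence_high_reporting)

# Hypothetical informative prior on the reporting rate (prior mean 0.2).
lp_a = beta_logpdf(high_incidence_low_reporting["reporting_rate"], 2.0, 8.0)
lp_b = beta_logpdf(low_incidence_high_reporting["reporting_rate"], 2.0, 8.0)

print("log-likelihoods:", ll_a, ll_b)        # the data alone cannot separate them
print("log-prior on reporting rate:", lp_a, lp_b)  # the prior can
```

Under these assumptions the likelihood depends on incidence and reporting rate only through their product, so the data alone cannot tell the two explanations apart. A prior that concentrates on plausible reporting rates shifts posterior mass toward one of them, which is the tiebreaker role that the abstract assigns to the informative priors.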