This paper investigates how best to compare algorithms for predicting chronic homelessness for the purpose of identifying good candidates for housing programs. Predictive methods can rapidly refer potentially chronic shelter users to housing, but they also sometimes incorrectly identify individuals who will not become chronic (false positives). We use shelter access histories to demonstrate that these false positives are often still good candidates for housing. Using this approach, we compare a simple threshold method for predicting chronic homelessness to the more complex logistic regression and neural network algorithms. While traditional binary classification performance metrics show that the machine learning algorithms perform better than the threshold technique, an examination of the shelter access histories of the cohorts identified by the three algorithms shows that they select groups with very similar characteristics. This has important implications for resource-constrained not-for-profit organizations, since the threshold technique can be implemented using much simpler information technology infrastructure than the machine learning algorithms.