Modern machine learning approaches have led to performant diagnostic models for a variety of health conditions. Several machine learning approaches, such as decision trees and deep neural networks, can, in principle, approximate any function. However, this power is both a gift and a curse: the propensity toward overfitting is magnified when the input data are heterogeneous and high dimensional and the relationship between the inputs and the output class is highly nonlinear. This issue can especially plague diagnostic systems that predict behavioral and psychiatric conditions, which are diagnosed with subjective criteria. An emerging solution is crowdsourcing, in which crowd workers annotate complex behavioral features in return for monetary compensation or a gamified experience. These labels can then be used to derive a diagnosis, either directly or as inputs to a diagnostic machine learning model. This viewpoint describes existing work on crowd-powered diagnostic systems, a nascent field of study, and discusses its ongoing challenges and opportunities. With the correct considerations, adding crowdsourcing to human-in-the-loop machine learning workflows for the prediction of complex and nuanced health conditions can accelerate screening, diagnostics, and ultimately access to care.