We posit that data is only safe to use up to a certain threshold of data distribution shift, beyond which the autonomous system must relinquish control and operation must be halted or handed to a human operator. Using a toy computer vision example, we demonstrate that network predictive accuracy is affected by data distribution shifts, and we propose distance metrics between training and testing data to define safe operating limits within those shifts. We conclude that beyond an empirically obtained threshold of data distribution shift, it is unreasonable to expect network predictive accuracy not to degrade.
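As an illustration of the idea of gating operation on a train/test distance, the following is a minimal sketch and not the method used in this work. It assumes pixel intensities in [0, 1], swaps in the Bhattacharyya distance between intensity histograms as one possible distance metric (the abstract does not name the metrics used), and relies on a hypothetical threshold `THRESHOLD` that would in practice have to be obtained empirically. The helper names `histogram`, `bhattacharyya_distance`, and `safe_to_operate`, and the synthetic data, are assumptions made for the example.

```python
import math
import random


def histogram(values, bins=16, lo=0.0, hi=1.0):
    """Normalized histogram of values, clipped to [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        v = min(max(v, lo), hi)
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def bhattacharyya_distance(p, q, eps=1e-12):
    """Bhattacharyya distance between two discrete distributions.

    Returns 0 for identical distributions and grows as they overlap less.
    """
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to avoid log(0) and tiny negative values from rounding.
    return max(0.0, -math.log(max(bc, eps)))


def safe_to_operate(train_values, test_values, threshold):
    """Return (is_safe, distance) for a given distance threshold."""
    d = bhattacharyya_distance(histogram(train_values), histogram(test_values))
    return d <= threshold, d


# Synthetic stand-in for training pixel intensities (assumption, not real data).
rng = random.Random(0)
train = [rng.betavariate(2, 2) for _ in range(5000)]

# Simulated brightness shifts of the test data, clipped at the maximum intensity.
mild = [min(1.0, v + 0.05) for v in train]
severe = [min(1.0, v + 0.5) for v in train]

THRESHOLD = 0.1  # hypothetical; would be set empirically from accuracy measurements

for name, test in [("mild", mild), ("severe", severe)]:
    ok, d = safe_to_operate(train, test, THRESHOLD)
    action = "continue" if ok else "halt or hand over to a human operator"
    print(f"{name} shift: distance={d:.3f} -> {action}")
```

In this sketch the mild shift stays under the hypothetical threshold while the severe shift exceeds it. The threshold itself would need to be chosen by measuring how predictive accuracy degrades as the distance grows.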