In today's world, AI programs powered by machine learning are ubiquitous and have achieved seemingly exceptional performance across a broad range of tasks, from medical diagnosis and credit rating in banking, to theft detection via video analysis, and even predicting political or sexual orientation from facial images. These predominantly deep-learning methods excel thanks to their extraordinary capacity to process vast amounts of complex data and extract intricate correlations and relationships across different levels of features. In this paper, we contend that the designers and end users of these ML methods have forgotten a fundamental lesson from statistics: correlation does not imply causation. Not only do most state-of-the-art methods neglect this crucial principle, but in doing so they often produce nonsensical or flawed causal models, akin to social astrology or physiognomy. Consequently, we argue that current efforts to make AI models more ethical by merely reducing biases in the training data are insufficient. Through examples, we will demonstrate that the potential for harm posed by these methods can only be mitigated by a complete rethinking of their core models, by improved quality-assessment metrics and policies, and by maintaining human oversight throughout the process.