Machine learning (ML) components are increasingly integrated into software products, yet their complexity and inherent uncertainty often lead to unintended and hazardous consequences, both for individuals and for society at large. Despite these risks, practitioners seldom take proactive steps to anticipate and mitigate hazards before they occur. Traditional safety engineering approaches, such as Failure Mode and Effects Analysis (FMEA) and System-Theoretic Process Analysis (STPA), offer systematic frameworks for early risk identification, yet they remain rarely used in practice. This position paper advocates integrating hazard analysis into the development of any ML-powered software product and calls for greater support to make this process accessible to developers. By using large language models (LLMs) to partially automate a modified STPA process, with human oversight at critical steps, we expect to address two key challenges: the heavy reliance on highly experienced safety engineering experts, and the time-consuming, labor-intensive nature of traditional hazard analysis, which often prevents its integration into real-world development workflows. We illustrate our approach with a running example, showing that many seemingly unanticipated issues can, in fact, be anticipated.
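The abstract describes the approach only at a high level; purely as an illustration, the sketch below shows one way an LLM-assisted, human-gated STPA pipeline could be structured. The step prompts, the `call_llm` stub, and the `human_review` gate are all hypothetical stand-ins, not the authors' implementation.

```python
"""Illustrative sketch only: a hypothetical LLM-assisted STPA loop in which
an LLM drafts each analysis step and a human reviews the draft before it
feeds into the next step. Assumes a generic call_llm(prompt) -> str that
any chat-completion API could back."""

# Simplified STPA-style step prompts (hypothetical; a real analysis would be richer).
STPA_STEPS = [
    "List the losses (harms to people, property, or society) this system could contribute to.",
    "Derive system-level hazards that could lead to each loss.",
    "Identify unsafe control actions of the ML component that could cause each hazard.",
    "Propose loss scenarios and candidate mitigations for each unsafe control action.",
]

def call_llm(prompt: str) -> str:
    """Stub: replace with a real chat-completion call. Returns a canned draft here
    so the sketch runs end to end without any API key."""
    return "(LLM draft for: " + prompt.splitlines()[-1] + ")"

def human_review(step: str, draft: str) -> str:
    """Critical-step oversight: an expert approves or revises the LLM draft."""
    print(f"\n--- {step} ---\n{draft}")
    revised = input("Press Enter to accept, or type a revised version: ").strip()
    return revised or draft

def run_hazard_analysis(system_description: str) -> list[str]:
    """Run the partially automated STPA steps with a human gate between each."""
    context = f"System under analysis: {system_description}"
    accepted: list[str] = []
    for step in STPA_STEPS:
        prompt = f"{context}\n\nPrior findings:\n" + "\n".join(accepted) + f"\n\nTask: {step}"
        draft = call_llm(prompt)
        # Human review happens before the result propagates to later steps.
        accepted.append(human_review(step, draft))
    return accepted

if __name__ == "__main__":
    findings = run_hazard_analysis("ML-based loan approval component in a banking app")
    print("\nAccepted analysis artifacts:", *findings, sep="\n- ")
```

The design choice the sketch tries to convey is the one named in the abstract: automation handles the laborious drafting, while the human checkpoints preserve expert judgment at the steps where errors would compound.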