Investigating serious crimes is inherently complex and resource-constrained. Law enforcement agencies (LEAs) grapple with overwhelming volumes of offender and incident data, making effective suspect identification difficult. Although machine learning (ML)-enabled systems have been explored to support LEAs, several have failed in practice. This highlights the need to align system behavior with stakeholder goals early in development, motivating the use of Goal-Oriented Requirements Engineering (GORE). This paper reports our experience applying the GORE framework KAOS to designing an ML-enabled system for identifying suspects in online child sexual abuse. We describe how KAOS supported early requirements elaboration, including goal refinement, object modeling, agent assignment, and operationalization. A key finding is the central role of data elicitation: data requirements constrain refinement choices and candidate agents while influencing how goals are linked, operationalized, and satisfied. Conversely, goal elaboration and agent assignment shape data quality expectations and collection needs. Our experience highlights the iterative, bidirectional dependencies between goals, data, and ML performance. We contribute a reference model for integrating GORE with data-driven system development, and identify gaps in KAOS, particularly the need for explicit support for data elicitation and quality management. These insights inform future extensions of KAOS and, more broadly, the application of formal GORE methods to ML-enabled systems for high-stakes societal contexts.