Enterprises today face a growing range of challenges, including a constantly evolving regulatory environment, rising demand for personalization within software applications, and a heightened emphasis on governance. In response, large enterprises have been adopting automation that spans from the optimization of core business processes to the enhancement of customer experiences, and Artificial Intelligence (AI) has emerged as a pivotal element of modern software systems. Data plays an indispensable role in this context: AI-centric software systems that are based on supervised learning and operate at an industrial scale require large volumes of training data to perform effectively, and the incorporation of generative AI has led to a growing demand for adequate evaluation benchmarks. Our experience in this field has revealed that the requirement for large datasets for training and evaluation introduces a host of intricate challenges. This book chapter explores the evolving landscape of Software Engineering (SE) in general, and Requirements Engineering (RE) in particular, in this era marked by AI integration. We discuss challenges that arise while integrating Natural Language Processing (NLP) and generative AI into enterprise-critical software systems. The chapter provides practical insights, solutions, and examples to equip readers with the knowledge and tools needed to build solutions with NLP at their core. We also reflect on how these text data-centric tasks fit with the traditional RE process, and highlight new RE tasks that may be necessary to handle the increasingly important text data-centricity of software systems development.