Large Language Models (LLMs) have emerged as a transformative AI paradigm, profoundly influencing daily life through their exceptional language understanding and contextual generation capabilities. Despite their remarkable performance, LLMs face a critical challenge: the propensity to produce unreliable outputs due to the inherent limitations of their learning-based nature. Formal methods (FMs), on the other hand, are a well-established computation paradigm that provides mathematically rigorous techniques for modeling, specifying, and verifying the correctness of systems. FMs have been extensively applied in mission-critical software engineering, embedded systems, and cybersecurity. However, the primary challenges impeding the deployment of FMs in real-world settings lie in their steep learning curves, the absence of user-friendly interfaces, and limitations in efficiency and adaptability. This position paper outlines a roadmap for advancing the next generation of trustworthy AI systems by leveraging the mutual enhancement of LLMs and FMs. First, we illustrate how FMs, including reasoning and certification techniques, can help LLMs generate more reliable and formally certified outputs. Subsequently, we highlight how the advanced learning capabilities and adaptability of LLMs can significantly enhance the usability, efficiency, and scalability of existing FM tools. Finally, we show that unifying these two computation paradigms -- integrating the flexibility and intelligence of LLMs with the rigorous reasoning abilities of FMs -- has transformative potential for the development of trustworthy AI software systems. We believe that this integration can enhance both the trustworthiness and efficiency of software engineering practices while fostering the development of intelligent FM tools capable of addressing complex, real-world challenges.
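To make the first direction concrete, the following is a minimal sketch (our illustration, not a method proposed in the paper) of how an FM tool can formally certify an LLM output. It assumes the `z3-solver` Python package and a hypothetical LLM-generated conditional expression as the candidate implementation of absolute value; the Z3 SMT solver then checks the candidate against a formal specification by searching for a counterexample.

```python
# Sketch: certifying a (hypothetical) LLM-generated implementation of abs(x)
# with the Z3 SMT solver. Requires: pip install z3-solver
from z3 import Int, If, And, Or, Not, Solver, sat

x = Int("x")

# Hypothetical LLM-generated candidate: abs(x) as a conditional expression.
candidate = If(x >= 0, x, -x)

# Formal specification of absolute value: the result is non-negative
# and equals either x or -x.
spec = And(candidate >= 0, Or(candidate == x, candidate == -x))

# Verification by refutation: ask Z3 whether any integer x violates the spec.
solver = Solver()
solver.add(Not(spec))
if solver.check() == sat:
    print("Rejected: counterexample found:", solver.model())
else:
    print("Certified: the candidate satisfies the spec for all inputs")
```

If the LLM instead produced a flawed candidate (e.g., `If(x > 0, x, x)`), the same check would return a concrete counterexample such as `x = -1`, which could be fed back to the LLM for repair, illustrating the mutual-enhancement loop the paper envisions.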