Large language models have transformed AI-assisted software engineering, but current research remains biased toward high-resource languages such as Python, and models perform worse on languages such as Rust and OCaml. Because real-world systems are inherently polyglot, robust multilingual code intelligence is crucial. This survey focuses on two key tasks: multilingual code generation from shared natural-language requirements, and multilingual code translation that preserves semantics across languages. It reviews representative methods, benchmarks, and evaluation metrics, and highlights challenges and opportunities for trustworthy cross-language generalization.