SCOPE: A Lightweight-training LLM Framework for Air Traffic Control Readback Monitoring

Pilot readback of Air Traffic Control (ATC) voice instructions is a primary safeguard against miscommunication in air transportation. However, readback anomalies remain implicated in approximately 80% of aviation incidents. Rising traffic volume and elevated cognitive workload exacerbate this vulnerability, which motivates automated readback monitoring. Traditional rule-based and machine learning approaches struggle to generalize across the highly variable and evolving phraseology of controller-pilot communications. Large Language Models (LLMs) have opened a new avenue through their strong reasoning and generalization capabilities, but existing LLM-based approaches still face deployment and computational barriers in practice. In this work, we propose Semantic reasoning for Communication via Open-set Plug-in with Examples (SCOPE), a novel lightweight-training LLM framework that improves both the efficiency and the accuracy of automated ATC readback monitoring. The core idea is to couple a plug-in open-set classifier with a carefully designed in-context learning mechanism on top of a frozen LLM. Extensive experiments on a semi-synthetic communication dataset show that SCOPE attains superior accuracy while delivering the low-latency response required in operational environments. In a few-shot setting, SCOPE achieves 91.05% accuracy in open-set detection and corrects 96.63% of anomalous readbacks, outperforming the strongest available baselines while providing explanations for its decisions. These findings demonstrate the potential of our framework as a practical pathway toward interpretable and controllable ATC readback monitoring.
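To make the core idea concrete, the following is a minimal sketch of how an open-set plug-in might feed an in-context prompt for a frozen LLM. It is not the SCOPE implementation: the softmax-confidence threshold, the label names, the `Example` type, and the functions `open_set_predict` and `build_prompt` are all assumptions introduced here for illustration, and the sample phraseology is made up.

```python
import math
from dataclasses import dataclass

UNKNOWN = "unknown"  # hypothetical label for readbacks outside the known classes


@dataclass
class Example:
    instruction: str  # controller instruction
    readback: str     # pilot readback
    label: str        # e.g. "correct" or "altitude_mismatch" (assumed label set)


def open_set_predict(logits, labels, threshold=0.7):
    """Pick the most probable known label, or UNKNOWN if confidence is low.

    Assumption: open-set rejection by thresholding the maximum softmax
    probability, a common baseline that may differ from the paper's method.
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return UNKNOWN, probs[best]
    return labels[best], probs[best]


def build_prompt(instruction, readback, predicted, pool, k=2):
    """Assemble a few-shot prompt for a frozen LLM.

    Examples sharing the classifier's predicted label are preferred; if none
    match (for instance when the prediction is UNKNOWN), the pool is used as is.
    """
    matching = [e for e in pool if e.label == predicted]
    shots = (matching or pool)[:k]
    lines = [
        "Decide whether the pilot readback matches the ATC instruction. "
        "Explain briefly and give a corrected readback if needed.",
        "",
    ]
    for e in shots:
        lines += [
            f"Instruction: {e.instruction}",
            f"Readback: {e.readback}",
            f"Label: {e.label}",
            "",
        ]
    lines += [f"Instruction: {instruction}", f"Readback: {readback}", "Label:"]
    return "\n".join(lines)


if __name__ == "__main__":
    labels = ["correct", "altitude_mismatch"]
    pool = [
        Example("climb and maintain flight level 350",
                "climb and maintain flight level 350", "correct"),
        Example("descend and maintain 8000",
                "descend and maintain 6000", "altitude_mismatch"),
    ]
    predicted, _ = open_set_predict([0.2, 3.1], labels)
    print(build_prompt("climb and maintain 9000",
                       "climb and maintain 7000", predicted, pool, k=1))
```

In this sketch only the small classifier would need training, while the LLM that consumes the prompt stays frozen, which is one plausible reading of "lightweight-training".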
