Large Language Models (LLMs) offer a promising interface for intent-driven control of autonomous cyber-physical systems, but their direct use in mission-critical Internet of Battlefield Things (IoBT) environments raises significant safety, reliability, and policy-compliance concerns. This paper presents Policy-Aware Large Language Model Retrieval-Augmented Generation (referred to as PA-LLM-RAG), an edge-deployed LLM orchestration framework for IoBT mission control that integrates retrieval-augmented reasoning and independent command verification. PA-LLM-RAG combines a lightweight retrieval module, which grounds decisions in operational policies and telemetry, with a locally hosted LLM for mission planning and a secondary JudgeLLM that validates user-generated commands prior to execution. To evaluate PA-LLM-RAG, we implement a simulated IoBT environment using RoboDK and assess four open-source LLMs across controlled mission scenarios of increasing complexity, including baseline operations, threat detection, coverage recovery, multi-event coordination, and policy-violation requests. Experimental results demonstrate that the framework effectively detects policy-violating commands while maintaining low-latency responses suitable for edge deployment. Among the evaluated models, Gemma-2B achieves the highest overall reliability, with a latency of 4.17 s and a 100% success rate. The findings highlight a clear tradeoff between reasoning capacity and responsiveness across models and show that combining deterministic safeguards with JudgeLLM verification significantly improves reliability in LLM-driven IoBT orchestration.
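The flow described in the abstract (retrieve policy context, plan, apply deterministic safeguards, then ask a judge model) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `retrieve`, `deterministic_guard`, `orchestrate`, and `Verdict`, the word-overlap ranking, the substring-based rule check, and the order of the two checks are all assumptions made for illustration. The planner and judge are passed in as plain callables standing in for the locally hosted LLM and the JudgeLLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    approved: bool
    reason: str


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Rank documents by naive word overlap with the query.

    Hypothetical stand-in for the retrieval module; a real system
    would likely use embeddings or an index instead.
    """
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def deterministic_guard(command: str, forbidden: List[str]) -> Verdict:
    """Reject a command that contains any forbidden phrase (assumed rule format)."""
    lowered = command.lower()
    for phrase in forbidden:
        if phrase in lowered:
            return Verdict(False, f"blocked by rule: {phrase}")
    return Verdict(True, "passed deterministic rules")


def orchestrate(
    request: str,
    documents: List[str],
    forbidden: List[str],
    planner: Callable[[str, List[str]], str],
    judge: Callable[[str, List[str]], Verdict],
) -> Verdict:
    """Retrieve context, plan a command, then verify it before execution."""
    context = retrieve(request, documents)
    command = planner(request, context)
    guard = deterministic_guard(command, forbidden)
    if not guard.approved:
        # Assumption: the rule check runs first and short-circuits the judge.
        return guard
    return judge(command, context)
```

In use, `planner` and `judge` would wrap calls to the two models; for a quick check they can be replaced with stubs such as `lambda req, ctx: req` and `lambda cmd, ctx: Verdict(True, "judge approved")`. Running the rule check before the judge is one possible design choice: it keeps obviously prohibited commands from ever incurring a second model call.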