The rapid advancement of Low-Altitude Economy Networks (LAENets) has enabled a variety of applications, including aerial surveillance, environmental sensing, and semantic data collection. To support these scenarios, unmanned aerial vehicles (UAVs) equipped with onboard vision-language models (VLMs) offer a promising solution for real-time multimodal inference. However, ensuring both inference accuracy and communication efficiency remains a significant challenge because onboard resources are limited and network conditions are dynamic. In this paper, we first propose a UAV-enabled LAENet system model that jointly captures UAV mobility, user-UAV communication, and the onboard visual question answering (VQA) pipeline. Based on this model, we formulate a mixed-integer non-convex optimization problem that minimizes task latency and power consumption under user-specific accuracy constraints. To solve the problem, we design a hierarchical optimization framework composed of two parts: (i) an Alternating Resolution and Power Optimization (ARPO) algorithm for resource allocation under accuracy constraints, and (ii) a Large Language Model-augmented Reinforcement Learning Approach (LLaRA) for adaptive UAV trajectory optimization. The large language model (LLM) acts as an expert that refines the reward design of the reinforcement learning agent offline, so it introduces no additional latency in real-time decision-making. Numerical results demonstrate the efficacy of the proposed framework in improving inference performance and communication efficiency under dynamic LAENet conditions.
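To make the idea of alternating between a discrete resolution choice and a continuous power choice concrete, the following is a minimal toy sketch. It is not the ARPO algorithm itself, whose details are not given here. Every model and constant in it (the accuracy curve, the Shannon-rate upload latency, the inference-time term, the weight on power, and all function names) is a hypothetical assumption chosen only for illustration.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
RESOLUTIONS = [224, 336, 448, 672]   # candidate image side lengths (pixels)
BITS_PER_PIXEL = 8.0                 # assumed upload payload per pixel
BANDWIDTH_HZ = 1e6                   # assumed channel bandwidth
CHANNEL_GAIN = 1e-7                  # assumed user-UAV channel gain
NOISE_W = 1e-9                       # assumed noise power
P_MIN, P_MAX = 0.01, 1.0             # assumed transmit power bounds (W)
WEIGHT_POWER = 2.0                   # assumed trade-off weight on power


def accuracy(res):
    """Hypothetical saturating accuracy curve: higher resolution, higher accuracy."""
    return 1.0 - math.exp(-res / 200.0)


def latency(res, power):
    """Upload time at the Shannon rate plus an assumed inference time."""
    rate = BANDWIDTH_HZ * math.log2(1.0 + power * CHANNEL_GAIN / NOISE_W)
    upload = res * res * BITS_PER_PIXEL / rate
    inference = 1e-7 * res * res
    return upload + inference


def cost(res, power):
    """Weighted sum of latency and transmit power."""
    return latency(res, power) + WEIGHT_POWER * power


def best_resolution(power, acc_min):
    """Discrete step: cheapest resolution that meets the accuracy constraint."""
    feasible = [r for r in RESOLUTIONS if accuracy(r) >= acc_min]
    if not feasible:
        raise ValueError("no resolution satisfies the accuracy constraint")
    return min(feasible, key=lambda r: cost(r, power))


def best_power(res, iters=100):
    """Continuous step: ternary search, valid because this toy cost is convex in power."""
    lo, hi = P_MIN, P_MAX
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(res, m1) < cost(res, m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


def alternate(acc_min, max_rounds=20, tol=1e-9):
    """Alternate the two steps until the cost stops improving."""
    power = P_MAX
    prev = float("inf")
    for _ in range(max_rounds):
        res = best_resolution(power, acc_min)
        power = best_power(res)
        current = cost(res, power)
        if prev - current < tol:
            break
        prev = current
    return res, power, current


if __name__ == "__main__":
    res, power, value = alternate(acc_min=0.8)
    print(f"resolution={res}, power={power:.4f} W, cost={value:.4f}")
```

In this toy model the cost grows with resolution at any power level, so the resolution step always selects the smallest feasible resolution and the alternation settles after the first round. A realistic formulation, with coupled constraints across users, would generally need more rounds.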