Remote sensing question answering (RS-QA) often requires more than direct semantic prediction, especially in large-scale forest scenes where ecological analysis involves multi-step filtering, numerical aggregation, neighborhood reasoning, and verifiable evidence. We introduce ForestHG-Trace, a framework for traceable long-horizon ecological reasoning over forest environments. It represents multimodal NEON forest scenes as ecological hypergraphs, where tree instances, spatial units, semantic groups, and neighborhood relations support higher-order reasoning beyond pairwise scene graphs. An LLM-guided agent then invokes deterministic tools for reading, filtering, expansion, aggregation, comparison, and auditing, producing replayable execution traces and compact evidence records rather than only free-form answers. We further construct ForestTraceQA, an executable benchmark for evaluating ecological QA across diverse task types and reasoning depths. Experiments show that ForestHG-Trace substantially improves answer accuracy and execution faithfulness over single-step baselines and scene-graph agents, while highlighting execution depth as the main bottleneck for long-horizon ecological QA.
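To make the hypergraph-plus-tools idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy scene in which each hyperedge groups an arbitrary set of tree instances, three deterministic tools (read, filter, aggregate), and a trace that records every call so the answer can be replayed and checked. All names here (`Tree`, `Hypergraph`, `Trace`, `TOOLS`, `replay`, the tree attributes, and the sample values) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Any, Callable


@dataclass(frozen=True)
class Tree:
    # Hypothetical per-tree attributes; the real schema is not given in the abstract.
    tree_id: str
    species: str
    height_m: float


@dataclass
class Hypergraph:
    trees: dict[str, Tree]
    # A hyperedge links any number of trees (a plot, a species group, a neighborhood),
    # unlike a pairwise scene-graph edge that links exactly two nodes.
    hyperedges: dict[str, frozenset[str]]


# Deterministic tools: same inputs always give the same output.
def read(hg: Hypergraph, edge: str) -> list[str]:
    return sorted(hg.hyperedges[edge])


def filter_min_height(hg: Hypergraph, ids: list[str], min_height_m: float) -> list[str]:
    return [i for i in ids if hg.trees[i].height_m >= min_height_m]


def mean_height(hg: Hypergraph, ids: list[str]) -> float:
    return round(mean(hg.trees[i].height_m for i in ids), 2)


TOOLS: dict[str, Callable[..., Any]] = {
    "read": read,
    "filter_min_height": filter_min_height,
    "mean_height": mean_height,
}


@dataclass
class Trace:
    # Each step stores (tool name, keyword arguments, result).
    steps: list[tuple[str, dict[str, Any], Any]] = field(default_factory=list)

    def call(self, hg: Hypergraph, tool: str, **args: Any) -> Any:
        result = TOOLS[tool](hg, **args)
        self.steps.append((tool, args, result))
        return result


def replay(hg: Hypergraph, trace: Trace) -> bool:
    # Re-run every recorded step and check that it reproduces the stored result.
    return all(TOOLS[tool](hg, **args) == result for tool, args, result in trace.steps)


# Toy scene with made-up values.
hg = Hypergraph(
    trees={
        "t1": Tree("t1", "oak", 20.0),
        "t2": Tree("t2", "pine", 12.5),
        "t3": Tree("t3", "oak", 25.0),
        "t4": Tree("t4", "pine", 30.0),
    },
    hyperedges={
        "plot_A": frozenset({"t1", "t2", "t3"}),
        "species_oak": frozenset({"t1", "t3"}),
    },
)

# Question: "What is the mean height of trees in plot A that are at least 15 m tall?"
trace = Trace()
ids = trace.call(hg, "read", edge="plot_A")
tall = trace.call(hg, "filter_min_height", ids=ids, min_height_m=15.0)
answer = trace.call(hg, "mean_height", ids=tall)
```

In this sketch the sequence of tool calls is hard-coded; in the framework described above, an LLM-guided agent would choose the calls, while the tools and the trace stay deterministic so that the recorded steps can be re-executed as evidence for the answer.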