The increasing availability of high-resolution satellite imagery, together with advances in deep learning, creates new opportunities for forest monitoring workflows. Two central challenges in this domain are pixel-level change detection and semantic change interpretation, particularly for complex forest dynamics. While large language models (LLMs) are increasingly adopted for data exploration, their integration with vision-language models (VLMs) for remote sensing image change interpretation (RSICI) remains underexplored, especially beyond urban environments. This paper introduces Forest-Chat, an LLM-driven agent for forest change analysis that supports natural-language querying across multiple RSICI tasks, including change detection and captioning, object counting, deforestation characterisation, and change reasoning. Forest-Chat combines a multi-level change interpretation (MCI) vision-language backbone with LLM-based orchestration, and incorporates zero-shot change detection via AnyChange as well as multimodal LLM-based zero-shot change captioning and caption refinement. To support adaptation and evaluation in forest environments, we introduce the Forest-Change dataset, comprising bi-temporal satellite imagery, pixel-level change masks, and semantic change captions produced through human annotation and rule-based methods. Forest-Chat achieves mIoU and BLEU-4 scores of 67.10% and 40.17%, respectively, on Forest-Change, and 88.13% and 34.41% on LEVIR-MCI-Trees, a tree-focused subset of LEVIR-MCI. In the zero-shot setting, it achieves mIoU and BLEU-4 scores of 60.15% and 34.00% on Forest-Change, and 47.32% and 18.23% on LEVIR-MCI-Trees. Further experiments show that caption refinement helps inject geographic domain knowledge into supervised captions, and that the system's label-domain transfer to JL1-CD-Trees is limited. These findings demonstrate that interactive, LLM-driven systems can support accessible and interpretable forest change analysis.