Large-scale livestock operations pose significant risks to human health and the environment, and are themselves vulnerable to threats such as infectious diseases and extreme weather events. As the number of such operations continues to grow, accurate and scalable mapping has become increasingly important. In this work, we present an infrastructure-first, explainable pipeline for identifying and characterizing Concentrated Animal Feeding Operations (CAFOs) from aerial and satellite imagery. Our method (1) detects candidate infrastructure (e.g., barns, feedlots, manure lagoons, silos) with a domain-tuned YOLOv8 detector, derives SAM2 masks from the detected boxes, and filters the masks using component-specific criteria, (2) extracts structured descriptors (e.g., counts, areas, orientations, and spatial relations) and fuses them with deep visual features using a lightweight spatial cross-attention classifier, and (3) outputs both CAFO type predictions and mask-level attributions that link decisions to visible infrastructure. Through comprehensive evaluation, we show that our approach achieves state-of-the-art performance, with Swin-B+PRISM-CAFO surpassing the best-performing baseline by up to 15\%. Beyond strong predictive performance across diverse U.S. regions, we run systematic gradient--activation analyses that quantify the impact of domain priors and show ho
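Step (2) of the pipeline turns per-component masks into structured descriptors. The sketch below illustrates what such descriptors could look like; it is not the authors' implementation. It assumes each component arrives as a label plus a binary mask (a 2D list of 0/1 values), and the names `rect_mask`, `mask_descriptor`, and `describe_scene` are hypothetical. Orientation is estimated from second-order image moments, which is one common choice and not necessarily the one used in the paper.

```python
import math
from collections import Counter


def rect_mask(height, width, r0, r1, c0, c1):
    """Toy binary mask with a filled rectangle on rows r0..r1, cols c0..c1 (inclusive)."""
    return [[1 if r0 <= r <= r1 and c0 <= c <= c1 else 0 for c in range(width)]
            for r in range(height)]


def mask_descriptor(mask):
    """Area (pixels), centroid (x, y), and principal-axis orientation of one mask."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("empty mask")
    cy = sum(r for r, _ in pixels) / area
    cx = sum(c for _, c in pixels) / area
    # Central second-order moments of the pixel coordinates.
    mu20 = sum((c - cx) ** 2 for _, c in pixels) / area
    mu02 = sum((r - cy) ** 2 for r, _ in pixels) / area
    mu11 = sum((c - cx) * (r - cy) for r, c in pixels) / area
    # Angle of the major axis relative to the image x-axis (y points down).
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return {"area": area, "centroid": (cx, cy), "orientation_deg": math.degrees(theta)}


def describe_scene(components):
    """components: list of (label, mask) pairs. Returns counts, per-mask stats, distances."""
    descs = [(label, mask_descriptor(mask)) for label, mask in components]
    counts = Counter(label for label, _ in descs)
    distances = {}
    for i in range(len(descs)):
        for j in range(i + 1, len(descs)):
            xi, yi = descs[i][1]["centroid"]
            xj, yj = descs[j][1]["centroid"]
            # Simple spatial relation: centroid-to-centroid distance in pixels.
            distances[(i, j)] = math.hypot(xi - xj, yi - yj)
    return {"counts": dict(counts), "components": descs, "centroid_distances": distances}


# Toy scene: one elongated "barn" and one square "lagoon" on an 8x8 tile.
scene = describe_scene([
    ("barn", rect_mask(8, 8, 1, 2, 1, 6)),
    ("lagoon", rect_mask(8, 8, 5, 7, 1, 3)),
])
```

In a real pipeline the masks would come from the segmentation stage and areas would be converted from pixels to ground units using the imagery resolution; the resulting descriptor vector would then be passed to the classifier alongside the visual features.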