Wildfires threaten biodiversity, carbon stocks, and management capacity in the Brazilian Cerrado, where managers of Conservation Units (CUs) and their official buffer zones must allocate prevention resources under a strong dry-season fire regime. This work develops a retrospective daily active-fire detection benchmark for the Cerrado portion of Minas Gerais, Brazil, using INPE BDQueimadas reference satellite labels (AQUA_M-T), constrained pseudo-absences filtered with same-year MapBiomas Collection 9 land cover, and four nested covariate stages extracted through Google Earth Engine. Logistic Regression, Random Forest, and XGBoost are evaluated under five-fold time-series cross-validation on a global training base and on independent, imbalanced test sets spatially held out at Parque Estadual do Pau Furado and Parque Estadual da Serra do Cabral, including their official buffer zones. AUC-PR is the primary metric, with AUC-ROC, thresholded precision and recall, SHAP explanations, and retrospective score maps used as complementary diagnostics. In temporal cross-validation, all three model families reached their highest mean AUC-PR at the complete temporal-memory stage. Performance on the held-out AOIs was weaker under the stricter 1:100 prevalence design: Random Forest peaked at Stage 3 in both AOIs, while XGBoost score maps showed high recall at the cost of a high warning volume. The resulting baseline provides a reproducible reference for comparing atmospheric, surface, static spatial, and short-term memory covariates in daily, CU-scale ranking of active-fire detections. Because several stages use same-day covariates, the study is a retrospective classification benchmark rather than a prospective forecast.
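The evaluation protocol described above can be illustrated with a minimal, standard-library Python sketch. It is not the study's pipeline: the helper names `average_precision` and `forward_chaining_splits` are hypothetical, the labels and scores are synthetic, and the fold layout is only one common way to build time-ordered folds, assumed here because the abstract does not specify the splitter. Average precision is used as a step-wise estimator of AUC-PR. The demo reproduces the 1:100 design, under which a random ranking is expected to score near the positive prevalence (10/1010, about 0.0099), which is why AUC-PR is more informative than AUC-ROC at this imbalance.

```python
import random


def average_precision(y_true, scores):
    """Step-wise average precision, a common estimator of AUC-PR.

    Ties in `scores` are broken by input order (an assumption of this sketch).
    """
    # Rank samples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


def forward_chaining_splits(n_samples, n_splits=5):
    """Yield (train_idx, val_idx) pairs over time-ordered samples.

    Each validation block lies strictly after its training block, so no
    future observation leaks into training.
    """
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = k * fold
        val_end = n_samples if k == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, val_end))


# Synthetic 1:100 test set: 10 fire detections, 1000 pseudo-absences.
rng = random.Random(0)
labels = [1] * 10 + [0] * 1000
rng.shuffle(labels)

perfect_scores = [float(label) for label in labels]
random_scores = [rng.random() for _ in labels]

print("prevalence:", sum(labels) / len(labels))
print("AP, perfect ranking:", average_precision(labels, perfect_scores))
print("AP, random ranking:", average_precision(labels, random_scores))

# Five time-ordered folds over 12 daily samples.
for train_idx, val_idx in forward_chaining_splits(12, n_splits=5):
    print("train up to", train_idx[-1], "validate on", val_idx)
```

In practice a library implementation would normally replace both helpers; the hand-written versions are kept only so the sketch runs without dependencies.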