Accurate forecasting of air pollution is important for environmental monitoring and policy support, yet data-driven models often suffer from limited generalization in regions with sparse observations. This paper presents Meteorology-Driven GPT for Air Pollution (GPT4AP), a parameter-efficient multi-task forecasting framework based on a pre-trained GPT-2 backbone and Gaussian rank-stabilized low-rank adaptation (rsLoRA). The model freezes the self-attention and feed-forward layers and adapts lightweight positional and output modules, substantially reducing the number of trainable parameters. GPT4AP is evaluated on six real-world air quality monitoring datasets under few-shot, zero-shot, and long-term forecasting settings. In the few-shot regime using 10% of the training data, GPT4AP achieves an average MSE/MAE of 0.686/0.442, outperforming DLinear (0.728/0.530) and ETSformer (0.734/0.505). In zero-shot cross-station transfer, the proposed model attains an average MSE/MAE of 0.529/0.403, demonstrating improved generalization compared with existing baselines. In long-term forecasting with full training data, GPT4AP remains competitive, achieving an average MAE of 0.429, while specialized time-series models show slightly lower errors. These results indicate that GPT4AP provides a data-efficient forecasting approach that performs robustly under limited supervision and domain shift, while maintaining competitive accuracy in data-rich settings.
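As a rough illustration of why this kind of adaptation is parameter-efficient, the sketch below compares the standard LoRA scaling factor (alpha / r) with the rank-stabilized one (alpha / sqrt(r)), counts the parameters of a low-rank adapter against the full weight matrix it adapts, and restates the two error metrics quoted above. The rank, the alpha value, and the choice of the fused attention projection of GPT-2 small (768 x 2304) as the example matrix are illustrative assumptions, not settings reported in the paper, and all function names are hypothetical.

```python
import math


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling: the low-rank update is multiplied by alpha / r."""
    return alpha / rank


def rslora_scale(alpha: float, rank: int) -> float:
    """Rank-stabilized LoRA scaling: the update is multiplied by alpha / sqrt(r)."""
    return alpha / math.sqrt(rank)


def adapter_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a low-rank pair A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def mse(y_true, y_pred) -> float:
    """Mean squared error over two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred) -> float:
    """Mean absolute error over two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Assumed values for illustration only (not taken from the paper).
    alpha, rank = 16.0, 8
    d_in, d_out = 768, 2304  # fused QKV projection shape in GPT-2 small

    full = d_in * d_out
    low_rank = adapter_param_count(d_in, d_out, rank)

    print(f"LoRA scale:   {lora_scale(alpha, rank):.3f}")
    print(f"rsLoRA scale: {rslora_scale(alpha, rank):.3f}")
    print(f"adapter params: {low_rank} of {full} ({100 * low_rank / full:.2f}%)")

    # Toy forecast values, purely to show how the metrics are computed.
    observed = [1.0, 2.0, 3.0, 4.0]
    forecast = [1.5, 2.0, 2.0, 4.5]
    print(f"MSE={mse(observed, forecast):.3f}  MAE={mae(observed, forecast):.3f}")
```

Under standard LoRA scaling the factor shrinks in proportion to 1/r as the rank grows, whereas the rank-stabilized factor shrinks only in proportion to 1/sqrt(r); that difference in how the update magnitude behaves at larger ranks is the motivation for the rank-stabilized variant.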