Low-cost air pollution sensor networks are increasingly being deployed globally, supplementing sparse regulatory monitoring with localized air quality data. In some areas, like Baltimore, Maryland, there are only a few regulatory (reference) devices but multiple low-cost networks. While many methods are available to calibrate data from each network individually, calibrating each network separately leads to conflicting air quality predictions. We develop a general Bayesian spatial filtering model that combines data from multiple networks and reference devices, providing dynamic calibrations (informed by the latest reference data) and unified predictions (combining information from all available sensors) for the entire region. This method accounts for network-specific bias and noise (observation models), as different networks can use different types of sensors, and uses a Gaussian process (state-space model) to capture spatial correlations. We apply the method to calibrate PM$_{2.5}$ data from Baltimore in June and July 2023 -- a period including days of hazardous concentrations due to wildfire smoke. Our method helps mitigate the effects of preferential sampling of one network in Baltimore, and it results in better predictions and narrower confidence intervals. Our approach can be used to calibrate low-cost air pollution sensor data in Baltimore and in any other area with multiple low-cost networks.
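
To make the idea of network-specific observation models layered on a shared Gaussian process concrete, here is a minimal sketch. It is not the authors' model. It assumes a single time step (no filtering), a squared-exponential kernel, and an affine observation model per network whose gain, offset, and noise variance are treated as known rather than estimated. The function names `sq_exp_kernel` and `fuse_networks`, along with all sensor locations, readings, and parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def sq_exp_kernel(a, b, variance=1.0, length_scale=1.0):
    """Squared-exponential covariance between two sets of 2-D locations."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)


def fuse_networks(locs, y, network, gain, offset, noise_var, grid,
                  prior_mean=10.0, variance=25.0, length_scale=2.0):
    """Posterior mean and variance of the latent field on `grid`.

    Assumed observation model for sensor i in network k = network[i]:
        y_i = offset[k] + gain[k] * x(s_i) + eps_i,  eps_i ~ N(0, noise_var[k])
    where x is a Gaussian process with constant mean `prior_mean`.
    """
    g = gain[network]                      # per-sensor gain
    H = np.diag(g)                         # latent field -> expected reading
    K_ss = sq_exp_kernel(locs, locs, variance, length_scale)
    K_gs = sq_exp_kernel(grid, locs, variance, length_scale)
    K_gg = sq_exp_kernel(grid, grid, variance, length_scale)
    R = np.diag(noise_var[network])        # per-sensor noise variance

    S = H @ K_ss @ H.T + R                 # covariance of the readings
    resid = y - (offset[network] + g * prior_mean)
    C = K_gs @ H.T                         # cov(grid field, readings)

    mean = prior_mean + C @ np.linalg.solve(S, resid)
    cov = K_gg - C @ np.linalg.solve(S, C.T)
    return mean, np.diag(cov)


# Hypothetical data: network 0 is a reference monitor, 1 and 2 are low-cost.
locs = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 2.0], [3.0, 1.0]])
network = np.array([0, 1, 1, 2])
gain = np.array([1.0, 1.4, 0.8])
offset = np.array([0.0, 2.0, -1.0])
noise_var = np.array([0.01, 4.0, 9.0])
y = np.array([12.0, 19.5, 8.0, 9.5])
grid = np.array([[0.0, 0.0], [1.5, 1.0], [3.0, 3.0]])

post_mean, post_var = fuse_networks(locs, y, network, gain, offset,
                                    noise_var, grid)
print(post_mean, post_var)
```

In this sketch every sensor informs the same latent field, so the low-noise reference monitor dominates near its location while the noisier low-cost readings contribute mainly where no reference device is nearby. A filtering version would repeat this update over time and re-estimate the per-network parameters as new reference data arrive.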