Foundation models (FMs) have emerged as a promising way to harness distributed and diverse environmental data, leveraging prior knowledge to capture the complex temporal and spatial correlations within heterogeneous datasets. Unlike distributed learning frameworks such as federated learning, which often struggle with multimodal data, FMs can transform diverse inputs into embeddings. This facilitates integrating information from multiple modalities and applying prior learning to new domains. However, deploying FMs in resource-constrained edge systems poses significant challenges. To this end, we introduce CoRAST, a novel learning framework that uses FMs for enhanced analysis of distributed, correlated heterogeneous data. Using a server-based FM, CoRAST can exploit existing environmental information to extract temporal, spatial, and cross-modal correlations among sensor data. This enables CoRAST to offer context-aware insights for localized client tasks through FM-powered global representation learning. Our evaluation on a real-world weather dataset demonstrates CoRAST's ability to exploit correlated heterogeneous data through environmental representation learning, reducing forecast errors by up to 50.3% compared to the baselines.