In the growing domain of scientific machine learning, in-context operator learning has shown notable potential for learning operators from prompted data at the inference stage, without weight updates. However, the current model's heavy dependence on sensor data may cause it to overlook valuable human insight into the operator. To address this, we extend in-context operator learning to a multi-modal paradigm. We propose the use of "captions" to integrate human knowledge about the operator, expressed through natural language descriptions and equations. We show that this method not only broadens the flexibility and generality of physics-informed learning, but also significantly improves learning performance and reduces data requirements. Furthermore, we introduce a more efficient neural network architecture for multi-modal in-context operator learning, referred to as "ICON-LM", based on a language-model-like architecture. We demonstrate the viability of "ICON-LM" for scientific machine learning tasks, which opens a new path for the application of language models.