Speaking the Language of Science: Toward a General-Purpose Generative Foundation Model for the Natural Sciences

In this report, we present LOGOS (Language Of Generative Objects in Science), a scientific generative language model that unifies heterogeneous tasks across the natural sciences within a single autoregressive framework based on a shared scientific grammar. LOGOS encodes diverse scientific objects and their spatial interactions as token sequences over a common vocabulary. By representing spatial contact and constraint patterns as discrete tokens, the model captures complex structural interactions in a purely sequential manner, without relying on explicit coordinates or geometric neural networks. This unified representation allows a wide range of downstream tasks to be formulated consistently as next-token prediction in the same grammar space, which closely aligns continued multi-domain pre-training with downstream objectives. Across diverse tasks, LOGOS consistently matches or outperforms domain-specific baselines, providing preliminary evidence for the feasibility of "one model fits all" in the natural sciences. We train LOGOS models at three scales (1B, 3B, and 8B parameters) and find a consistent positive correlation between model size and performance. These results suggest that the future of AI for Science (AI4S) may not lie in building a technical stack separate from large language models (LLMs). Instead, it may depend on deeply aligning scientific foundation models with LLMs through shared architectures, shared training paradigms, and shared inference infrastructure, so that LLMs can truly become a new entry point for AI4S. We release the model weights and associated resources to facilitate further research.
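To make the idea of encoding spatial contacts as discrete tokens concrete, the following is a minimal illustrative sketch and not the LOGOS grammar itself. The token names (`<obj>`, `<contacts>`, `<i0>`), the function `contacts_to_tokens`, and the 4.0 distance cutoff are all assumptions introduced here for illustration. The sketch turns a small set of labeled objects with 3D positions into a flat token sequence in which coordinates no longer appear, only which pairs are in contact.

```python
from itertools import combinations
from math import dist


def contacts_to_tokens(symbols, coords, cutoff=4.0):
    """Serialize objects and their pairwise contacts as a flat token list.

    Hypothetical token format for illustration; not the LOGOS vocabulary.
    """
    # Object symbols come first, wrapped in assumed delimiter tokens.
    tokens = ["<obj>"] + list(symbols) + ["</obj>", "<contacts>"]
    # Each contact is emitted as a pair of discrete index tokens.
    for i, j in combinations(range(len(symbols)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            tokens += [f"<i{i}>", f"<i{j}>"]
    tokens.append("</contacts>")
    return tokens


# Toy input: three objects on a line; only the first two are within the cutoff.
symbols = ["A", "G", "L"]
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
tokens = contacts_to_tokens(symbols, coords)
print(tokens)
```

A sequence of this kind can be consumed by any autoregressive language model as ordinary next-token prediction; how LOGOS actually defines its vocabulary and ordering is not specified in this abstract.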
