Scientific discovery workflows rely heavily on lab notes, where researchers record observations, interpret uncertain results, and plan follow-up experiments. Unlike the polished final results presented in publications, lab notes preserve evolving scientific reasoning and the author's uncertainty, giving AI an opportunity to engage in scientific exploration more comprehensively and at greater depth. However, most prior work on scientific text focuses on papers, protocols, or structured databases, leaving informal lab notes underexplored as inputs to AI agents for science. This gap matters because lab notes often intermingle validated observations, tentative judgments, and possible next experimental steps within the same passage. If these signals are conflated, an AI agent may mistake uncertain scientific judgments for confirmed conclusions or executable actions. To address this, we present Notes2Skills, a two-stage framework that turns lab notebooks into verifiable skills for scientific AI agents while preserving the author's certainty. Across seven conditions and three wet-lab sessions, Notes2Skills is the only configuration that neither mistakes uncertain notes for firm instructions nor discards firm ones. We show that certainty preservation is the missing piece between lab notebooks and reliable agent skills, opening a path toward safer AI co-scientist systems.