Recent advances in machine learning and AI, including Generative AI and LLMs, are disrupting technological innovation, product development, and society as a whole. AI's contribution to technology can come from multiple approaches, ranging from pattern recognition and classification to generative models, that require access to large training data sets and clear performance evaluation criteria. Yet AI has contributed less to fundamental science, in part because large, high-quality data sets for scientific practice and model discovery are more difficult to access. Generative AI in general, and Large Language Models in particular, may represent an opportunity to augment and accelerate the discovery of fundamental science with quantitative models. Here we explore aspects of an AI-driven, automated, closed-loop approach to scientific discovery, including self-driven hypothesis generation and open-ended autonomous exploration of the hypothesis space. Integrating AI-driven automation into the practice of science would mitigate current problems, including the difficulty of replicating findings and of producing data systematically, and would ultimately democratise the scientific process. Realising these possibilities requires a vision for augmented AI coupled with a diversity of AI approaches able to deal with fundamental aspects of causality analysis and model discovery while enabling unbiased search across the space of putative explanations. These advances hold the promise of unleashing AI's potential for searching and discovering the fundamental structure of our world beyond what human scientists have been able to achieve. Such a vision would push the boundaries of new fundamental science rather than merely automate current workflows, and would open doors for technological innovation to tackle some of the greatest challenges facing humanity today.