GUIDE: Towards Scalable Advising for Research Ideas

The field of AI research is advancing at an unprecedented pace, enabling automated hypothesis generation and experimental design across diverse domains such as biology, mathematics, and artificial intelligence. Despite these advancements, there remains a significant gap in the availability of scalable advising systems capable of providing high-quality, well-reasoned feedback to refine proposed hypotheses and experimental designs. To address this challenge, we explore key factors that underlie the development of robust advising systems, including model size, context length, confidence estimation, and structured reasoning processes. Our findings reveal that a relatively small model, when equipped with a well-compressed literature database and a structured reasoning framework, can outperform powerful general-purpose language models such as Deepseek-R1 in terms of acceptance rates for self-ranked top-30% submissions to ICLR 2025. Moreover, when limited to high-confidence predictions, our system achieves an acceptance rate exceeding 90% on the ICLR 2025 test set, underscoring its potential to significantly enhance the quality and efficiency of hypothesis generation and experimental design. The code is released at https://github.com/HowardLiu0830/GUIDE-Research-Idea-Evaluation.
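The two headline metrics in the abstract (acceptance rate among a model's self-ranked top 30% of submissions, and acceptance rate when restricted to high-confidence predictions) can be illustrated with a minimal sketch. The abstract does not specify the exact evaluation protocol, so everything below is an assumption made for illustration: the `Prediction` record, its `score` and `confidence` fields, the 0.9 confidence threshold, and the toy data are hypothetical and are not taken from the GUIDE code base.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    """Hypothetical record for one submission (not from the GUIDE repository)."""
    score: float       # assumed model-assigned quality score used for ranking
    confidence: float  # assumed confidence in [0, 1] attached to the prediction
    accepted: bool     # ground-truth accept/reject decision


def top_fraction_acceptance_rate(preds: List[Prediction], fraction: float = 0.30) -> float:
    """Acceptance rate among the highest-scored `fraction` of submissions."""
    if not preds:
        raise ValueError("no predictions given")
    # Keep at least one submission; round() avoids float truncation surprises.
    k = max(1, round(len(preds) * fraction))
    ranked = sorted(preds, key=lambda p: p.score, reverse=True)
    top = ranked[:k]
    return sum(p.accepted for p in top) / len(top)


def high_confidence_acceptance_rate(
    preds: List[Prediction], threshold: float = 0.9
) -> Optional[float]:
    """Acceptance rate among predictions with confidence >= threshold.

    Returns None when no prediction clears the threshold.
    """
    selected = [p for p in preds if p.confidence >= threshold]
    if not selected:
        return None
    return sum(p.accepted for p in selected) / len(selected)


# Toy data, invented purely to exercise the two functions.
toy = [
    Prediction(0.95, 0.97, True),
    Prediction(0.90, 0.92, True),
    Prediction(0.85, 0.60, False),
    Prediction(0.70, 0.55, True),
    Prediction(0.60, 0.50, False),
    Prediction(0.50, 0.45, False),
    Prediction(0.40, 0.40, False),
    Prediction(0.30, 0.35, True),
    Prediction(0.20, 0.30, False),
    Prediction(0.10, 0.20, False),
]

top30 = top_fraction_acceptance_rate(toy)        # top 3 of 10 by score
high_conf = high_confidence_acceptance_rate(toy)  # confidence >= 0.9
```

On this toy set the top-30% slice contains three submissions, two of them accepted, while the high-confidence subset contains two submissions, both accepted. The sketch only shows why filtering on confidence can raise the measured acceptance rate at the cost of covering fewer submissions; it says nothing about how GUIDE itself produces scores or confidences.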

