High-quality scientific illustrations are crucial for communicating complex scientific and technical concepts, yet creating them manually remains a well-recognized bottleneck in both academia and industry. We present FigureBench, the first large-scale benchmark for generating scientific illustrations from long-form scientific texts. It contains 3,300 high-quality scientific text-figure pairs covering diverse text-to-illustration tasks drawn from scientific papers, surveys, blogs, and textbooks. We also propose AutoFigure, the first agentic framework that automatically generates high-quality scientific illustrations from long-form scientific text. Before rendering the final result, AutoFigure engages in extensive thinking, recombination, and validation to produce a layout that is both structurally sound and aesthetically refined, so that the resulting illustration is both structurally complete and aesthetically appealing. Using the high-quality data from FigureBench, we conduct extensive experiments comparing AutoFigure against various baseline methods. The results show that AutoFigure consistently surpasses all baseline methods, producing publication-ready scientific illustrations. The code, dataset, and Hugging Face Space are available at https://github.com/ResearAI/AutoFigure.