Generative modeling has advanced rapidly in recent years, particularly in generating natural images and art. Recent techniques can create complex visual compositions with impressive realism and quality. However, state-of-the-art methods have focused on the narrow domain of natural images, while other distributions remain unexplored. In this paper, we introduce the problem of text-to-figure generation, that is, creating the scientific figures of papers from text descriptions. We present FigGen, a diffusion-based approach to text-to-figure generation, and discuss the main challenges of the proposed task. Code and models are available at https://github.com/joanrod/figure-diffusion