We present DreamPolisher, a novel Gaussian Splatting-based method with geometric guidance, tailored to learn cross-view consistency and intricate detail from textual descriptions. While recent progress in text-to-3D generation has been promising, prevailing methods often fail to ensure view consistency and textural richness. This problem is particularly noticeable for methods that work with text input alone. To address this, we propose a two-stage Gaussian Splatting-based approach that enforces geometric consistency among views. First, a coarse 3D generation is refined via geometric optimization. Then, we use a ControlNet-driven refiner coupled with the geometric consistency term to improve both the texture fidelity and the overall consistency of the generated 3D asset. Empirical evaluations across diverse textual prompts spanning various object categories demonstrate the efficacy of DreamPolisher in generating consistent and realistic 3D objects that align closely with the semantics of the textual instructions.