Generative models have recently achieved remarkable success and widespread adoption in society, yet they often struggle to generate realistic and accurate outputs. This challenge extends beyond language and vision into fields like engineering design, where safety-critical engineering standards and non-negotiable physical laws tightly constrain what outputs are considered acceptable. In this work, we introduce a novel training method to guide a generative model toward constraint-satisfying outputs using "negative data": examples of what to avoid. Our negative-data generative model (NDGM) formulation easily outperforms classic models, generating 1/6 as many constraint-violating samples using 1/8 as much data in certain problems. It also consistently outperforms other baselines, achieving a balance between constraint satisfaction and distributional similarity that is unsurpassed by any other model in 12 of the 14 problems tested. This widespread superiority is rigorously demonstrated across numerous synthetic tests and real engineering problems, such as ship hull synthesis with hydrodynamic constraints and vehicle design with impact safety constraints. Our benchmarks showcase both the best-in-class performance of our new NDGM formulation and the overall dominance of NDGMs versus classic generative models. We publicly release the code and benchmarks at https://github.com/Lyleregenwetter/NDGMs.