In this paper, we introduce DenseControl, a novel pipeline for generating dense crowd images. Specifically, DenseControl positions and sizes each generated instance so that it aligns precisely with predefined coordinates and scales. On top of this, it further allows control over the background, style, and attributes of instances. DenseControl is motivated by two main challenges in synthesizing crowd images: embedding the control signal, and maintaining topological integrity when imparting instance scale guidance. To address these, we first introduce the Isolated Object Embedding (IOE) map, a novel representation that facilitates spatial location control while mitigating the difficulty of learning projections for the model. Second, we propose an Implicit Scale Embedding (ISE) strategy that integrates seamlessly with the IOE map to encode precise scale information. To make the combination of ISE and the IOE map more effective, we incorporate a Position Shortcut mechanism that enhances cross-attention to alleviate projection challenges. We evaluate DenseControl through two lenses: synthesis quality and applicability to potential applications. Experiments across different control conditions demonstrate that DenseControl achieves state-of-the-art results in dense crowd image synthesis. Furthermore, we showcase applications in augmenting crowd analysis under data scarcity, transfer learning, and weather generalization scenarios to highlight the practical utility of DenseControl. The codebase will be released.