All-in-One Slider for Attribute Manipulation in Diffusion Models

Text-to-image (T2I) diffusion models have made significant strides in generating high-quality images. However, progressively manipulating specific attributes of a generated image to match user expectations remains challenging, particularly for content with rich details such as human faces. Some studies have attempted to address this by training slider modules. However, they follow a **One-for-One** paradigm, in which an independent slider is trained for each attribute, so every new attribute requires additional training. This not only leads to redundant parameters that accumulate across sliders but also restricts the flexibility of practical applications and the scalability of attribute manipulation. To address this issue, we introduce the **All-in-One** Slider, a lightweight module that decomposes the text embedding space into sparse, semantically meaningful attribute directions. Once trained, it functions as a general-purpose slider, enabling interpretable, fine-grained, and continuous control over various attributes. Moreover, by recombining the learned directions, the All-in-One Slider supports the composition of multiple attributes and zero-shot manipulation of unseen attributes (e.g., races and celebrities). Extensive experiments demonstrate that our method enables accurate and scalable attribute manipulation, achieving notable improvements over previous methods. Furthermore, our method can be integrated with an inversion framework to perform attribute manipulation on real images, broadening its applicability to real-world scenarios. The code is available on [our project page](https://github.com/ywxsuperstar/ksaedit).
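
To make the idea of sliding along sparse attribute directions concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the architecture, so the top-k sparsification, the random direction dictionary, and the names `top_k_sparse` and `slide` are all illustrative assumptions. It only shows how a text embedding could be expressed with a few active directions and then shifted along one or more of them by a continuous strength.

```python
import numpy as np


def top_k_sparse(codes, k):
    """Keep the k largest-magnitude entries along the last axis; zero the rest."""
    idx = np.argsort(-np.abs(codes), axis=-1)[..., :k]
    sparse = np.zeros_like(codes)
    kept = np.take_along_axis(codes, idx, axis=-1)
    np.put_along_axis(sparse, idx, kept, axis=-1)
    return sparse


def slide(embedding, directions, attribute_ids, strengths):
    """Shift an embedding along the selected directions by the given strengths.

    Passing several ids composes multiple attributes in one edit.
    """
    edited = embedding.copy()
    for i, s in zip(attribute_ids, strengths):
        edited = edited + s * directions[i]
    return edited


rng = np.random.default_rng(0)
dim, n_dirs = 16, 64  # hypothetical sizes, chosen only for the demo

# Hypothetical dictionary of unit-norm directions (one per row).
# Random here; in a trained module these would be learned.
directions = rng.normal(size=(n_dirs, dim))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Stand-in for a text embedding produced by a text encoder.
embedding = rng.normal(size=(dim,))

# Sparse code: only 4 of the 64 directions stay active.
codes = top_k_sparse(directions @ embedding, k=4)

# Continuous control of one attribute, then a composition of two.
single = slide(embedding, directions, [3], [2.0])
combined = slide(embedding, directions, [3, 7], [2.0, -1.0])
```

In a full pipeline the edited embedding would stand in for the original prompt embedding passed to the diffusion model; varying the strength smoothly is what gives the slider behaviour described in the abstract.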
