Diffusion models rely on a high-dimensional latent space of initial noise seeds, yet it remains unclear whether this space contains sufficient structure to predict properties of the generated samples, such as their classes. In this work, we investigate the emergence of latent structure through the lens of confidence scores assigned by a pre-trained classifier to generated samples. We show that while the latent space appears largely unstructured when considering all noise realizations, restricting attention to initial noise seeds that produce high-confidence samples reveals pronounced class separability. By comparing class predictability across noise subsets of varying confidence and examining the class separability of the latent space, we find evidence of class-relevant latent structure that becomes observable only under confidence-based filtering. As a practical implication, we discuss how confidence-based filtering enables conditional generation as an alternative to guidance-based methods.
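The confidence-based filtering described above can be illustrated with a minimal rejection-style sketch. It is not the authors' implementation: `toy_generator` and `toy_classifier` are hypothetical stand-ins for a deterministic diffusion sampler and a pre-trained classifier, and the two-class setup, latent dimension, and threshold are illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_generator(z):
    # Hypothetical stand-in for a deterministic sampler mapping a noise seed to a sample.
    return [math.tanh(v) for v in z]


def toy_classifier(x):
    # Hypothetical stand-in for a pre-trained classifier; returns two class probabilities.
    s = 4.0 * sum(x) / math.sqrt(len(x))
    return softmax([s, -s])


def filter_seeds(target_class, threshold, n_candidates, dim, rng,
                 generator=toy_generator, classifier=toy_classifier):
    """Keep noise seeds whose generated sample is assigned to target_class
    with classifier confidence of at least threshold."""
    kept = []
    for _ in range(n_candidates):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # initial noise seed
        probs = classifier(generator(z))
        pred = max(range(len(probs)), key=probs.__getitem__)
        if pred == target_class and probs[pred] >= threshold:
            kept.append((z, probs[pred]))
    return kept


if __name__ == "__main__":
    rng = random.Random(0)
    seeds = filter_seeds(target_class=0, threshold=0.9,
                         n_candidates=500, dim=8, rng=rng)
    print(f"kept {len(seeds)} of 500 candidate seeds")
```

In this sketch, conditioning on a class amounts to discarding every seed whose sample falls below the confidence threshold for that class, so the cost of generation grows as the threshold is raised and fewer seeds are accepted.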