Diffusion models do not recover semantic structure uniformly over time. Instead, samples transition from semantic ambiguity to class commitment within a narrow window. Recent theoretical work attributes this transition to dynamical instabilities along class-separating directions, but practical methods for detecting and exploiting these windows in trained models remain limited. We show that tracking the class-conditional entropy of a latent semantic variable given the noisy state provides a reliable signature of these transition regimes. Restricting the entropy to semantic partitions additionally resolves semantic decisions at different levels of abstraction. We analyze this behavior in high-dimensional Gaussian mixture models and show that the entropy rate concentrates on the same logarithmic time scale as the speciation symmetry-breaking instability previously identified in variance-preserving diffusion. We validate our method on EDM2-XS and Stable Diffusion 1.5, where class-conditional entropy consistently isolates the noise regimes critical for semantic structure formation. Finally, we use our framework to quantify how guidance redistributes semantic information over time. Together, these results connect information-theoretic and statistical-physics perspectives on diffusion and provide a principled basis for time-localized control.
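As an illustration of the Gaussian-mixture analysis, the following is a minimal sketch, not the paper's implementation. It assumes a toy model that the abstract does not specify: two equally likely classes centered at +m and -m with isotropic variance `sigma_sq`, and a variance-preserving forward process x_t = e^{-t} x_0 + sqrt(1 - e^{-2t}) ε. In this toy model the class posterior depends on x_t only through a one-dimensional projection, so H(C | x_t) can be estimated by Monte Carlo from the signal-to-noise ratio alone. The function names `class_entropy`, `entropy_profile`, and `transition_time` are hypothetical.

```python
import numpy as np


def binary_entropy(p):
    """Entropy in nats of a Bernoulli(p) variable, clipped for stability."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def class_entropy(t, m_norm_sq, sigma_sq=1.0, n_samples=50_000, seed=0):
    """Monte Carlo estimate of H(C | x_t) in nats.

    Assumed toy model: x_0 ~ 0.5 N(+m, sigma_sq I) + 0.5 N(-m, sigma_sq I),
    x_t = exp(-t) x_0 + sqrt(1 - exp(-2t)) * noise (variance preserving).
    """
    rng = np.random.default_rng(seed)  # fixed seed: common random numbers across t
    alpha = np.exp(-t)
    var = alpha**2 * sigma_sq + 1.0 - alpha**2   # per-coordinate variance of x_t given C
    snr = alpha**2 * m_norm_sq / var             # class signal-to-noise ratio at time t
    # Given C = +1, the posterior log-odds are Gaussian with mean 2*snr and
    # variance 4*snr; by symmetry the C = -1 branch gives the same entropy.
    logit = 2.0 * snr + 2.0 * np.sqrt(snr) * rng.standard_normal(n_samples)
    p = 0.5 * (1.0 + np.tanh(0.5 * logit))       # numerically stable sigmoid
    return float(binary_entropy(p).mean())


def entropy_profile(dim, times):
    """H(C | x_t) along `times`, assuming |m|^2 = dim (unit-size entries of m)."""
    return np.array([class_entropy(t, m_norm_sq=dim) for t in times])


def transition_time(dim, times):
    """Time at which the entropy changes fastest along the forward process."""
    rate = np.gradient(entropy_profile(dim, times), times)
    return float(times[np.argmax(rate)])


if __name__ == "__main__":
    times = np.linspace(0.0, 8.0, 401)
    for dim in (100, 10_000):
        print(dim, transition_time(dim, times), 0.5 * np.log(dim))
```

With `sigma_sq = 1` the signal-to-noise ratio reduces to |m|^2 e^{-2t}, so in this sketch the entropy profile is a fixed curve shifted in time by (1/2) log |m|^2. Larger dimensions therefore move the transition window later in forward time on a logarithmic scale, which is the qualitative behavior the abstract describes for the entropy rate.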