Reliable surface completion from sparse point clouds underpins applications ranging from content creation to robotics. While 3D diffusion transformers attain state-of-the-art results on this task, we find that they exhibit a catastrophic failure mode: arbitrarily small on-surface perturbations to the input point cloud can fracture the output into multiple disconnected pieces, a phenomenon we call Meltdown. Using activation patching from mechanistic interpretability, we localize Meltdown to a single early denoising cross-attention activation. We find that the singular-value spectrum of this activation provides a scalar proxy: its spectral entropy rises when fragmentation occurs and returns to baseline when the activation is patched. Interpreting this proxy through diffusion dynamics, we show that it tracks a symmetry-breaking bifurcation of the reverse process. Guided by this insight, we introduce PowerRemap, a test-time control that stabilizes sparse point-cloud conditioning. We demonstrate that Meltdown persists across state-of-the-art architectures (WaLa, Make-a-Shape), datasets (GSO, SimJEB), and denoising strategies (DDPM, DDIM), and that PowerRemap effectively counters this failure, with stabilization rates of up to 98.3%. Overall, this work is a case study in how diffusion model behavior can be understood and guided through mechanistic analysis, linking a circuit-level cross-attention mechanism to diffusion-dynamics accounts of trajectory bifurcations.
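As an illustration of the scalar proxy described above, the following is a minimal sketch of a spectral entropy computed from the singular values of an activation matrix. The abstract does not state the exact definition, so this assumes a common one: normalize the singular values to sum to one and take their Shannon entropy. The function name `spectral_entropy`, the `eps` parameter, and the matrix shapes are hypothetical, and the random matrices stand in for real cross-attention activations.

```python
import numpy as np


def spectral_entropy(activation, eps=1e-12):
    """Shannon entropy of the normalized singular-value spectrum.

    Assumption: p_i = s_i / sum_j s_j and H = -sum_i p_i * log(p_i).
    The paper may use a different normalization (e.g. squared values).
    """
    # Singular values only; the singular vectors are not needed here.
    s = np.linalg.svd(np.asarray(activation, dtype=float), compute_uv=False)
    # Normalize the spectrum into a probability distribution.
    p = s / (s.sum() + eps)
    # Drop numerically zero entries so log() stays finite.
    p = p[p > eps]
    return float(-(p * np.log(p)).sum())


# Hypothetical stand-ins for a (tokens x channels) activation.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((64, 1)) @ rng.standard_normal((1, 32))
full_rank = rng.standard_normal((64, 32))

h_low = spectral_entropy(low_rank)    # spectrum concentrated in one direction
h_full = spectral_entropy(full_rank)  # spectrum spread over many directions
print(h_low, h_full)
```

Under this definition, a spectrum concentrated in a few directions gives low entropy and a flatter spectrum gives higher entropy, so a rise in the value indicates energy spreading across more singular directions.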