Large Chunk Test-Time Training (LaCT) has shown strong performance on long-context 3D reconstruction, but its fully plastic inference-time updates remain vulnerable to catastrophic forgetting and overfitting. As a result, LaCT is typically instantiated with a single large chunk spanning the full input sequence, which falls short of the broader goal of handling arbitrarily long sequences in a single pass. We propose Elastic Test-Time Training, inspired by elastic weight consolidation, which stabilizes LaCT fast-weight updates with a Fisher-weighted elastic prior around a maintained anchor state. The anchor evolves as an exponential moving average of past fast weights to balance stability and plasticity. Building on this updated architecture, we introduce Fast Spatial Memory (FSM), an efficient and scalable model for 4D reconstruction that learns spatiotemporal representations from long observation sequences and renders novel view-time combinations. We pre-train FSM on large-scale curated 3D/4D data to capture the dynamics and semantics of complex spatial environments. Extensive experiments show that FSM supports fast adaptation over long sequences and delivers high-quality 3D/4D reconstruction with smaller chunks while mitigating the camera-interpolation shortcut. Overall, we hope to advance LaCT beyond the bounded single-chunk setting toward robust multi-chunk adaptation, a necessary step for generalization to genuinely longer sequences, while substantially alleviating the activation-memory bottleneck.
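To make the elastic update concrete, the following is a minimal sketch of one per-chunk step, not the method as implemented in the paper. The abstract only states that fast-weight updates are regularized by a Fisher-weighted elastic prior around an anchor, and that the anchor is an exponential moving average of past fast weights. Everything else here is an assumption for illustration: the function name `elastic_step`, the diagonal Fisher proxy built from squared gradients, the plain gradient-descent update, and all hyperparameter names and values.

```python
import numpy as np


def elastic_step(w, anchor, fisher, grad,
                 lr=0.1, lam=1.0, ema=0.9, fisher_decay=0.9):
    """One hypothetical elastic fast-weight update for a single chunk.

    w       : current fast weights
    anchor  : anchor state the weights are elastically tied to
    fisher  : diagonal Fisher estimate (same shape as w)
    grad    : gradient of the chunk's test-time loss w.r.t. w
    """
    # Assumption: approximate the diagonal Fisher by a running average
    # of squared gradients, as is common in EWC-style methods.
    fisher = fisher_decay * fisher + (1.0 - fisher_decay) * grad ** 2

    # Gradient of the elastic penalty 0.5 * lam * sum(fisher * (w - anchor)**2).
    elastic_grad = lam * fisher * (w - anchor)

    # Assumption: plain gradient descent on task loss plus elastic penalty.
    w = w - lr * (grad + elastic_grad)

    # The anchor follows the fast weights as an exponential moving average.
    anchor = ema * anchor + (1.0 - ema) * w
    return w, anchor, fisher


if __name__ == "__main__":
    # Toy multi-chunk loop: each chunk pulls w toward a different target.
    rng = np.random.default_rng(0)
    w = np.zeros(4)
    anchor = w.copy()
    fisher = np.zeros(4)
    for _ in range(8):
        target = rng.normal(size=4)
        grad = w - target  # gradient of 0.5 * ||w - target||^2
        w, anchor, fisher = elastic_step(w, anchor, fisher, grad)
    print(w, anchor)
```

With `lam=0` the step reduces to an unregularized (fully plastic) update; larger `lam` pulls weights with high Fisher values back toward the anchor, which is the stability-plasticity trade-off the abstract refers to.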