An Empirical Study of World Model Quantization

World models learn an internal representation of environment dynamics, enabling agents to simulate and reason about future states within a compact latent space for tasks such as planning, prediction, and inference. However, running world models incurs a heavy computational cost and a large memory footprint, making model quantization essential for efficient deployment. To date, the effects of post-training quantization (PTQ) on world models remain largely unexamined. In this work, we present a systematic empirical study of world model quantization using DINO-WM as a representative case, evaluating diverse PTQ methods under both weight-only and joint weight-activation settings. We conduct extensive experiments on different visual planning tasks across a wide range of bit-widths, quantization granularities, and planning horizons of up to 50 iterations. Our results show that quantization effects in world models extend beyond the standard trade-off between accuracy and bit-width: group-wise weight quantization can stabilize low-bit rollouts, finer activation quantization granularity yields inconsistent benefits, and quantization sensitivity is highly asymmetric between the encoder and predictor modules. Moreover, aggressive low-bit quantization significantly degrades the alignment between the planning objective and task success, leading to failures that cannot be remedied by additional optimization. These findings reveal distinct quantization-induced failure modes in world model-based planning and provide practical guidance for deploying quantized world models under strict computational constraints. The code will be available at https://github.com/huawei-noah/noah-research/tree/master/QuantWM.
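
To make the notion of quantization granularity concrete, the following is a minimal sketch that contrasts per-tensor and group-wise weight quantization. It assumes plain symmetric round-to-nearest quantization, which is a simple stand-in and not necessarily one of the PTQ methods evaluated in this study. The function name `quantize_rtn`, the group size of 32, and the synthetic weight row with a single outlier are all hypothetical choices made for illustration.

```python
import random


def quantize_rtn(weights, bits, group_size=None):
    """Symmetric round-to-nearest fake quantization of a flat weight list.

    group_size=None uses a single scale for the whole list (per-tensor).
    Otherwise each consecutive chunk of group_size weights gets its own scale.
    This is an illustrative baseline, not the method of any specific paper.
    """
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for signed 4-bit
    size = group_size or len(weights)
    out = []
    for start in range(0, len(weights), size):
        group = weights[start:start + size]
        # Scale maps the largest magnitude in the group to qmax;
        # fall back to 1.0 when the group is all zeros.
        scale = max(abs(w) for w in group) / qmax or 1.0
        for w in group:
            q = max(-qmax, min(qmax, round(w / scale)))
            out.append(q * scale)  # dequantize back to floating point
    return out


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Hypothetical weight row: mostly small values plus one large outlier.
weights = [random.gauss(0.0, 0.02) for _ in range(256)]
weights[10] = 1.0

per_tensor_err = mse(weights, quantize_rtn(weights, bits=4))
group_err = mse(weights, quantize_rtn(weights, bits=4, group_size=32))
print(f"per-tensor MSE: {per_tensor_err:.2e}")
print(f"group-wise MSE: {group_err:.2e}")
```

In this toy setting the single outlier inflates the per-tensor scale, so most small weights collapse to zero, whereas group-wise scales confine the damage to the one group that contains the outlier. This is one plausible intuition for why finer weight granularity can help at low bit-widths; it says nothing by itself about rollout stability over long planning horizons.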
