We introduce DiffBMP, a scalable and efficient differentiable rendering engine for collections of bitmap images. Traditional differentiable renderers are constrained to vector graphics, even though most images in the world are bitmaps; our work addresses this limitation. Our core contribution is a highly parallelized rendering pipeline with a custom CUDA implementation for computing gradients. The system can, for example, optimize the position, rotation, scale, color, and opacity of thousands of bitmap primitives in under 1 min on a consumer GPU. We employ and validate several techniques that facilitate the optimization: soft rasterization via Gaussian blur, structure-aware initialization, a noisy canvas, and specialized losses and heuristics for videos or spatially constrained images. We demonstrate that DiffBMP is not an isolated tool but a practical one designed to integrate into creative workflows: it supports exporting compositions to a native, layered file format, and the entire framework is publicly accessible as an easy-to-hack Python package.
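To illustrate why soft rasterization via Gaussian blur helps when optimizing a primitive's position, the following is a minimal one-dimensional sketch in pure Python. It is not DiffBMP's implementation: the function names (`soft_coverage`, `grad_center`, `optimize_center`), the box-shaped primitive, the squared-error loss, and the finite-difference gradient (standing in for the custom CUDA gradients) are all assumptions made for illustration. A hard-edged box yields a zero gradient with respect to its center when it does not overlap the target, whereas the Gaussian-blurred box yields a gradient that points toward the target.

```python
import math


def hard_coverage(x, center, width):
    """Hard-edged 1D box: a pixel is either fully covered or not at all."""
    return 1.0 if abs(x - center) <= width / 2 else 0.0


def soft_coverage(x, center, width, sigma):
    """The same box convolved with a Gaussian of std `sigma` (closed form via erf)."""
    s = sigma * math.sqrt(2.0)
    left = math.erf((x - center + width / 2) / s)
    right = math.erf((x - center - width / 2) / s)
    return 0.5 * (left - right)


def render(center, width, n_pixels, sigma=None):
    """Render the box onto a 1D canvas; sigma=None selects hard rasterization."""
    if sigma is None:
        return [hard_coverage(x, center, width) for x in range(n_pixels)]
    return [soft_coverage(x, center, width, sigma) for x in range(n_pixels)]


def loss(center, target_center, width, n_pixels, sigma=None):
    """Squared error between the box at `center` and the same box at `target_center`."""
    image = render(center, width, n_pixels, sigma)
    target = render(target_center, width, n_pixels, sigma)
    return sum((a - b) ** 2 for a, b in zip(image, target))


def grad_center(center, target_center, width, n_pixels, sigma=None, eps=1e-3):
    """Central finite-difference estimate of d(loss)/d(center)."""
    plus = loss(center + eps, target_center, width, n_pixels, sigma)
    minus = loss(center - eps, target_center, width, n_pixels, sigma)
    return (plus - minus) / (2 * eps)


def optimize_center(center, target_center, width, n_pixels, sigma,
                    lr=2.0, steps=200):
    """Plain gradient descent on the box center using the soft renderer."""
    for _ in range(steps):
        center -= lr * grad_center(center, target_center, width, n_pixels, sigma)
    return center


if __name__ == "__main__":
    start, target, width, n_pixels = 12.3, 20.3, 6.0, 48
    print("hard gradient:", grad_center(start, target, width, n_pixels, None))
    print("soft gradient:", grad_center(start, target, width, n_pixels, 3.0))
    print("optimized center:", optimize_center(start, target, width, n_pixels, 3.0))
```

In this toy setting the blur width `sigma` controls how far away the primitive can "see" the target: a larger value extends the range over which the gradient is informative, at the cost of a less sharp rendering.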