Multi-exposure High Dynamic Range (HDR) imaging is a challenging task in the presence of truncated textures and complex motion. Existing deep learning-based methods have achieved great success by either following an alignment-and-fusion pipeline or utilizing attention mechanisms. However, their large computation cost and inference latency hinder deployment on resource-limited devices. In this paper, to achieve better efficiency, a novel Selective Alignment Fusion Network (SAFNet) for HDR imaging is proposed. After extracting pyramid features, it jointly refines valuable-area masks and cross-exposure motion in the selected regions with shared decoders, and then fuses a high-quality HDR image in an explicit way. This approach lets the model focus on finding valuable regions while estimating their easily detectable and meaningful motion. For further detail enhancement, a lightweight refine module is introduced that benefits from the previously estimated optical flow, selection masks, and initial prediction. Moreover, to facilitate learning on samples with large motion, a new window-partition cropping method is presented for training. Experiments on public and newly developed challenging datasets show that the proposed SAFNet not only surpasses previous state-of-the-art competitors quantitatively and qualitatively, but also runs an order of magnitude faster. Code and dataset are available at https://github.com/ltkong218/SAFNet.
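As a rough illustration of the explicit fusion step described above, the sketch below blends a reference exposure with a flow-aligned non-reference exposure using a per-pixel selection mask. This is a minimal assumption-laden sketch, not the authors' implementation: the function name, the single-mask formulation, and the linear blending rule are all illustrative simplifications.

```python
import numpy as np

def fuse_exposures(ref, warped, mask):
    """Explicitly fuse two exposures with a per-pixel selection mask.

    All names are hypothetical. `ref` and `warped` are HxWx3 images
    (the warped image is the non-reference exposure after flow-based
    alignment); `mask` is an HxW selection map in [0, 1] indicating
    where the aligned non-reference exposure should dominate.
    """
    mask = np.clip(mask, 0.0, 1.0)[..., None]  # add channel axis for broadcasting
    # Linear per-pixel blend: mask -> 1 trusts the aligned exposure,
    # mask -> 0 falls back to the reference frame.
    return mask * warped + (1.0 - mask) * ref
```

In this simplified form, well-exposed, well-aligned regions receive high mask values, while saturated or occluded regions fall back to the reference frame; the paper's selection masks are learned jointly with the motion rather than hand-crafted.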