Accurately capturing dynamic scenes with wide-ranging motion and light intensity is crucial for many vision applications. However, acquiring high-speed high dynamic range (HDR) video is challenging because the camera's frame rate restricts its dynamic range. Existing methods sacrifice speed to acquire multi-exposure frames, yet motion misalignment between these frames can still cause artifacts in HDR fusion. Instead of frame-based exposures, we sample the video using individual pixels at varying exposures and phase offsets. Implemented on a monochrome pixel-wise programmable image sensor, our sampling pattern simultaneously captures fast motion at a high dynamic range. We then transform the pixel-wise outputs into an HDR video using end-to-end learned weights from deep neural networks, achieving high spatiotemporal resolution with minimal motion blur. We demonstrate aliasing-free HDR video acquisition at 1000 FPS, resolving fast motion under low-light conditions and against bright backgrounds, both of which are challenging for conventional cameras. By combining the versatility of pixel-wise sampling patterns with the strength of deep neural networks at decoding complex scenes, our method greatly enhances a vision system's adaptability and performance in dynamic conditions.
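To make the idea of sampling with individual pixels at varying exposures and phase offsets more concrete, the following is a minimal NumPy sketch of such a pattern applied to a simulated high-speed scene. It is not the authors' implementation: the function name `pixelwise_sample`, the 2x2 tile, the exposure lengths, the phase offsets, and the `full_well` saturation level are all illustrative assumptions, and the learned neural decoding step is not modeled.

```python
import numpy as np

def pixelwise_sample(scene, exposures, offsets, full_well=1.0):
    """Simulate pixel-wise exposure sampling (hypothetical model).

    scene     : (T, H, W) array, light collected per pixel per time slot.
    exposures : (th, tw) integer tile, exposure length in time slots.
    offsets   : (th, tw) integer tile, phase offset of the first exposure.
    full_well : assumed saturation level of a pixel.

    Returns a (T, H, W) array holding each pixel's readout at the time
    slot where its exposure ends, and NaN everywhere else.
    """
    T, H, W = scene.shape
    th, tw = exposures.shape
    readout = np.full((T, H, W), np.nan)
    for i in range(th):
        for j in range(tw):
            e, p = int(exposures[i, j]), int(offsets[i, j])
            # All pixels sharing this position within the repeating tile.
            sub = scene[:, i::th, j::tw]
            # Back-to-back exposures of length e, starting at offset p.
            for start in range(p, T - e + 1, e):
                charge = sub[start:start + e].sum(axis=0)
                # Clip to model saturation of long exposures.
                readout[start + e - 1, i::th, j::tw] = np.minimum(charge, full_well)
    return readout

# Illustrative values only: a constant scene and an assumed 2x2 tile.
scene = np.full((8, 4, 4), 0.1)
exposures = np.array([[1, 2], [4, 8]])
offsets = np.array([[0, 1], [2, 0]])
readout = pixelwise_sample(scene, exposures, offsets, full_well=0.5)
```

In this toy setup, short-exposure pixels report often and never saturate, while long-exposure pixels collect more light but report rarely and may clip. Staggering the offsets means some pixel in each tile finishes an exposure in most time slots, which is the kind of interleaved, irregular measurement that a decoder would then have to fuse into a single HDR video.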