Glass surfaces create complex interactions of reflected and transmitted light, making single-image reflection removal (SIRR) challenging. Existing datasets are limited either by the physical realism of their synthetic data or by the scale of their real captures. We introduce a synthetic dataset generation framework that path-traces 3D glass models over real background imagery to create physically accurate reflection scenarios with varied glass properties, camera settings, and post-processing effects. To leverage the capabilities of a Large Multimodal Model (LMM), we concatenate the image layers into a single composite input, apply joint captioning, and fine-tune the model with task-specific LoRA rather than full-parameter training. With this design, our approach improves reflection removal and separation performance over state-of-the-art methods.
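The two mechanisms named in the abstract, concatenating image layers into one composite input and fine-tuning through a low-rank (LoRA) update, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the layer order, the concatenation axis, or the LoRA rank and scaling, so the side-by-side layout, the names `make_composite` and `lora_linear`, and all sizes below are hypothetical. The low-rank update follows the common LoRA convention of a frozen weight plus `(alpha / r) * B @ A` with `B` initialised to zero.

```python
import numpy as np


def make_composite(layers, axis=1):
    """Concatenate same-sized H x W x C layers along one spatial axis.

    axis=1 places the layers side by side (an assumed layout).
    """
    shapes = {layer.shape for layer in layers}
    if len(shapes) != 1:
        raise ValueError(f"all layers must share one shape, got {shapes}")
    return np.concatenate(layers, axis=axis)


def lora_linear(x, W, A, B, alpha):
    """Linear layer with frozen weight W and a trainable low-rank update.

    W: (d_out, d_in) frozen; A: (r, d_in) and B: (d_out, r) trainable.
    """
    r = A.shape[0]
    W_eff = W + (alpha / r) * (B @ A)
    return x @ W_eff.T


rng = np.random.default_rng(0)

# Hypothetical layers: the captured image, the transmission layer,
# and the reflection layer, each H x W x 3.
h, w = 4, 6
mixed = rng.random((h, w, 3))
transmission = rng.random((h, w, 3))
reflection = rng.random((h, w, 3))
composite = make_composite([mixed, transmission, reflection])

# Toy LoRA layer: only A and B would receive gradients during fine-tuning.
d_in, d_out, r = 8, 5, 2
W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # small random initialisation
B = np.zeros((d_out, r))                   # zero init: update starts at zero
x = rng.standard_normal((3, d_in))
y = lora_linear(x, W, A, B, alpha=16)
```

In this sketch only `A` and `B` are trainable, which amounts to `r * (d_in + d_out)` parameters per layer instead of `d_in * d_out`; that reduction is the usual motivation for choosing LoRA over full-parameter training.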