The primary value of infrared and visible image fusion technology lies in applying the fusion results to downstream tasks. However, when existing methods address multiple downstream tasks simultaneously, they face challenges such as increased training complexity and significantly degraded per-task performance. To address this, we propose Task-Oriented Adaptive Regulation (T-OAR), an adaptive mechanism specifically designed for multi-task environments. Additionally, we introduce the Task-related Dynamic Prompt Injection (T-DPI) module, which generates task-specific dynamic prompts from user-provided text instructions and integrates them into target representations. This guides the feature extraction module to produce representations more closely aligned with the specific requirements of downstream tasks. By incorporating the T-DPI module into the T-OAR framework, our approach generates fusion images tailored to task-specific requirements without separate training or task-specific weights. This not only reduces computational costs but also enhances adaptability and performance across multiple tasks. Experimental results show that our method excels in object detection, semantic segmentation, and salient object detection, demonstrating strong adaptability, flexibility, and task specificity. This provides an efficient solution for image fusion in multi-task environments, highlighting the technology's potential across diverse applications.
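One way to picture the T-DPI mechanism described above — turning a text instruction into dynamic prompts and injecting them into target representations — is as cross-attention from visual feature tokens to text-derived prompt vectors. The sketch below is purely illustrative under assumed names and dimensions; it is not the authors' implementation, and `inject_prompts`, `w_prompt`, and all shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def inject_prompts(image_feats, text_embed, w_q, w_k, w_v, w_prompt):
    """Cross-attend image features (queries) to task prompts (keys/values).

    image_feats: (n_tokens, d)   visual feature tokens
    text_embed:  (n_words, d_t)  embedding of the user's text instruction
    w_prompt:    (d_t, d)        maps text embeddings to prompt vectors
    """
    prompts = text_embed @ w_prompt          # (n_words, d) dynamic prompts
    q = image_feats @ w_q                    # queries from image tokens
    k = prompts @ w_k                        # keys from prompts
    v = prompts @ w_v                        # values from prompts
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    # Residual injection: a task-conditioned update of the visual features,
    # so downstream feature extraction sees instruction-aligned tokens.
    return image_feats + attn @ v

rng = np.random.default_rng(0)
d, d_t = 16, 8
feats = rng.standard_normal((10, d))         # 10 visual tokens
text = rng.standard_normal((4, d_t))         # 4-word instruction embedding
out = inject_prompts(feats, text,
                     rng.standard_normal((d, d)),
                     rng.standard_normal((d, d)),
                     rng.standard_normal((d, d)),
                     rng.standard_normal((d_t, d)))
print(out.shape)  # same shape as the input tokens, now task-conditioned
```

Because the prompts are computed from the instruction at inference time, the same weights can serve different downstream tasks — consistent with the paper's goal of avoiding separate training or task-specific weights.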