In the rapidly evolving landscape of artificial intelligence, multimodal learning systems (MMLS) have gained traction for their ability to process and integrate information from diverse input modalities. Their expanding use in vital sectors such as healthcare has made safety assurance a critical concern. However, the absence of systematic research into their safety remains a significant barrier to progress in this field. To bridge this gap, we present the first taxonomy that systematically categorizes and assesses MMLS safety. This taxonomy is structured around four fundamental pillars critical to ensuring the safety of MMLS: robustness, alignment, monitoring, and controllability. Leveraging this taxonomy, we review existing methodologies, benchmarks, and the current state of research, while also pinpointing the principal limitations and gaps in knowledge. Finally, we discuss unique challenges in MMLS safety. By illuminating these challenges, we aim to pave the way for future research, proposing directions that could lead to significant advancements in safety protocols for MMLS.