In the rapidly advancing field of multi-modal machine learning (MMML), the convergence of multiple data modalities has the potential to reshape various applications. This paper presents a comprehensive overview of the current state, advancements, and challenges of MMML in engineering design. The review begins with an in-depth look at five fundamental concepts of MMML: multi-modal information representation, fusion, alignment, translation, and co-learning. We then explore state-of-the-art applications of MMML, with particular emphasis on tasks pertinent to engineering design, such as cross-modal synthesis, multi-modal prediction, and cross-modal information retrieval. Through this overview, we highlight the challenges inherent in adopting MMML in engineering design and propose potential directions for future research. To advance MMML in engineering design, we advocate concentrated efforts to construct extensive multi-modal design datasets, develop effective data-driven MMML techniques tailored to design applications, and enhance the scalability and interpretability of MMML models. As the next generation of intelligent design tools, MMML models hold considerable promise for changing how products are designed.