MMBench: Benchmarking End-to-End Multi-modal DNNs and Understanding Their Hardware-Software Implications

The explosive growth of various types of big data and advances in AI technologies have catalyzed a new type of workload called multi-modal DNNs. Multi-modal DNNs are capable of interpreting and reasoning about information from multiple modalities, making them more applicable to real-world AI scenarios. In recent research, multi-modal DNNs have outperformed the best uni-modal DNN in a wide range of distributed computing applications, from traditional multimedia systems to emerging autonomous edge systems. However, despite their importance and superiority, very limited research attention has been devoted to understanding the characteristics of multi-modal DNNs and their implications for current computing software/hardware platforms. Existing benchmarks either target uni-modal DNNs or focus only on the algorithmic characteristics of multi-modal DNNs. Representative benchmark suites that provide comprehensive system- and architecture-level analysis of multi-modal networks are lacking. To advance the understanding of these multi-modal DNN workloads and facilitate related research, we present MMBench, an open-source, end-to-end benchmark suite consisting of a set of real-world multi-modal DNN workloads with relevant performance metrics for evaluation. We then use MMBench to conduct an in-depth analysis of the characteristics of multi-modal DNNs. We demonstrate their unique characteristics of clear multi-stage execution, frequent synchronization, and high heterogeneity, which distinguish them from conventional uni-modal DNNs. Finally, we conduct a case study and extend our benchmark to edge devices. We hope that our work can provide insights for future software/hardware design and optimization to underpin multi-modal DNNs on both cloud and edge computing platforms.
