MMBench: Benchmarking End-to-End Multi-modal DNNs and Understanding Their Hardware-Software Implications

The explosive growth of various types of big data and advances in AI technologies have catalyzed a new type of workload called multi-modal DNNs. Multi-modal DNNs can interpret and reason about information from multiple modalities, which makes them more applicable to real-world AI scenarios. In recent research, multi-modal DNNs have outperformed the best uni-modal DNNs in a wide range of distributed computing applications, from traditional multimedia systems to emerging autonomous edge systems. However, despite their importance and superiority, very little research attention has been devoted to understanding the characteristics of multi-modal DNNs and their implications for current computing software/hardware platforms. Existing benchmarks either target uni-modal DNNs or focus only on the algorithmic characteristics of multi-modal DNNs. Representative benchmark suites that provide comprehensive system-level and architecture-level analysis of multi-modal networks are lacking. To advance the understanding of these multi-modal DNN workloads and facilitate related research, we present MMBench, an open-source, end-to-end benchmark suite consisting of a set of real-world multi-modal DNN workloads with relevant performance metrics for evaluation. We then use MMBench to conduct an in-depth analysis of the characteristics of multi-modal DNNs. We demonstrate their unique characteristics of clear multi-stage execution, frequent synchronization, and high heterogeneity, which distinguish them from conventional uni-modal DNNs. Finally, we conduct a case study and extend our benchmark to edge devices. We hope that our work can provide insights for future software/hardware design and optimization to underpin multi-modal DNNs on both cloud and edge computing platforms.
