The rise of mobile devices equipped with numerous sensors, such as LiDAR and cameras, has spurred the adoption of multi-modal deep intelligence for distributed sensing tasks such as smart cabins and driving assistance. However, the arrival times of mobile sensory data vary with modality size and network dynamics, which forces a trade-off: waiting for slower data increases latency, while proceeding without it degrades accuracy. The diversity and dynamic nature of mobile systems exacerbate this challenge. In response, we propose a shift to \textit{opportunistic} inference for asynchronous distributed multi-modal data, in which inference begins as soon as partial data arrives. Existing methods focus on optimizing modality consistency and complementarity, together known as modal affinity, but they lack a \textit{computational} approach to controlling this affinity in open-world mobile environments. The proposed AdaFlow pioneers the formulation of structured cross-modality affinity in mobile contexts using a hierarchical analysis-based normalized matrix. This formulation accommodates the diversity and dynamics of modalities and generalizes across different types and numbers of inputs. Using an affinity attention-based conditional GAN (ACGAN), AdaFlow imputes missing data flexibly, adapting to various modalities and downstream tasks without retraining. Experiments show that AdaFlow reduces inference latency by up to 79.9\% and improves accuracy by up to 61.9\%, outperforming existing approaches.