M3BAT: Unsupervised Domain Adaptation for Multimodal Mobile Sensing with Multi-Branch Adversarial Training

From arXiv. Accepted at the Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT); the paper will be presented at ACM UbiComp 2024.

Multimodal mobile sensing has long been used for inferences about health and well-being, behavior, and context. However, a significant challenge hindering the widespread deployment of such models in real-world scenarios is distribution shift: the distribution of data in the training set differs from the distribution of data in the deployment environment. Although distribution shift has been explored extensively in computer vision and natural language processing, and prior research in mobile sensing briefly addresses it, existing work focuses primarily on models that handle a single modality of data, such as audio or accelerometer readings. Consequently, there is little research on unsupervised domain adaptation for multimodal sensor data. To address this gap, we conducted extensive experiments with domain adversarial neural networks (DANN), showing that they can effectively handle distribution shifts in multimodal sensor data. Moreover, we propose a novel improvement over DANN, called M3BAT (unsupervised domain adaptation for multimodal mobile sensing with multi-branch adversarial training), which uses multiple branches to account for the multimodality of sensor data during domain adaptation. Through extensive experiments on two multimodal mobile sensing datasets, three inference tasks, and 14 source-target domain pairs, covering both regression and classification, we demonstrate that our approach performs effectively on unseen domains. Compared to directly deploying a model trained in the source domain to the target domain, the model improves AUC (area under the receiver operating characteristic curve) by up to 12% on classification tasks and reduces MAE (mean absolute error) by up to 0.13 on regression tasks.
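
To make the adversarial idea in the abstract more concrete, here is a minimal pure-Python sketch of the gradient reversal trick that DANN-style training relies on: the domain classifier is updated to reduce its domain-prediction loss, while the feature extractor receives the same gradient with its sign flipped. The abstract does not describe M3BAT's architecture, so the per-modality layout below (one toy extractor weight and one toy domain classifier weight per modality) is only an assumption for illustration. The names `adversarial_step`, `grad_reverse`, `lam`, and `lr`, and the sample values, are all hypothetical. The task loss (classification or regression) that a real model would also optimize is left out for brevity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss_and_grads(feature, v, domain_label):
    """Binary cross-entropy of a one-weight domain classifier p = sigmoid(v * feature).

    Returns (loss, d loss / d feature, d loss / d v).
    """
    p = sigmoid(v * feature)
    loss = -(domain_label * math.log(p) + (1 - domain_label) * math.log(1 - p))
    dz = p - domain_label  # derivative of the loss w.r.t. the logit v * feature
    return loss, dz * v, dz * feature


def grad_reverse(grad, lam):
    """Backward pass of a gradient reversal layer: flip the sign, scale by lam."""
    return -lam * grad


def adversarial_step(sample, domain_label, extractors, classifiers, lam=1.0, lr=0.1):
    """One adversarial update per modality branch (assumed layout, not the paper's).

    sample:      {modality: scalar input}
    extractors:  {modality: extractor weight w}, updated in place
    classifiers: {modality: domain classifier weight v}, updated in place
    """
    losses = {}
    for modality, x in sample.items():
        w, v = extractors[modality], classifiers[modality]
        feature = w * x  # toy one-weight feature extractor
        loss, d_feature, d_v = domain_loss_and_grads(feature, v, domain_label)
        # The domain classifier descends its own loss.
        classifiers[modality] = v - lr * d_v
        # The extractor sees the reversed gradient, so the same descent rule
        # pushes it toward features the domain classifier finds harder.
        d_w = grad_reverse(d_feature, lam) * x
        extractors[modality] = w - lr * d_w
        losses[modality] = loss
    return losses


# Hypothetical two-modality sample from the target domain (label 1).
extractors = {"accelerometer": 0.5, "audio": -0.3}
classifiers = {"accelerometer": 1.0, "audio": 1.0}
sample = {"accelerometer": 2.0, "audio": 1.0}
print(adversarial_step(sample, 1, extractors, classifiers))
print(extractors, classifiers)
```

In an autograd framework the reversal would normally be a layer that acts as the identity in the forward pass and negates the gradient in the backward pass. The manual gradients here only keep the sketch dependency-free.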
