Unsupervised domain adaptation (UDA) provides a strategy for improving machine learning performance in data-rich (target) domains where ground truth labels are inaccessible but can be found in related (source) domains. In cases where meta-domain information such as label distributions is available, weak supervision can further boost performance. We propose a novel framework, CALDA, to tackle these two problems. CALDA combines the principles of contrastive learning and adversarial learning to robustly support multi-source UDA (MS-UDA) for time series data. Similar to prior methods, CALDA utilizes adversarial learning to align source and target feature representations. Unlike prior approaches, CALDA additionally leverages label information across source domains. Through contrastive learning, CALDA reshapes the feature space by pulling examples with the same label close to each other while pushing apart examples with different labels. Unlike prior contrastive adaptation methods, CALDA requires neither data augmentation nor pseudo labeling, both of which may be more challenging for time series. We empirically validate our proposed approach. Based on results from human activity recognition, electromyography, and synthetic datasets, we find that utilizing cross-source information improves performance over prior time series and contrastive methods. Weak supervision further improves performance, even in the presence of noise, allowing CALDA to offer generalizable strategies for MS-UDA. Code is available at: https://github.com/floft/calda
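To make the contrastive idea concrete, the following is a minimal sketch of a label-based (supervised, InfoNCE-style) contrastive loss computed over embeddings pooled from several source domains. It is an illustration under stated assumptions, not the CALDA implementation: the function name `cross_source_contrastive_loss`, the `temperature` parameter, and the choice to treat every other pooled example as a candidate positive or negative are all hypothetical simplifications, and the adversarial alignment and weak supervision components are omitted entirely. The linked repository is the authoritative reference.

```python
import numpy as np


def cross_source_contrastive_loss(z, labels, temperature=0.1):
    """Illustrative supervised contrastive loss (hypothetical helper).

    z:      (n, d) embeddings pooled from all source domains.
    labels: (n,) class labels; same-label examples act as positives,
            different-label examples as negatives, regardless of domain.
    """
    z = np.asarray(z, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)

    # Positives: same label, excluding each anchor itself.
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_counts = positives.sum(axis=1)
    has_pos = pos_counts > 0
    if not has_pos.any():
        return 0.0  # no anchor has a positive partner in this batch

    # Cosine similarity scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature

    # Numerically stable log-softmax over all other examples in the batch.
    sim_masked = np.where(not_self, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(
        np.exp(sim_masked - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = sim - log_denom

    # Average negative log-probability of the positives for each anchor.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_log_prob[has_pos] / pos_counts[has_pos]
    return float(per_anchor.mean())


# Toy embeddings: the first two points are close, as are the last two.
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_aligned = cross_source_contrastive_loss(embeddings, [0, 0, 1, 1])
loss_mixed = cross_source_contrastive_loss(embeddings, [0, 1, 0, 1])
```

In this toy usage, the loss is lower when nearby embeddings share a label (`loss_aligned`) than when labels cut across the clusters (`loss_mixed`), which is the behavior minimizing such a loss encourages. Because positives are defined by labels alone, the sketch needs neither augmented views nor pseudo labels for the target domain.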