Distributional shifts between training and inference-time data remain a central challenge in machine learning and often lead to poor performance. This has motivated the study of principled approaches to domain alignment, such as optimal-transport-based unsupervised domain adaptation, which relies on approximating the Monge map using transport plans. This approximation is sensitive to the regularization strategy and hyperparameters of the transport problem and may yield biased domain alignment. In this work, we propose to interpret smoothed transport plans as adjacency matrices of bipartite graphs connecting the source and target domains, and we derive domain-invariant sample representations through spectral embedding. We evaluate our approach on acoustic adaptation benchmarks for music genre recognition and music-speech discrimination, as well as on electrical cable defect detection and classification tasks using time-domain reflection in different diagnosis settings, achieving strong overall performance.
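The idea of reading a smoothed transport plan as a bipartite adjacency matrix and embedding both domains spectrally can be illustrated with a minimal sketch. The abstract does not give the exact construction, so everything below is an assumption made for illustration: entropic (Sinkhorn) regularization as the smoothing, uniform sample weights, a squared Euclidean cost, and a degree-normalized SVD (in the style of bipartite spectral co-clustering) as the spectral embedding. The function names `sinkhorn_plan` and `bipartite_spectral_embedding` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(Xs, Xt, reg=0.1, n_iter=200):
    """Entropy-regularized transport plan (assumed smoothing; uniform weights)."""
    # Squared Euclidean cost, rescaled to [0, 1] for numerical stability.
    C = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=-1)
    C = C / C.max()
    K = np.exp(-C / reg)
    a = np.full(Xs.shape[0], 1.0 / Xs.shape[0])  # source marginal
    b = np.full(Xt.shape[0], 1.0 / Xt.shape[0])  # target marginal
    u = np.ones_like(a)
    for _ in range(n_iter):
        # Alternate scaling updates to match the column, then row, marginals.
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]


def bipartite_spectral_embedding(P, n_components=2):
    """Embed source rows and target columns of P in a shared space.

    P is treated as the biadjacency matrix of a weighted bipartite graph.
    This is one possible spectral embedding, not necessarily the paper's.
    """
    ds = P.sum(axis=1)  # source node degrees
    dt = P.sum(axis=0)  # target node degrees
    Pn = P / np.sqrt(ds)[:, None] / np.sqrt(dt)[None, :]
    U, _, Vt = np.linalg.svd(Pn, full_matrices=False)
    # Skip the leading singular pair, which only encodes the node degrees.
    Zs = U[:, 1:n_components + 1] / np.sqrt(ds)[:, None]
    Zt = Vt[1:n_components + 1].T / np.sqrt(dt)[:, None]
    return Zs, Zt


# Toy usage: a target domain that is a shifted copy of the source distribution.
rng = np.random.default_rng(0)
Xs = rng.normal(size=(40, 5))
Xt = rng.normal(size=(30, 5)) + 0.5
P = sinkhorn_plan(Xs, Xt)
Zs, Zt = bipartite_spectral_embedding(P, n_components=2)
```

In this sketch, `Zs` and `Zt` live in the same low-dimensional space, so a classifier fitted on `Zs` with source labels could be applied to `Zt`. The value of `reg` controls how smooth the plan is, and therefore how densely the bipartite graph is connected.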