Objective Function Mismatch (OFM) occurs when optimizing one objective has a negative impact on the optimization of another objective. In this work we study OFM in deep clustering, and find that the popular autoencoder-based approach to deep clustering can both reduce clustering performance and introduce a significant amount of OFM between the reconstruction and clustering objectives. To reduce the mismatch while maintaining the structure-preserving property of an auxiliary objective, we propose a set of new auxiliary objectives for deep clustering, referred to as the Unsupervised Companion Objectives (UCOs). The UCOs rely on a kernel function to formulate a clustering objective on intermediate representations in the network. In general, intermediate representations can include other dimensions, for instance spatial or temporal, in addition to the feature dimension. We therefore argue that the naïve approach of vectorizing the representations and applying a vector kernel is suboptimal, as it ignores the information contained in the other dimensions. To address this drawback, we equip the UCOs with structure-exploiting tensor kernels, designed for tensors of arbitrary rank. The UCOs can thus be adapted to a broad class of network architectures. We also propose a novel, regression-based measure of OFM, which allows us to accurately quantify the amount of OFM observed during training. Our experiments show that the OFM between the UCOs and the main clustering objective is lower than in a similar autoencoder-based model. Further, we illustrate that the UCOs improve the clustering performance of the model, in contrast to the autoencoder-based approach. The code for our experiments is available at https://github.com/danieltrosten/tk-uco.
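To make the argument about vectorization concrete, the following is a minimal sketch of the naïve baseline only: flatten each intermediate representation and apply a Gaussian vector kernel. It is not the tensor kernel proposed in this work; the function names, the choice of a Gaussian kernel, the bandwidth `sigma`, and the random stand-in feature maps are all assumptions made for illustration. The sketch shows one sense in which the vectorized kernel ignores the other dimensions: shuffling the spatial positions identically for every sample leaves the kernel matrix unchanged.

```python
import numpy as np


def gaussian_kernel_matrix(x, sigma=1.0):
    """Gaussian kernel matrix for the rows of a 2-D array (n_samples, n_features)."""
    sq_norms = np.sum(x ** 2, axis=1)
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def naive_vectorized_kernel(feature_maps, sigma=1.0):
    """Hypothetical naive baseline: flatten every non-batch dimension, then apply a vector kernel."""
    n_samples = feature_maps.shape[0]
    return gaussian_kernel_matrix(feature_maps.reshape(n_samples, -1), sigma)


rng = np.random.default_rng(0)

# Random stand-in for intermediate representations: (samples, channels, height, width)
maps = rng.normal(size=(5, 4, 6, 6))

# Shuffle the 36 spatial positions in the same way for every sample and channel
perm = rng.permutation(6 * 6)
shuffled = maps.reshape(5, 4, -1)[:, :, perm].reshape(5, 4, 6, 6)

k_original = naive_vectorized_kernel(maps, sigma=5.0)
k_shuffled = naive_vectorized_kernel(shuffled, sigma=5.0)

# The two kernel matrices agree up to floating-point error, so the spatial
# arrangement of the features plays no role in the vectorized kernel.
print(np.allclose(k_original, k_shuffled))
```

Because the squared Euclidean distance between flattened vectors is a sum over coordinates, any fixed reordering of those coordinates leaves it unchanged, which is why the spatial layout has no influence on this baseline.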