Over the last decade, successes in deep clustering have mostly relied on mutual information (MI) as an unsupervised objective for training neural networks with ever more regularisation. While the quality of these regularisations has been widely discussed as a source of improvement, little attention has been paid to the relevance of MI as a clustering objective. In this paper, we first highlight how maximising MI does not lead to satisfying clusters. We identify the Kullback-Leibler divergence as the main reason for this behaviour. Hence, we generalise the mutual information by changing its core distance, introducing the Generalised Mutual Information (GEMINI): a set of metrics for unsupervised neural network training. Unlike MI, some GEMINIs do not require regularisations when training, as they are geometry-aware thanks to distances or kernels in the data space. Finally, we highlight that GEMINIs can automatically select a relevant number of clusters, a property that has been little studied in the deep discriminative clustering context, where the number of clusters is a priori unknown.
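To make the concern about MI as a clustering objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard plug-in estimate of MI for a discriminative model, I(X; Y) = H(Y) - H(Y|X), computed from soft cluster assignments p(y|x). The function name `mutual_information` and the four-point toy dataset are hypothetical and chosen only for illustration.

```python
import numpy as np

def mutual_information(probs, eps=1e-12):
    """Plug-in estimate of I(X; Y) in nats from assignments p(y|x_i), shape (n, K)."""
    probs = np.asarray(probs, dtype=float)
    marginal = probs.mean(axis=0)  # p(y), estimated as the average prediction
    h_marginal = -np.sum(marginal * np.log(marginal + eps))  # H(Y)
    h_conditional = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))  # H(Y|X)
    return h_marginal - h_conditional

# Toy data (hypothetical): two obvious groups, {0.0, 0.1} and {5.0, 5.1}.
# Note that x never enters the MI computation; only the assignments do.
x = np.array([0.0, 0.1, 5.0, 5.1])

# A partition that follows the geometry of x ...
sensible = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
# ... and one that ignores it by splitting each group in half.
arbitrary = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

mi_sensible = mutual_information(sensible)
mi_arbitrary = mutual_information(arbitrary)
print(mi_sensible, mi_arbitrary)
```

Under this estimate, both partitions are confident and balanced, so they reach the same value (about log 2 nats) even though only the first respects the distances between points. This is one way to read the claim that MI alone is not geometry-aware.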