Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution

Learning from fully-unlabeled data is challenging in Multimedia Forensics problems such as Person Re-Identification and Text Authorship Attribution. Recent self-supervised learning methods have been shown to be effective on fully-unlabeled data when the underlying classes have significant semantic differences, because intra-class distances are then substantially lower than inter-class distances. This does not hold in forensic applications, where classes have similar semantics and the training and test sets have disjoint identities. General self-supervised learning methods may fail to learn discriminative features in this scenario, so more robust strategies are required. We propose a strategy that tackles Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse. Specifically, we introduce a novel ensemble-based clustering strategy whereby clusters derived from different configurations are combined to generate a better grouping of the data samples in a fully-unsupervised way. This strategy allows clusters with different densities and higher variability to emerge, reducing intra-class discrepancies without the burden of finding an optimal configuration per dataset. We also consider different Convolutional Neural Networks for feature extraction and the subsequent distance computations between samples. We refine these distances by incorporating context and grouping them to capture complementary information. Our method is robust across both tasks, which involve different data modalities, and outperforms state-of-the-art methods with a fully-unsupervised solution that requires no labeling or human intervention.
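To make the idea of combining clusters from different configurations concrete, the following is a minimal sketch and not the procedure used in this work. It assumes a simple distance-threshold clusterer as a stand-in for a density-based method, and it assumes a co-association (majority-vote) rule for merging the runs. The function names, the `eps_values` parameter, and the `min_agreement` threshold are all hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import dist


def _components(n, edges):
    """Label the connected components of a graph on n nodes (union-find)."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)

    # Relabel roots as 0..k-1 in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]


def threshold_clusters(points, eps):
    """Illustrative stand-in clusterer: link samples whose distance is <= eps."""
    pairs = combinations(range(len(points)), 2)
    edges = [(i, j) for i, j in pairs if dist(points[i], points[j]) <= eps]
    return _components(len(points), edges)


def ensemble_clusters(points, eps_values, min_agreement=0.5):
    """Assumed consensus rule: keep a link between two samples only if they
    share a cluster in at least `min_agreement` of the configurations."""
    runs = [threshold_clusters(points, eps) for eps in eps_values]
    edges = []
    for i, j in combinations(range(len(points)), 2):
        votes = sum(run[i] == run[j] for run in runs)
        if votes / len(runs) >= min_agreement:
            edges.append((i, j))
    return _components(len(points), edges)


# Toy data: a tight pair, a slightly farther neighbour, and a distant pair.
points = [(0, 0), (0, 1), (0, 2.2), (10, 10), (10, 11)]
labels = ensemble_clusters(points, eps_values=[0.5, 1.1, 1.5])
print(labels)
```

In this toy run no single threshold is tuned to the data: the strictest setting splits every sample apart and the loosest one absorbs the third point into the first group, while the consensus keeps only the links that most configurations agree on.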
