The objective of multi-view unsupervised feature and instance co-selection is to simultaneously identify the most representative features and samples from multi-view unlabeled data, which aids in mitigating the curse of dimensionality and reducing instance size to improve the performance of downstream tasks. However, existing methods treat feature selection and instance selection as two separate processes, failing to leverage the potential interactions between the feature and instance spaces. Additionally, previous co-selection methods for multi-view data require concatenating different views, which overlooks the consistent information among them. In this paper, we propose a CONsistency and DivErsity learNing-based multi-view unsupervised Feature and Instance co-selection (CONDEN-FI) to address the above-mentioned issues. Specifically, CONDEN-FI reconstructs multi-view data from both the sample and feature spaces to learn representations that are consistent across views and specific to each view, enabling the simultaneous selection of the most important features and instances. Moreover, CONDEN-FI adaptively learns a view-consensus similarity graph to help select both dissimilar and similar samples in the reconstructed data space, leading to a more diverse selection of instances. An efficient algorithm is developed to solve the resultant optimization problem, and the comprehensive experimental results on real-world datasets demonstrate that CONDEN-FI is effective compared to state-of-the-art methods.