With the development of deep learning techniques, supervised learning has achieved performance surpassing that of humans. Researchers have designed numerous models for different data modalities, achieving excellent results in supervised tasks. However, with the exponential growth of data in many fields, the recognition and classification of unlabeled data have become an active research topic. In this paper, we employ a reinforcement learning framework that simulates human cognitive processes to address novel class discovery in the open-set domain. We deploy a Member-to-Leader multi-agent framework to extract and fuse features from multi-modal information, aiming to acquire a more comprehensive understanding of the feature space. This design also facilitates the incorporation of self-supervised learning to enhance model training. We employ a clustering method whose constraints range from strict to loose, which generates dependable labels for a subset of the unlabeled data during training. This iterative process resembles human exploratory learning of unknown data. These mechanisms collectively update the network parameters based on rewards received from environmental feedback, which allows effective control over the extent of exploratory learning, ensuring accurate learning of unknown data categories. We demonstrate the performance of our approach in both the 3D and 2D domains on the OS-MN40, OS-MN40-Miss, and Cifar10 datasets. Our approach achieves competitive results.
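The strict-to-loose clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain k-means as the clustering method, uses distance to the nearest centroid as a stand-in for label reliability, and replaces the reward-driven control of exploration with a fixed, hypothetical schedule of `keep_fraction` values. The names `nearest`, `kmeans`, `pseudo_label`, and `keep_fraction` are introduced here for illustration only.

```python
import math
import random


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples (assumed stand-in)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)[0]].append(p)
        # Recompute each centroid as its group mean; keep the old one if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def pseudo_label(points, k, keep_fraction, seed=0):
    """Label only the keep_fraction of points nearest their centroid; the rest get -1."""
    centroids = kmeans(points, k, seed=seed)
    assigned = [nearest(p, centroids) for p in points]
    n_keep = min(len(points), math.ceil(keep_fraction * len(points)))
    cutoff = sorted(d for _, d in assigned)[n_keep - 1] if n_keep > 0 else -1.0
    return [idx if d <= cutoff else -1 for idx, d in assigned]


# Toy data: two well-separated 2D blobs standing in for unlabeled features.
rng = random.Random(1)
points = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(50)]
points += [(rng.gauss(5, 0.1), rng.gauss(5, 0.1)) for _ in range(50)]

for fraction in (0.2, 0.5, 0.8):  # hypothetical strict-to-loose schedule
    labels = pseudo_label(points, k=2, keep_fraction=fraction)
    print(fraction, sum(label >= 0 for label in labels))
```

Under these assumptions, a small `keep_fraction` labels only the points closest to a cluster center, and loosening it admits progressively less certain points. In the approach described above, the degree of loosening would be governed by the reward signal rather than by a fixed schedule.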