Knowledge Discovery in Databases (KDD) aims to exploit the vast amounts of data generated daily across many domains of computer applications. Its objective is to extract hidden and meaningful knowledge from datasets through a structured process comprising several key steps: data selection, preprocessing, transformation, data mining, and visualization. Classification and clustering are among the core data mining techniques. Classification predicts the class of new instances using a classifier trained on labeled data. Several approaches have been proposed in the literature, including Decision Tree Induction, Bayesian classifiers, Nearest Neighbor search, Neural Networks, Support Vector Machines, and Formal Concept Analysis (FCA). FCA is recognized as an effective approach for interpretable and explainable learning. It is grounded in the mathematical structure of the concept lattice, which enables the generation of formal concepts and the discovery of hidden relationships among them. In this paper, we present a state-of-the-art review of FCA-based classifiers. We explore various methods for computing closure operators from nominal data and introduce a novel approach for constructing a partial concept lattice that focuses on the most relevant concepts. Experimental results are provided to demonstrate the efficiency of the proposed method.
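To make the notions of closure operator and formal concept concrete, the following minimal Python sketch computes both on a toy nominal context. It illustrates only the standard FCA derivation operators, not the partial-lattice construction proposed in the paper. The context `CONTEXT` and the function names `extent`, `intent`, `closure`, and `formal_concepts` are hypothetical and chosen for illustration; the brute-force enumeration is exponential in the number of attributes and is meant only for very small examples.

```python
from itertools import combinations

# Hypothetical toy context (an assumption for illustration): each object is
# described by nominal attribute=value pairs, i.e. nominally scaled data.
CONTEXT = {
    "o1": {"color=red", "shape=round"},
    "o2": {"color=red", "shape=square"},
    "o3": {"color=blue", "shape=round"},
}


def extent(attributes, context=CONTEXT):
    """Return the objects that possess every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def intent(objects, context=CONTEXT):
    """Return the attributes shared by every object in `objects`."""
    if not objects:
        # By convention, the empty object set shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[obj] for obj in objects))


def closure(attributes, context=CONTEXT):
    """Closure of an attribute set: the intent of its extent."""
    return intent(extent(attributes, context), context)


def formal_concepts(context=CONTEXT):
    """Enumerate all (extent, intent) pairs by closing every attribute subset.

    Brute force, exponential in the number of attributes; toy contexts only.
    """
    all_attrs = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(all_attrs, size):
            closed = frozenset(closure(set(subset), context))
            concepts.add((frozenset(extent(closed, context)), closed))
    return concepts


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(), key=lambda c: len(c[1])):
        print(sorted(ext), sorted(itt))
```

In this toy context, `closure({"color=blue"})` also contains `shape=round`, because the only blue object, `o3`, is round. Implied attributes of this kind are the sort of hidden relationship that the concept lattice makes explicit.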