Concept learning is a form of supervised machine learning that operates on knowledge bases in description logics. State-of-the-art concept learners often rely on an iterative search through a countably infinite concept space. In each iteration, they retrieve instances of candidate solutions to select the best concept for the next iteration. While simple learning problems might require a few dozen instance retrieval calls to find a fitting solution, complex learning problems might necessitate thousands of calls. We alleviate the resulting runtime challenge by presenting a semantics-aware caching approach. Our cache is essentially a subsumption-aware map that links concepts to a set of instances via crisp set operations. Our experiments on 5 datasets with 4 symbolic reasoners, a neuro-symbolic reasoner, and 5 popular pagination policies demonstrate that our cache can reduce the runtime of concept retrieval and concept learning by an order of magnitude while being effective for both symbolic and neuro-symbolic reasoners.
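The subsumption-aware map described above can be illustrated with a minimal sketch. The class name `ConceptCache`, the tuple encoding of composite concepts, and the operator names are hypothetical illustrations, not the paper's actual data structure; the sketch only shows the core idea of resolving composite concepts from cached instance sets via crisp set operations.

```python
# Hypothetical sketch of a semantics-aware concept-to-instances cache.
# Composite concepts are encoded as tuples, e.g. ("and", "A", "B"); the
# instances of a composite concept are derived from the cached instance
# sets of its parts via crisp set operations (intersection, union,
# complement), avoiding repeated calls to the reasoner.
class ConceptCache:
    def __init__(self, all_individuals):
        self.all = frozenset(all_individuals)  # domain, for complements
        self.store = {}  # atomic concept name -> frozenset of instances

    def put(self, concept, instances):
        self.store[concept] = frozenset(instances)

    def get(self, concept):
        # Atomic concept: direct cache hit.
        if concept in self.store:
            return self.store[concept]
        op, *args = concept
        if op == "and":  # conjunction -> set intersection
            return self.get(args[0]) & self.get(args[1])
        if op == "or":   # disjunction -> set union
            return self.get(args[0]) | self.get(args[1])
        if op == "not":  # negation -> complement w.r.t. all individuals
            return self.all - self.get(args[0])
        raise KeyError(concept)


cache = ConceptCache({"a", "b", "c", "d"})
cache.put("A", {"a", "b", "c"})
cache.put("B", {"b", "c", "d"})
conj = cache.get(("and", "A", "B"))  # {"b", "c"}
neg = cache.get(("not", "A"))        # {"d"}
```

In this toy setting, once the atomic concepts `A` and `B` are cached, the instances of `A ⊓ B` and `¬A` are computed purely by set operations, which is the mechanism the abstract attributes the runtime savings to.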