We present SPILDL, a Scalable and Parallel Inductive Learner in Description Logic (DL). SPILDL is based on the DL-Learner, the state of the art in DL-based ILP learning. As a DL-based ILP learner, SPILDL targets the $\mathcal{ALCQI}^{\mathcal{(D)}}$ DL language and can learn DL hypotheses expressed as disjunctions of conjunctions (using the $\sqcup$ operator). Moreover, SPILDL's hypothesis language incorporates string concrete roles (also known as string data properties in the Web Ontology Language, OWL); the inclusion of these expressive DL constructs enables SPILDL to learn hypotheses that describe many complex real-world concepts. SPILDL employs a hybrid parallel approach that combines shared-memory and distributed-memory parallelism to accelerate ILP learning, for both hypothesis search and hypothesis evaluation. In our experiments, SPILDL's parallel search improved performance by up to $\sim$27.3 fold (best case). For hypothesis evaluation, SPILDL improved performance through HT-HEDL, our multi-core CPU + multi-GPU hypothesis evaluation engine, by up to 38 fold (best case). By combining parallel search and parallel evaluation, SPILDL improved performance by up to $\sim$560 fold (best case). In the worst case, however, SPILDL's parallel search does not deliver consistent speedups across all datasets; its gains depend strongly on the nature of each ILP dataset's search space. For some datasets, increasing the number of parallel search threads yields performance similar to, or worse than, the baseline: some ILP datasets benefit from parallel search, while others do not (or gain only negligibly). Likewise, on small datasets, parallel evaluation performs similarly to, or worse than, the baseline.