Association Rule Mining (ARM) is the task of discovering commonalities in data in the form of logical implications. In the Internet of Things (IoT), ARM is used for tasks such as monitoring and decision-making. However, existing methods give limited consideration to IoT-specific requirements such as heterogeneity and data volume. They also do not utilize the static, domain-specific descriptions of IoT systems, which are increasingly represented as knowledge graphs. In this paper, we propose a novel ARM pipeline for IoT data that utilizes both dynamic sensor data and static IoT system metadata. As part of this pipeline, we also propose an Autoencoder-based Neurosymbolic ARM method (Aerial) to address the high volume of IoT data and to reduce the total number of rules, which are resource-intensive to process. Aerial learns a neural representation of the given data and extracts association rules from this representation by exploiting the reconstruction (decoding) mechanism of an autoencoder. Extensive evaluations on three IoT datasets from two domains show that ARM on both static and dynamic IoT data yields more generally applicable rules, while Aerial learns a more concise set of high-quality association rules than the state of the art, with full coverage over the datasets.
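To make the notion of an association rule as a logical implication concrete, the following minimal Python sketch computes the standard support and confidence measures for a rule of the form antecedent → consequent. It is not the Aerial method: the toy rows, the attribute names (`sensor_type`, `pipe_material`, `reading`), and the helper functions are hypothetical and only illustrate how static metadata and discretized sensor readings might sit side by side in one transaction.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical toy transactions: two static metadata attributes plus one
# discretized sensor reading per row. None of this comes from the paper's data.
ROWS: List[Row] = [
    {"sensor_type": "pressure", "pipe_material": "steel", "reading": "high"},
    {"sensor_type": "pressure", "pipe_material": "steel", "reading": "high"},
    {"sensor_type": "pressure", "pipe_material": "pvc", "reading": "low"},
    {"sensor_type": "flow", "pipe_material": "steel", "reading": "low"},
]


def support(rows: List[Row], items: Dict[str, str]) -> float:
    """Fraction of rows that contain every attribute=value pair in `items`."""
    if not rows:
        return 0.0
    matches = sum(
        all(row.get(key) == value for key, value in items.items())
        for row in rows
    )
    return matches / len(rows)


def confidence(rows: List[Row], antecedent: Dict[str, str],
               consequent: Dict[str, str]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent_support = support(rows, antecedent)
    if antecedent_support == 0:
        return 0.0
    return support(rows, {**antecedent, **consequent}) / antecedent_support


if __name__ == "__main__":
    rule_antecedent = {"sensor_type": "pressure", "pipe_material": "steel"}
    rule_consequent = {"reading": "high"}
    print("support:", support(ROWS, {**rule_antecedent, **rule_consequent}))
    print("confidence:", confidence(ROWS, rule_antecedent, rule_consequent))
```

In this toy data, adding the static attribute `pipe_material` to the antecedent raises the confidence of the rule compared with using `sensor_type` alone, which loosely mirrors the motivation for mining static and dynamic data together.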