Association Rule Mining (ARM) is a fundamental task for knowledge discovery in tabular data and is widely used in high-stakes decision-making. Classical ARM methods rely on frequent itemset mining, leading to rule explosion and poor scalability, while recent neural approaches mitigate these issues but suffer from degraded performance in low-data regimes. Tabular foundation models (TFMs), pretrained on diverse tabular data with strong in-context generalization, provide a basis for addressing these limitations. We introduce a model-agnostic association rule learning framework that extracts association rules from any conditional probabilistic model over tabular data, enabling us to leverage TFMs. We then introduce TabProbe, an instantiation of our framework that uses TFMs as conditional probability estimators to learn association rules out of the box, without frequent itemset mining. We evaluate our approach on tabular datasets of varying sizes using standard ARM rule quality metrics and downstream classification performance. The results show that TFMs consistently produce concise, high-quality association rules with strong predictive performance and remain robust in low-data settings without task-specific training. Source code is available at https://github.com/DiTEC-project/tabprobe.