Output space pattern sampling is a powerful alternative to exhaustive pattern mining for exploring large pattern spaces, as it enables users to focus on representative patterns drawn according to a chosen interestingness measure. In this paper, we address the problem of sampling interval patterns under user-defined syntactic constraints. We introduce CFips, a sampling approach that incorporates constraints directly into the sampling procedure. The approach relies on a multi-step sampling framework and supports several syntactic constraints by decomposing them into elementary predicates on interval bounds, while preserving exact sampling guarantees. We formally prove that CFips samples interval patterns proportionally to their frequency within the constrained pattern space. The experimental results show that integrating constraints into the sampling procedure makes it possible to complete mining tasks that would otherwise fail to finish within a given timeout.
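To make the target distribution concrete, the following is a minimal brute-force sketch, not the CFips procedure itself. It assumes that an interval pattern assigns one closed interval per numeric attribute, with bounds taken from values observed in the data, and that the frequency of a pattern is the number of rows it covers. All function names (`interval_patterns`, `frequency`, `max_width`, `min_lower`, `sample_constrained`) and the toy dataset are hypothetical. The sketch enumerates every pattern, keeps those satisfying a conjunction of elementary predicates on interval bounds, and draws from the remainder with probability proportional to frequency. This exhaustive enumeration is exactly what a constraint-aware sampler is meant to avoid on realistic data.

```python
import itertools
import random


def interval_patterns(rows):
    """Enumerate all interval patterns whose bounds are values seen in the data."""
    per_attr = []
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        # Every closed interval [lo, hi] with lo <= hi on attribute j.
        per_attr.append([(lo, hi) for lo in values for hi in values if lo <= hi])
    return list(itertools.product(*per_attr))


def frequency(pattern, rows):
    """Number of rows whose values all fall inside the pattern's intervals."""
    return sum(
        all(lo <= v <= hi for v, (lo, hi) in zip(row, pattern)) for row in rows
    )


def max_width(attr, width):
    """Elementary predicate (assumed form): interval on `attr` is at most `width` wide."""
    return lambda pattern: pattern[attr][1] - pattern[attr][0] <= width


def min_lower(attr, bound):
    """Elementary predicate (assumed form): lower bound on `attr` is at least `bound`."""
    return lambda pattern: pattern[attr][0] >= bound


def sample_constrained(rows, predicates, k, rng):
    """Draw k patterns, proportionally to frequency, among those satisfying all predicates."""
    candidates = [
        p for p in interval_patterns(rows) if all(pred(p) for pred in predicates)
    ]
    weights = [frequency(p, rows) for p in candidates]
    if not any(weights):
        raise ValueError("no constrained pattern covers any row")
    return rng.choices(candidates, weights=weights, k=k)


# Toy dataset (assumed): three rows, two numeric attributes.
rows = [(1, 10), (2, 20), (3, 30)]
predicates = [max_width(0, 1), min_lower(1, 20)]
samples = sample_constrained(rows, predicates, k=5, rng=random.Random(0))
for pattern in samples:
    print(pattern, frequency(pattern, rows))
```

Rejection sampling from the unconstrained space would reach the same distribution but can waste most draws when the constraints are selective, which is the motivation for building the constraints into the sampling steps.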