Several approaches to graphically representing context-specific relations among jointly distributed categorical variables have been proposed, along with structure learning algorithms. Existing optimization-based methods have limited scalability because of the large number of context-specific models, while constraint-based methods are even more prone to error than constraint-based DAG learning algorithms, since more relations must be tested. We present a hybrid algorithm for learning context-specific models that scales to hundreds of variables while testing no more constraints than standard DAG learning algorithms. Scalable learning is achieved through a combination of an order-based MCMC algorithm and sparsity assumptions analogous to those typically invoked for DAG models. To implement the method, we solve a special case of an open problem recently posed by Alon and Balogh. The method is shown to perform well on synthetic data and real-world examples, in terms of both accuracy and scalability.