In many real-world scenarios, it is crucial to reason reliably and efficiently under uncertainty while capturing complex relationships in data. Probabilistic circuits (PCs), a prominent family of tractable probabilistic models, address this challenge by composing simple, tractable distributions into a high-dimensional probability distribution. However, learning PCs on heterogeneous data is challenging, and the densities of some parametric distributions are not available in closed form, which limits the applicability of PCs. We introduce characteristic circuits (CCs), a family of tractable probabilistic models that provide a unified formalization of distributions over heterogeneous data in the spectral domain. The one-to-one relationship between characteristic functions and probability measures enables us to learn high-dimensional distributions on heterogeneous data domains and facilitates efficient probabilistic inference even when no closed-form density function is available. We show that the structure and parameters of CCs can be learned efficiently from data and find that CCs outperform state-of-the-art density estimators for heterogeneous data domains on common benchmark data sets.
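To make the idea of composing distributions in the spectral domain concrete, the following is a minimal sketch and not the authors' implementation. It assumes that leaves are univariate characteristic functions, that a product node multiplies the characteristic functions of independent variables, and that a sum node takes a convex combination of its children. All function names (`gaussian_cf`, `categorical_cf`, `symmetric_stable_cf`, `product_node`, `sum_node`) are hypothetical and chosen only for illustration. The symmetric alpha-stable leaf stands in for a distribution whose density generally has no closed form but whose characteristic function does.

```python
import cmath

# Leaf: characteristic function of a Gaussian N(mu, sigma^2).
def gaussian_cf(mu, sigma):
    return lambda t: cmath.exp(1j * t * mu - 0.5 * (sigma * t) ** 2)

# Leaf: characteristic function of a categorical variable on {0, ..., K-1}.
def categorical_cf(probs):
    return lambda t: sum(p * cmath.exp(1j * t * k) for k, p in enumerate(probs))

# Leaf: symmetric alpha-stable distribution (skewness 0). Its density has no
# closed form in general, but its characteristic function is simple.
def symmetric_stable_cf(alpha, scale, loc):
    return lambda t: cmath.exp(1j * t * loc - abs(scale * t) ** alpha)

# Product node (illustrative): children are (variable_index, leaf_cf) pairs
# over disjoint variables, so the joint CF is the product of the leaf CFs.
def product_node(children):
    def cf(t):
        result = 1 + 0j
        for var, child in children:
            result *= child(t[var])
        return result
    return cf

# Sum node (illustrative): a mixture, i.e. a weighted sum of children's CFs.
def sum_node(weights, children):
    return lambda t: sum(w * child(t) for w, child in zip(weights, children))

# A tiny circuit over one continuous variable (index 0) and one discrete
# variable (index 1), mixing two product components.
component_a = product_node([(0, gaussian_cf(0.0, 1.0)),
                            (1, categorical_cf([0.2, 0.8]))])
component_b = product_node([(0, symmetric_stable_cf(1.5, 1.0, 2.0)),
                            (1, categorical_cf([0.7, 0.3]))])
circuit = sum_node([0.4, 0.6], [component_a, component_b])

# Evaluating at t = 0 for a variable marginalizes it out, so this is the
# characteristic function of the marginal over variable 0 at t0 = 0.5.
marginal_value = circuit((0.5, 0.0))
print(marginal_value)
```

In this sketch, marginalizing a variable only requires setting its argument to zero, which is a general property of characteristic functions and hints at why inference can remain tractable without a closed-form density.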