Dependencies on the relative frequency of a state in the domain are common when modelling probabilistic dependencies on relational data. For instance, the likelihood of a school closure during an epidemic might depend on the proportion of infected pupils exceeding a threshold. Often, rather than depending on discrete thresholds, dependencies are continuous: for instance, the likelihood of any one mosquito bite transmitting an illness depends on the proportion of carrier mosquitoes. Current approaches usually consider only probabilities over possible worlds rather than over domain elements themselves. An exception is the recently introduced Lifted Bayesian Networks for Conditional Probability Logic, which express discrete dependencies on probabilistic data. We introduce functional lifted Bayesian networks, a formalism that explicitly incorporates continuous dependencies on relative frequencies into statistical relational artificial intelligence, and compare and contrast them with Lifted Bayesian Networks for Conditional Probability Logic. Incorporating relative frequencies is not only beneficial to modelling; it also provides a more rigorous approach to learning problems where training and test or application domains have different sizes. To this end, we provide a representation of the asymptotic probability distributions induced by functional lifted Bayesian networks on domains of increasing sizes. Since that representation has well-understood scaling behaviour across domain sizes, it can be used to estimate parameters for a large domain consistently from randomly sampled subpopulations. Furthermore, we show that in parametric families of functional lifted Bayesian networks, convergence is uniform in the parameters, which ensures a meaningful dependence of the asymptotic probabilities on the parameters of the model.
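As an informal illustration of the mosquito example, the following toy simulation sketches a continuous dependence on a relative frequency and shows how the sampled frequencies settle as the domain grows. It is not the functional lifted Bayesian network formalism itself: the linear link function, the carrier probability of 0.3, the factor 0.8, and all names (`link`, `sample_domain`) are assumptions made purely for illustration.

```python
import random


def link(freq):
    """Hypothetical continuous link: per-bite transmission probability
    as a function of the proportion of carrier mosquitoes."""
    return 0.8 * freq


def sample_domain(n, carrier_prob, rng):
    """Sample a domain of n mosquitoes, one bite each.

    Returns (proportion of carriers, proportion of transmitting bites).
    """
    # Each mosquito is independently a carrier with probability carrier_prob.
    carriers = sum(rng.random() < carrier_prob for _ in range(n))
    freq = carriers / n
    # Every bite transmits with a probability that depends continuously
    # on the relative frequency of carriers in this particular domain.
    p_transmit = link(freq)
    transmitted = sum(rng.random() < p_transmit for _ in range(n))
    return freq, transmitted / n


rng = random.Random(0)  # fixed seed so runs are repeatable
results = {}
for n in (100, 1_000, 100_000):
    results[n] = sample_domain(n, carrier_prob=0.3, rng=rng)
    freq, rate = results[n]
    print(f"n={n:>7}  carrier freq={freq:.3f}  transmission rate={rate:.3f}")
```

Under these assumptions, the carrier frequency concentrates around 0.3 and the transmission rate around `link(0.3)` as `n` increases, which is the kind of stable large-domain behaviour that makes estimation from sampled subpopulations plausible in this toy setting.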