Bayesian networks model relationships between random variables under uncertainty and can be used to predict the likelihood of events and outcomes while incorporating observed evidence. From an eXplainable AI (XAI) perspective, such models are interesting because they tend to be compact and because the captured relations can be directly inspected by domain experts. In practice, data is often real-valued, and unless normality can be assumed, discretization is often required. The optimal discretization, however, depends on the relations modelled between the variables, which complicates learning Bayesian networks from data. For this reason, most literature focuses on learning the conditional dependencies between sets of variables, a task called structure learning. In this work, we extend an existing state-of-the-art structure learning approach based on the Gene-pool Optimal Mixing Evolutionary Algorithm (GOMEA) to jointly learn variable discretizations. The proposed Discretized Bayesian Network GOMEA (DBN-GOMEA) obtains similar or better results than the current state-of-the-art when tasked with retrieving randomly generated ground-truth networks. Moreover, leveraging a key strength of evolutionary algorithms, we can straightforwardly perform DBN learning multi-objectively. We show how this enables incorporating expert knowledge in a uniquely insightful fashion, finding multiple DBNs that trade off complexity, accuracy, and the difference from a pre-determined expert network.
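To make the multi-objective trade-off concrete, the following is a minimal sketch of a non-dominated (Pareto) filter over three objectives. It is not the DBN-GOMEA implementation: the `Candidate` type, the objective values, and the choice to express accuracy as an error so that all three objectives are minimized are assumptions made purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical summary of one learned network (illustrative only)."""
    name: str
    complexity: float       # assumed: e.g. number of parameters, lower is better
    error: float            # assumed: accuracy flipped into an error, lower is better
    expert_distance: float  # assumed: e.g. edge differences to the expert network


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    a_obj, b_obj = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(a_obj, b_obj))
    strictly_better = any(x < y for x, y in zip(a_obj, b_obj))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up objective values, not results from the paper.
candidates = [
    Candidate("A", complexity=10, error=0.50, expert_distance=4),
    Candidate("B", complexity=20, error=0.30, expert_distance=6),
    Candidate("C", complexity=25, error=0.35, expert_distance=7),
    Candidate("D", complexity=15, error=0.60, expert_distance=0),
]
front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy set, candidate C is worse than B on all three objectives and is therefore dropped, while A, B, and D each represent a different compromise that an expert could inspect.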