Estimating dependence relationships between variables is a crucial issue in many applied domains, such as medicine, social sciences and psychology. When several variables are considered, they can be organized into a network that encodes their conditional dependence relations. Typically, however, the underlying network structure is completely unknown or only partially known; accordingly, it must be learned from the available data, a process known as structure learning. In addition, data arising from social and psychological studies are often of different types, as they can include categorical, discrete and continuous measurements. In this paper we develop a novel Bayesian methodology for structure learning of directed networks that applies to mixed data, i.e. data possibly containing continuous, discrete, ordinal and binary variables simultaneously. Whenever known dependence structures among variables are available, represented by paths or edge directions that can be postulated in advance based on the specific problem under consideration, our method can easily incorporate them. We evaluate the proposed method through extensive simulation studies, where it performs well in comparison with current state-of-the-art alternative methods. Finally, we apply our methodology to well-being data from a social survey promoted by the United Nations, and to mental health data collected from a cohort of medical students.