Multiple data views measured on the same set of participants are becoming more common, and analyzing these views simultaneously has the potential to deepen our understanding of many complex diseases. Equally important, many of these complex diseases show evidence of subgroup heterogeneity (e.g., by sex or race). HIP (Heterogeneity in Integration and Prediction) is among the first methods proposed to integrate multiple data views while also accounting for subgroup heterogeneity to identify common and subgroup-specific markers of a particular disease. However, HIP is applicable only to continuous outcomes and requires programming expertise from the user. Here we propose extensions to HIP that accommodate multi-class, Poisson, and Zero-Inflated Poisson outcomes while retaining the benefits of HIP. Additionally, we introduce an R Shiny application, accessible on shinyapps.io at https://multi-viewlearn.shinyapps.io/HIP_ShinyApp/, that provides an interface to the Python implementation of HIP so that more researchers can use the method anywhere and on any device. We applied HIP to identify genes and proteins, both common and specific to males and females, that are associated with exacerbation frequency. Although some of the identified genes and proteins show evidence of a relationship with chronic obstructive pulmonary disease (COPD) in the existing literature, others may be candidates for future research investigating their relationship with COPD. We demonstrate the use of the Shiny application with publicly available data. An R package for HIP will be made available at https://github.com/lasandrall/HIP.