The identification of patient subgroups with comparable event-risk dynamics plays a key role in supporting informed decision-making in clinical research. In such settings, it is important to account for the inherent dependence that arises when individuals are nested within higher-level units, such as hospitals. Existing survival models account for group-level heterogeneity through frailty terms but do not uncover latent patient subgroups, while most clustering methods ignore hierarchical structure and are not estimated jointly with survival outcomes. In this work, we introduce a new framework that simultaneously performs patient clustering and shared-frailty survival modeling through a penalized likelihood approach. The proposed methodology adaptively learns a patient-to-patient similarity matrix via a modified version of spectral clustering, enabling cluster formation directly from estimated risk profiles while accounting for group membership. A simulation study highlights the proposed model's ability to recover latent clusters and to correctly estimate hazard parameters. We apply our method to a large cohort of heart-failure patients hospitalized with COVID-19 between 2020 and 2021 in the Lombardy region (Italy), identifying clinically meaningful subgroups characterized by distinct risk profiles and highlighting the role of respiratory comorbidities and hospital-level variability in shaping mortality outcomes. This framework provides a flexible and interpretable tool for risk-based patient stratification in hierarchical data settings.