Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer

Purpose: Advances in deep learning have produced effective models for surgical video analysis; however, these models often fail to generalize across medical centers because of domain shift caused by variations in surgical workflow, camera setups, and patient demographics. Recently, object-centric learning has emerged as a promising approach for improved surgical scene understanding: it captures and disentangles the visual and semantic properties of surgical tools and anatomy to improve downstream task performance. In this work, we conduct a multi-centric performance benchmark of object-centric approaches, focusing on Critical View of Safety assessment in laparoscopic cholecystectomy, and then propose an improved approach for generalization to unseen domains. Methods: We first evaluate four object-centric approaches for domain generalization, establishing baseline performance. Next, leveraging the disentangled nature of object-centric representations, we dissect one of these methods through a series of ablations (e.g., ignoring either visual or semantic features for downstream classification). Finally, based on the results of these ablations, we develop LG-DG, an optimized method specifically tailored for domain generalization that includes a novel disentanglement loss function. Results: LG-DG achieves an improvement of 9.28% over the best baseline approach. More broadly, we show that object-centric approaches are highly effective for domain generalization thanks to their modular approach to representation learning. Conclusion: We investigate the use of object-centric methods for generalization to unseen domains, identify method-agnostic factors critical for performance, and present an optimized approach that substantially outperforms existing methods.
