Searching through large volumes of medical data to retrieve relevant information is a challenging yet crucial task for clinical care. However, the most basic and most common approach to retrieval, which relies on text in the form of keywords, is severely limited when dealing with complex media formats. Content-based retrieval offers a way to overcome this limitation by using rich media as the query itself. Surgical video-to-video retrieval in particular is a new and largely unexplored research problem with high clinical value, especially in the real-time case: with real-time video hashing, search can be performed directly inside the operating room. Hashing converts large data entries into compact binary arrays, or hashes, enabling large-scale search operations at a very fast rate. However, due to fluctuations over the course of a video, not all bits in a given hash are equally reliable. In this work, we propose a method capable of mitigating this uncertainty while maintaining a light computational footprint. We present superior retrieval results (3-4% higher top-10 mean average precision) on a multi-task evaluation protocol for surgery, covering cholecystectomy phases, bypass phases, and, from an entirely new dataset introduced here, critical events across six different surgery types. Success on this multi-task benchmark shows the generalizability of our approach for surgical video retrieval.
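To make the hashing idea concrete, the following is a minimal sketch of hash-based retrieval, not the method proposed in this work. It assumes each video has already been reduced to a short binary hash (stored here as a Python integer) and ranks database entries by Hamming distance to the query. The optional `weights` argument is a hypothetical per-bit reliability score, added only to illustrate why down-weighting unreliable bits can change the ranking; the function names, the toy 8-bit hashes, and the weight values are all assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where two hashes (stored as ints) differ."""
    return bin(a ^ b).count("1")


def weighted_hamming(a: int, b: int, weights: Sequence[float]) -> float:
    """Hamming distance where a mismatch on bit i costs weights[i] instead of 1.

    weights[i] is a hypothetical reliability score for bit i (bit 0 is the
    least significant bit). This is an illustration, not the paper's method.
    """
    diff = a ^ b
    return sum(w for i, w in enumerate(weights) if (diff >> i) & 1)


def top_k(query: int, database: Sequence[int], k: int,
          weights: Optional[Sequence[float]] = None) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k hashes closest to the query."""
    if weights is None:
        scored = [(i, float(hamming_distance(query, h)))
                  for i, h in enumerate(database)]
    else:
        scored = [(i, weighted_hamming(query, h, weights))
                  for i, h in enumerate(database)]
    # sorted() is stable, so ties keep their database order.
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy 8-bit hashes (made-up values for illustration only).
database = [0b10110010, 0b10110111, 0b01001101]
query = 0b10110011

# Plain Hamming ranking: entries 0 and 1 are both one bit away from the query.
plain = top_k(query, database, k=2)

# Assume bit 2 is known to be unreliable, so a mismatch there costs less.
weights = [1.0, 1.0, 0.2, 1.0, 1.0, 1.0, 1.0, 1.0]
weighted = top_k(query, database, k=2, weights=weights)

print([i for i, _ in plain])     # indices under plain Hamming distance
print([i for i, _ in weighted])  # indices once bit 2 is down-weighted
```

In this toy case the two nearest entries are tied under plain Hamming distance. Once the mismatch on the assumed-unreliable bit is discounted, entry 1 moves ahead of entry 0. A production system would typically pack hashes into byte arrays and use vectorized XOR and popcount rather than Python integers.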