Advances in biomedical research rely heavily on access to large amounts of medical data. In histopathology, Whole Slide Images (WSIs) and clinicopathological information are valuable for developing Artificial Intelligence (AI) algorithms for Digital Pathology (DP). Sharing medical data "as open as possible" enhances its usability for secondary purposes but poses a risk to patient privacy. At the same time, existing regulations push towards keeping medical data "as closed as necessary" to avoid re-identification risks. Generally, these regulations require the removal of sensitive data but do not consider the possibility of data linkage attacks enabled by modern image-matching algorithms. In addition, the lack of standardization in DP makes it harder to establish a single solution for all WSI formats. These challenges make it difficult for bioinformatics researchers to balance privacy and progress while developing AI algorithms. This paper explores the legal regulations and terminology for medical data sharing. We review existing approaches and highlight challenges from the histopathological perspective. We also present a data-sharing guideline for histological data to foster multidisciplinary research and education.