Advances in biomedical research rely heavily on access to large amounts of medical data. In histopathology, Whole Slide Images (WSIs) and clinicopathological information are valuable for developing Artificial Intelligence (AI) algorithms for Digital Pathology (DP). Sharing medical data "as open as possible" enhances its usability for secondary purposes but poses a risk to patient privacy. At the same time, existing regulations push towards keeping medical data "as closed as necessary" to avoid re-identification risks. Generally, these regulations require the removal of sensitive data but do not consider the possibility of data linkage attacks enabled by modern image-matching algorithms. In addition, the lack of standardization in DP makes it harder to establish a single solution for all WSI formats. These challenges make it difficult for bioinformatics researchers to balance privacy and progress while developing AI algorithms. This paper explores the legal regulations and terminologies for medical data sharing. We review existing approaches and highlight challenges from the histopathological perspective. We also present a data-sharing guideline for histological data to foster multidisciplinary research and education.