We evaluate four leading solutions for de-identifying unstructured medical text (Azure Health Data Services, AWS Comprehend Medical, OpenAI GPT-4o, and John Snow Labs) on a ground-truth dataset of 48 clinical documents annotated by medical experts. The analysis, conducted at both the entity level and the token level, suggests that John Snow Labs' Medical Language Models solution achieves the highest accuracy, with a 96% F1-score in protected health information (PHI) detection, outperforming Azure (91%), AWS (83%), and GPT-4o (79%). John Snow Labs is the only solution that achieves regulatory-grade accuracy (surpassing that of human experts), and it is also the most cost-effective: it is over 80% cheaper than Azure and GPT-4o and is the only solution not priced by token. Its fixed-cost local deployment model avoids the escalating per-request fees of cloud-based services, making it a scalable and economical choice.
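For readers unfamiliar with the metric, the token-level F1-score reported above can be illustrated with a minimal sketch. It assumes each token carries a binary PHI / non-PHI label; the function name `token_level_prf` and the toy labels are hypothetical and are not taken from the study's data or evaluation code.

```python
from typing import Sequence, Tuple


def token_level_prf(gold: Sequence[bool], pred: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary PHI / non-PHI token labels.

    Illustrative helper only; not the evaluation code used in the study.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must label the same tokens")

    # Count true positives, false positives, and false negatives per token.
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy labels (hypothetical): True marks a token annotated or predicted as PHI.
gold_labels = [False, True, True, False, True, False]
pred_labels = [False, True, False, False, True, True]
precision, recall, f1 = token_level_prf(gold_labels, pred_labels)
```

Entity-level scoring follows the same formula but counts whole PHI spans rather than individual tokens, so the matching rule (exact versus partial overlap) has to be chosen separately.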