This paper critically examines the 2022 Medibank health insurance data breach, which exposed sensitive medical records of 9.7 million individuals due to unencrypted storage, centralized access, and the absence of privacy-preserving analytics. To address these vulnerabilities, we propose an entropy-aware differential privacy (DP) framework that integrates Laplace and Gaussian mechanisms with adaptive budget allocation. The design incorporates TLS-encrypted database access, field-level mechanism selection, and smooth sensitivity models to mitigate re-identification risks. Experimental validation was conducted using synthetic Medibank datasets (N = 131,000) with entropy-calibrated DP mechanisms, where high-entropy attributes received stronger noise injection. Results demonstrate a 90.3% reduction in re-identification probability while maintaining analytical utility loss below 24%. The framework further aligns with GDPR Article 32 and Australian Privacy Principle 11.1, ensuring regulatory compliance. By combining rigorous privacy guarantees with practical usability, this work contributes a scalable and technically feasible solution for healthcare data protection, offering a pathway toward resilient, trustworthy, and regulation-ready medical analytics.
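As a rough illustration of what entropy-calibrated budget allocation with a Laplace mechanism could look like, the following minimal sketch computes per-field Shannon entropy, splits a total privacy budget so that high-entropy fields receive a smaller epsilon (and therefore stronger noise), and answers a counting query. The allocation rule (weights proportional to 1 / (1 + entropy)), the function names, and the toy records are assumptions made for this sketch and are not taken from the framework itself. The sketch also treats the field entropies as public information, and it omits the Gaussian mechanism and smooth sensitivity.

```python
import math
import random
from collections import Counter


def shannon_entropy(values):
    """Empirical Shannon entropy (in bits) of a column of categorical values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def allocate_budget(entropies, total_epsilon):
    """Split total_epsilon across fields with weights 1 / (1 + entropy).

    Hypothetical rule for illustration only: a higher-entropy field gets a
    smaller epsilon, hence more noise. The shares sum to total_epsilon.
    """
    weights = {field: 1.0 / (1.0 + h) for field, h in entropies.items()}
    total = sum(weights.values())
    return {field: total_epsilon * w / total for field, w in weights.items()}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(values, target, epsilon, rng):
    """Laplace mechanism for a counting query, which has sensitivity 1."""
    true_count = sum(1 for v in values if v == target)
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Toy records (invented for this sketch, not drawn from any real dataset).
records = {
    "sex": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "postcode": ["3000", "3001", "3002", "3003", "3004", "3005", "3006", "3007"],
}

entropies = {field: shannon_entropy(col) for field, col in records.items()}
budget = allocate_budget(entropies, total_epsilon=1.0)

rng = random.Random(0)
for field, col in records.items():
    released = noisy_count(col, col[0], budget[field], rng)
    print(f"{field}: entropy={entropies[field]:.2f} bits, "
          f"epsilon={budget[field]:.3f}, noisy count={released:.2f}")
```

In this toy example the postcode field has higher entropy than the sex field, so it is assigned the smaller share of the budget and its released count carries more noise on average.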