Using a total of 4,774 hospitals categorized as government, non-profit, and proprietary, this study provides the first measurement-based analysis of hospitals' websites and connects the findings with data breaches through a correlation analysis. We study the security attributes of the three categories, collectively and in contrast, against domain name, content, and SSL certificate-level features. We find that each type of hospital has distinctive characteristics in its use of domain name registrars, top-level domain distribution, and domain creation distribution, as well as in content type and HTTP request features. Security-wise, and consistent with the general population of websites, only 1\% of government hospitals utilized DNSSEC, in contrast to 6\% of proprietary hospitals. Alarmingly, we found that 25\% of the hospitals used plain HTTP, in contrast to 20\% in the general web population. Equally alarming, we found that 8\% to 84\% of the hospitals, depending on their type, had some malicious content, which is mostly attributed to a lack of maintenance. We conclude with a correlation analysis against 414 confirmed and manually vetted hospital data breaches. Among other interesting findings, our study shows that the security attributes identified in our analysis of hospital websites form a very strong indicator of their likelihood of being breached. Our analyses are a first step towards understanding patient online privacy, highlighting the lack of basic security in many hospitals' websites and opening various potential research directions.