Large-scale password data breaches are becoming increasingly common, enabling researchers to produce a substantial body of password security research using real-world password datasets, which often contain tens or even hundreds of millions of records. While much research has examined how password composition policies (sets of rules that a user must abide by when creating a password) influence the distribution of user-chosen passwords on a system, far less has been done on inferring the password composition policy under which a given set of user-chosen passwords was created. In this paper, we state the problem with the naive approach to this challenge and suggest a simple approach that produces more reliable results. We also present pol-infer, a tool that implements this approach, and demonstrate its use in inferring password composition policies.