The popularity of text-based CAPTCHAs as a security mechanism to protect websites from automated bots has prompted research on CAPTCHA solvers, with the aim of understanding their failure cases and subsequently making CAPTCHAs more secure. Recently proposed solvers, built on advances in deep learning, can crack even very challenging CAPTCHAs with high accuracy. However, these solvers often perform poorly on out-of-distribution samples, which contain visual features different from those in the training set. Furthermore, they cannot detect and avoid such samples, which leaves them susceptible to being locked out by defense systems after a certain number of failed attempts. In this paper, we propose EnSolver, a novel CAPTCHA solver that uses deep ensemble uncertainty estimation to detect and skip out-of-distribution CAPTCHAs, making the solver harder to detect. We demonstrate the use of our solver with object detection models and show empirically that it performs well on both in-distribution and out-of-distribution data, achieving up to 98.1% accuracy when detecting out-of-distribution data and a success rate of up to 93% when solving in-distribution CAPTCHAs.
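As an illustration of the detect-and-skip idea described in the abstract, the following is a minimal sketch and not the EnSolver implementation. It assumes that each ensemble member is a function mapping a CAPTCHA image to a decoded string, and that disagreement among members is used as a proxy for uncertainty. The names `Solver`, `ensemble_solve`, and `min_agreement` are hypothetical, and the agreement rule is an assumption made for this example.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical type: a base solver maps raw image bytes to a decoded string.
Solver = Callable[[bytes], str]


def ensemble_solve(
    image: bytes,
    solvers: Sequence[Solver],
    min_agreement: float = 1.0,
) -> Optional[str]:
    """Return the most common answer if enough members agree, else None.

    Returning None means "skip this CAPTCHA": the sample is treated as
    likely out-of-distribution, so no answer is submitted.
    The agreement rule is an assumption for illustration only.
    """
    if not solvers:
        raise ValueError("need at least one solver")

    # Each ensemble member predicts independently.
    predictions = [solve(image) for solve in solvers]

    # Fraction of members that voted for the most common answer.
    answer, votes = Counter(predictions).most_common(1)[0]
    agreement = votes / len(predictions)

    return answer if agreement >= min_agreement else None


if __name__ == "__main__":
    # Stub solvers standing in for trained models (hypothetical).
    agree = [lambda _img: "ab12"] * 3
    split = [lambda _img: "ab12", lambda _img: "ab12", lambda _img: "ab1z"]

    print(ensemble_solve(b"fake-image", agree))  # all members agree: answer is returned
    print(ensemble_solve(b"fake-image", split))  # members disagree: CAPTCHA is skipped
```

With `min_agreement=1.0`, any disagreement causes a skip. Lowering the threshold trades fewer skips for a higher risk of submitting a wrong answer and accumulating failed attempts.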