Transport Layer Security (TLS) is fundamental to secure online communication, yet vulnerabilities in certificate validation that enable Man-in-the-Middle (MitM) attacks remain a pervasive threat in Android apps. Existing detection tools are hampered by low-coverage UI interaction, costly instrumentation, and a lack of scalable root-cause analysis. We present Okara, a framework that leverages foundation models to automate the detection and deep attribution of TLS MitM Vulnerabilities (TMVs). Okara's detection component, TMV-Hunter, employs foundation model-driven GUI agents to achieve high-coverage app interaction, enabling efficient vulnerability discovery at scale. Deploying TMV-Hunter on 37,349 apps from Google Play and a third-party store revealed 8,374 (22.42%) vulnerable apps. Our measurement shows these vulnerabilities are widespread across all popularity levels, affect critical functionalities like authentication and code delivery, and are highly persistent with a median vulnerable lifespan of over 1,300 days. Okara's attribution component, TMV-ORCA, combines dynamic instrumentation with a novel LLM-based classifier to locate and categorize vulnerable code according to a comprehensive new taxonomy. This analysis attributes 41% of vulnerabilities to third-party libraries and identifies recurring insecure patterns, such as empty trust managers and flawed hostname verification. We have initiated a large-scale responsible disclosure effort and will release our tools and datasets to support further research and mitigation.
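To make the two recurring insecure patterns named above concrete, the following is a minimal illustrative sketch of an empty trust manager and a permissive hostname verifier, written against the standard `javax.net.ssl` interfaces. The class and method names (`InsecureTlsPatterns`, `EmptyTrustManager`, `AllowAllHostnameVerifier`, and the two helper methods) are hypothetical and chosen for illustration only; they are not taken from Okara, TMV-Hunter, TMV-ORCA, or any app in the measured dataset.

```java
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;
import javax.net.ssl.X509TrustManager;

// Illustrative anti-patterns only. Do not use in production code.
final class InsecureTlsPatterns {

    // Anti-pattern 1: an "empty" trust manager. checkServerTrusted never
    // throws, so every certificate chain is accepted, including one
    // presented by a man-in-the-middle.
    static final class EmptyTrustManager implements X509TrustManager {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // intentionally empty: no validation is performed
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // intentionally empty: no validation is performed
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    }

    // Anti-pattern 2: flawed hostname verification. Returning true
    // unconditionally means a valid certificate for any other domain is
    // accepted for the requested host.
    static final class AllowAllHostnameVerifier implements HostnameVerifier {
        @Override
        public boolean verify(String hostname, SSLSession session) {
            return true;
        }
    }

    // Returns true if the trust manager accepts a chain with no
    // certificates at all, which a correct validator must reject.
    static boolean trustManagerAcceptsEmptyChain() {
        try {
            new EmptyTrustManager().checkServerTrusted(new X509Certificate[0], "RSA");
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    // Returns true if the verifier accepts the given hostname without
    // inspecting the TLS session (null is passed on purpose).
    static boolean verifierAcceptsHost(String hostname) {
        return new AllowAllHostnameVerifier().verify(hostname, null);
    }

    public static void main(String[] args) {
        System.out.println("empty chain accepted: " + trustManagerAcceptsEmptyChain());
        System.out.println("arbitrary host accepted: " + verifierAcceptsHost("attacker.example"));
    }
}
```

Both helpers return `true`, which is the defect: neither the chain nor the hostname is checked, so an attacker who intercepts the connection can present any certificate. Code of this shape typically becomes exploitable once such objects are installed on an `SSLContext` or an `HttpsURLConnection`.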