Bamboo: LLM-Driven Discovery of API-Permission Mappings in the Android Framework

The permission mechanism in the Android framework is integral to safeguarding user privacy: it governs which users and processes may access sensitive resources and operations. Developers therefore need an in-depth understanding of API permissions to build robust Android apps. Unfortunately, the official Android API documentation is chronically imprecise and incomplete, so developers spend significant effort working out which permissions an API requires. The result can be incorrect permission declarations, which in turn can cause security violations and app failures. Recent efforts to improve permission specification primarily rely on static and dynamic code analysis to uncover API-permission mappings within the Android framework. These methods have substantial shortcomings, however, including poor adaptability to Android SDK and framework updates, limited code coverage, and a tendency to miss essential API-permission mappings in intricate codebases. This paper introduces an approach that uses large language models (LLMs) to systematically examine API-permission mappings. We integrate a dual-role prompting strategy and an API-driven code generation approach into our mapping discovery pipeline and implement the pipeline in a tool, \tool{}. We formulate three research questions to evaluate the efficacy of \tool{} against state-of-the-art baselines, assess the completeness of the official SDK documentation, and analyze the evolution of permission-required APIs across SDK releases. Our experimental results show that \tool{} identifies 2,234, 3,552, and 4,576 API-permission mappings in Android versions 6, 7, and 10, respectively, substantially outperforming existing baselines.
