Detecting Metadata-Related Bugs in Enterprise Applications

When building enterprise applications (EAs) on Java frameworks (e.g., Spring), developers often configure application components via metadata (i.e., Java annotations and XML files). Using metadata correctly is challenging, because the usage rules can be complex and existing tools provide limited assistance. When developers misuse metadata, EAs become misconfigured, and the resulting defects can trigger erroneous runtime behaviors or introduce security vulnerabilities. To help developers use metadata correctly, this paper presents (1) RSL, a domain-specific language that domain experts can adopt to prescribe metadata checking rules, and (2) MeCheck, a tool that takes RSL rules and EAs as input and checks for rule violations. With RSL, domain experts (e.g., developers of a Java framework) can specify metadata checking rules by defining content consistency among XML files, annotations, and Java code. Given such RSL rules and a program to scan, MeCheck interprets the rules as cross-file static analyzers; these analyzers scan Java and/or XML files to gather information and look for consistency violations. For evaluation, we studied the Spring and JUnit documentation to manually define 15 rules, and created 2 datasets with 115 open-source EAs. The first dataset includes 45 EAs, with a ground truth of 45 manually injected bugs. The second dataset includes multiple versions of 70 EAs. We observed that MeCheck identified bugs in the first dataset with 100% precision, 96% recall, and 98% F-score. It reported 156 bugs in the second dataset, 53 of which were already fixed by developers. Our evaluation shows that MeCheck helps ensure the correct usage of metadata.
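To make the notion of a cross-file consistency check concrete, the following is a minimal sketch of one such check written directly in Java. It is not RSL and not MeCheck's implementation; the class name `MetadataConsistencySketch`, its methods, and the example rule (every `class` attribute of a `<bean>` element should name a class declared in the scanned Java sources) are assumptions chosen for illustration. The regex-based scan of Java sources is a deliberate simplification: it ignores nested classes, comments, and classes supplied by libraries, which a real analyzer would have to resolve.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration of a cross-file metadata check (not MeCheck itself).
class MetadataConsistencySketch {
    private static final Pattern PACKAGE = Pattern.compile("\\bpackage\\s+([\\w.]+)\\s*;");
    private static final Pattern CLASS = Pattern.compile("\\bclass\\s+(\\w+)");

    // Gathers fully qualified names of classes declared in the Java sources.
    // Simplification: regex matching, top-level classes only.
    static Set<String> declaredClasses(List<String> javaSources) {
        Set<String> names = new HashSet<>();
        for (String src : javaSources) {
            Matcher pkg = PACKAGE.matcher(src);
            String prefix = pkg.find() ? pkg.group(1) + "." : "";
            Matcher cls = CLASS.matcher(src);
            while (cls.find()) {
                names.add(prefix + cls.group(1));
            }
        }
        return names;
    }

    // Reports every <bean> whose class attribute names no declared class.
    static List<String> findMissingBeanClasses(String beansXml, List<String> javaSources)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(beansXml.getBytes(StandardCharsets.UTF_8)));
        Set<String> declared = declaredClasses(javaSources);
        List<String> violations = new ArrayList<>();
        NodeList beans = doc.getElementsByTagName("bean");
        for (int i = 0; i < beans.getLength(); i++) {
            Element bean = (Element) beans.item(i);
            String cls = bean.getAttribute("class");
            if (!cls.isEmpty() && !declared.contains(cls)) {
                violations.add("bean '" + bean.getAttribute("id")
                        + "' references missing class " + cls);
            }
        }
        return violations;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<beans>"
                + "<bean id=\"a\" class=\"com.example.OrderService\"/>"
                + "<bean id=\"b\" class=\"com.example.MissingDao\"/>"
                + "</beans>";
        List<String> sources = List.of("package com.example; public class OrderService {}");
        for (String violation : findMissingBeanClasses(xml, sources)) {
            System.out.println(violation);
        }
    }
}
```

In this sketch, the XML file and the Java sources are each scanned once to gather facts, and the violation is whatever fails to line up between the two. This mirrors, in a much reduced form, the gather-then-compare structure the abstract attributes to the analyzers.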
