The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, existing benchmarks often lack coverage of subtle corner cases, allowing incorrect solutions to pass. To bridge this gap, we propose CodeHacker, an automated agent framework dedicated to generating targeted adversarial test cases that expose latent vulnerabilities in program submissions. Mimicking the hack mechanism in competitive programming, CodeHacker employs a multi-strategy approach, including stress testing, anti-hash attacks, and logic-specific targeting, to break specific code submissions. To ensure the validity and reliability of these attacks, we introduce a Calibration Phase, in which the agent iteratively refines its own Validator and Checker via self-generated adversarial probes before evaluating contestant code. Experiments demonstrate that CodeHacker significantly improves the True Negative Rate (TNR) of existing datasets, effectively filtering out incorrect solutions that were previously accepted. Furthermore, the generated adversarial cases prove to be superior training data, boosting the performance of RL-trained models on benchmarks like LiveCodeBench.
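As a rough illustration of the stress-testing strategy and the TNR metric named in the abstract, the following minimal sketch compares a candidate solution against a brute-force oracle on small random inputs and returns the first input on which they disagree. It is not CodeHacker's implementation: the maximum-subarray task, the function names (`reference_max_subarray`, `buggy_max_subarray`, `stress_test`, `true_negative_rate`), and the input ranges are all assumptions chosen for the example.

```python
import random


def reference_max_subarray(a):
    """Brute-force O(n^2) oracle: best sum over all non-empty subarrays."""
    best = a[0]
    for i in range(len(a)):
        running = 0
        for j in range(i, len(a)):
            running += a[j]
            best = max(best, running)
    return best


def buggy_max_subarray(a):
    """Hypothetical submission: a Kadane variant that starts from 0,
    so it silently allows the empty subarray and fails on all-negative input."""
    best = cur = 0
    for x in a:
        cur = max(0, cur + x)
        best = max(best, cur)
    return best


def stress_test(candidate, oracle, trials=1000, seed=0):
    """Return the first small random input where candidate and oracle disagree,
    or None if no disagreement is found within the given number of trials."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 5)
        a = [rng.randint(-5, 5) for _ in range(n)]
        if candidate(a) != oracle(a):
            return a
    return None


def true_negative_rate(verdicts):
    """verdicts: iterable of (is_actually_correct, passed_all_tests) pairs.
    TNR = incorrect solutions rejected / all incorrect solutions."""
    incorrect = [passed for correct, passed in verdicts if not correct]
    if not incorrect:
        return None  # undefined when there are no incorrect solutions
    rejected = sum(1 for passed in incorrect if not passed)
    return rejected / len(incorrect)


if __name__ == "__main__":
    hack = stress_test(buggy_max_subarray, reference_max_subarray)
    print("counterexample:", hack)
```

Small inputs are used deliberately: a short counterexample is easier to inspect, and the quadratic oracle stays cheap. Adding such a counterexample to a test suite turns a previously accepted incorrect solution into a rejected one, which is what raises the TNR.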