Code Intelligence Papers - Zhuanzhi

Code Intelligence

Blueprint First, Model Second: A Framework for Deterministic LLM Workflow

Arxiv

0+ reads · June 16

CODA-BENCH: Can Code Agents Handle Data-Intensive Tasks?

Arxiv

0+ reads · June 13

Asuka-Bench: Benchmarking Code Agents on Underspecified User Intent and Multi-Round Refinement

Arxiv

0+ reads · June 4

SecureVibeBench: Benchmarking Secure Vibe Coding of AI Agents via Reconstructing Vulnerability-Introducing Scenarios

Arxiv

0+ reads · June 6

Coherence Collapse: Diagnosing Why Code Agents Fail After Reaching the Right Code

Arxiv

0+ reads · May 26

RepoMirage: Probing Repository Context Reasoning in Code Agents with Perturbations

Arxiv

0+ reads · May 25

CRANE: Constrained Reasoning Injection for Code Agents via Nullspace Editing

Arxiv

0+ reads · June 9

BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?

Arxiv

0+ reads · May 26

On the Road to Personalized Code Intelligence: Portraiting and Assisting Developers Based on Their In-IDE Behaviors

Arxiv

0+ reads · May 28

MoCA-Agent: A Market-of-Claims Code Agent for Financial and Numerical Reasoning

Arxiv

0+ reads · June 10

Is Compression Really Linear with Code Intelligence?

Arxiv

0+ reads · March 26

Large Language Models for Multilingual Code Intelligence: A Survey

Arxiv

0+ reads · April 27

Do Code LLMs Do Static Analysis?

Arxiv

0+ reads · March 26

Automatically Benchmarking LLM Code Agents through Agent-Driven Annotation and Evaluation

Arxiv

0+ reads · March 16

Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

Arxiv

0+ reads · March 3
