This paper presents ChatDBG, the first AI-powered debugging assistant. ChatDBG integrates large language models (LLMs) to significantly enhance the capabilities and user-friendliness of conventional debuggers. ChatDBG lets programmers engage in a collaborative dialogue with the debugger, allowing them to pose complex questions about program state, perform root cause analysis for crashes or assertion failures, and explore open-ended queries like "why is x null?". To handle these queries, ChatDBG grants the LLM autonomy to take the wheel and drive debugging by issuing commands to navigate through stacks and inspect program state; it then reports its findings and yields back control to the programmer. Our ChatDBG prototype integrates with standard debuggers including LLDB, GDB, and WinDBG for native code and Pdb for Python. Our evaluation across a diverse set of code, including C/C++ code with known bugs and a suite of Python code including standalone scripts and Jupyter notebooks, demonstrates that ChatDBG can successfully analyze root causes, explain bugs, and generate accurate fixes for a wide range of real-world errors. For the Python programs, a single query led to an actionable bug fix 67% of the time; one additional follow-up query increased the success rate to 85%. ChatDBG has seen rapid uptake; it has already been downloaded nearly 30,000 times.
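The "take the wheel" interaction described above can be pictured as a loop: the model proposes a debugger command, the debugger runs it, the output is appended to the conversation, and the loop ends when the model reports an answer. The following is a minimal sketch of that loop under stated assumptions, not ChatDBG's actual implementation. `ToyDebugger`, `drive`, and `scripted_policy` are hypothetical names introduced for illustration; the "debugger" is a recorded stack of frames rather than LLDB, GDB, WinDBG, or Pdb, and the "model" is a fixed script rather than an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyDebugger:
    """Hypothetical stand-in for a real debugger: a recorded call stack."""
    frames: List[Tuple[str, Dict[str, object]]]  # innermost frame first
    index: int = 0  # currently selected frame

    def run(self, command: str) -> str:
        """Execute one command ('up' or 'p <name>') and return its output."""
        parts = command.split()
        if parts == ["up"]:  # move toward the caller
            if self.index + 1 >= len(self.frames):
                return "error: already at outermost frame"
            self.index += 1
            return f"frame {self.frames[self.index][0]}"
        if len(parts) == 2 and parts[0] == "p":  # print a local variable
            local_vars = self.frames[self.index][1]
            if parts[1] not in local_vars:
                return f"error: {parts[1]} is not defined"
            return repr(local_vars[parts[1]])
        return f"error: unknown command {command!r}"


# A policy maps the transcript so far to the next action. In a real system
# this would be an LLM call; here it is an assumption-laden placeholder.
Policy = Callable[[List[str]], str]


def drive(debugger: ToyDebugger, policy: Policy, question: str,
          max_steps: int = 8) -> List[str]:
    """Let the policy issue commands until it answers or the budget runs out."""
    transcript = [f"question: {question}"]
    for _ in range(max_steps):
        action = policy(transcript)
        transcript.append(action if action.startswith("answer:")
                          else f"(dbg) {action}")
        if action.startswith("answer:"):
            break  # control goes back to the programmer
        transcript.append(debugger.run(action))
    return transcript


def scripted_policy(transcript: List[str]) -> str:
    """Fixed script standing in for the model's choices (illustrative only)."""
    script = ["p x", "up", "p config"]
    issued = sum(line.startswith("(dbg) ") for line in transcript)
    if issued < len(script):
        return script[issued]
    return "answer: x is None because the caller's config has no entry for it"


stack = [("lookup", {"x": None}), ("main", {"config": {}})]
for line in drive(ToyDebugger(stack), scripted_policy, "why is x None?"):
    print(line)
```

The `max_steps` bound is a design choice of this sketch: it keeps an autonomous loop from running indefinitely, so control always returns to the programmer even if the policy never produces an answer.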