Large Language Models (LLMs) have shown promising results in automatic code generation, improving coding efficiency to a certain extent. However, generating high-quality and reliable code remains a formidable task because LLMs lack good programming practice, especially in exception handling. In this paper, we first conduct an empirical study and summarise three crucial challenges LLMs face in exception handling: incomplete exception handling, incorrect exception handling, and abuse of try-catch. We then try prompts of different granularities to address these challenges and find that fine-grained knowledge-driven prompts work best. Based on our empirical study, we propose a novel Knowledge-driven Prompt Chaining-based code generation approach, named KPC, which decomposes code generation into an AI chain with iterative check-rewrite steps and chains fine-grained knowledge-driven prompts to help LLMs consider exception-handling specifications. We evaluate our KPC-based approach on 3,079 code generation tasks extracted from the official Java API documentation. Extensive experimental results demonstrate that the KPC-based approach has considerable potential to improve the quality of code generated by LLMs. By handling exceptions more proficiently, it achieves improvements of 109.86% and 578.57% under static evaluation methods, as well as a reduction of 18 runtime bugs in the sampled dataset under dynamic validation.
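To make the idea of an iterative check-rewrite chain more concrete, the following is a minimal sketch in Java. It is not the authors' KPC implementation: the class `CheckRewriteChain`, the methods `toyChecker` and `toyRewriter`, and the `maxRounds` cut-off are all hypothetical names and assumptions introduced here for illustration. In an actual prompt chain, the check and rewrite steps would presumably be LLM calls driven by knowledge from the API documentation; here they are replaced by simple string-based stand-ins that only know one documented fact, namely that `FileReader(String)` may throw `FileNotFoundException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical sketch of a check-rewrite loop; NOT the authors' KPC implementation. */
final class CheckRewriteChain {

    /**
     * Repeatedly checks the code and rewrites it until the checker reports
     * no issues or maxRounds (an assumed safety limit) is reached.
     */
    static String run(String draft,
                      Function<String, List<String>> checker,
                      BiFunction<String, List<String>, String> rewriter,
                      int maxRounds) {
        String code = draft;
        for (int round = 0; round < maxRounds; round++) {
            List<String> issues = checker.apply(code);
            if (issues.isEmpty()) {
                break; // nothing left to fix, so stop iterating
            }
            code = rewriter.apply(code, issues);
        }
        return code;
    }

    /**
     * Toy stand-in for a knowledge-driven check step (assumption: a real
     * chain would query an LLM with exception knowledge from the API docs).
     */
    static List<String> toyChecker(String code) {
        List<String> issues = new ArrayList<>();
        boolean opensFile = code.contains("new FileReader(");
        boolean handled = code.contains("catch (FileNotFoundException");
        if (opensFile && !handled) {
            issues.add("FileReader(String) may throw FileNotFoundException, which is not handled");
        }
        return issues;
    }

    /**
     * Toy stand-in for a rewrite step: wraps the code in a try-catch for the
     * specific exception. This toy ignores the issue texts; a real rewrite
     * prompt would presumably include them.
     */
    static String toyRewriter(String code, List<String> issues) {
        return "try {\n"
                + "    " + code + "\n"
                + "} catch (FileNotFoundException e) {\n"
                + "    System.err.println(\"Cannot open file: \" + e.getMessage());\n"
                + "}";
    }
}
```

Calling `CheckRewriteChain.run(draft, CheckRewriteChain::toyChecker, CheckRewriteChain::toyRewriter, 3)` on a draft such as `FileReader reader = new FileReader(path);` yields the draft wrapped in a `try` block with a specific `catch (FileNotFoundException e)` clause, while code that the checker already accepts is returned unchanged. Catching the specific exception rather than a generic `Exception` is a deliberate choice in this sketch, since the abstract lists abuse of try-catch among the challenges.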