Self-correction in text-to-SQL is the process of prompting a large language model (LLM) to revise its previously generated, incorrect SQL. It commonly relies on self-correction guidelines crafted manually by human experts, which are not only labor-intensive to produce but also limited by human ability to identify all potential error patterns in LLM responses. We introduce MAGIC, a novel multi-agent method that automates the creation of the self-correction guideline. MAGIC uses three specialized agents: a manager agent, a correction agent, and a feedback agent. These agents collaborate on the failures of an LLM-based method on the training set to iteratively generate and refine a self-correction guideline tailored to LLM mistakes, mirroring the human process but without human involvement. Our extensive experiments show that MAGIC's guideline outperforms ones created by human experts. We empirically find that the guideline produced by MAGIC enhances the interpretability of the corrections made, providing insights into why LLMs fail or succeed at self-correction. We make all agent interactions publicly available to the research community to foster further research in this area, offering a synthetic dataset for future explorations into automatic self-correction guideline generation.
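The manager/correction/feedback loop described above can be sketched as follows. This is a minimal, runnable illustration, not the authors' implementation: the three "agents" here are trivial rule-based stand-ins (the real method prompts an LLM at each step), and all function names and data fields (`feedback_agent`, `wrong_sql`, `gold_sql`, etc.) are hypothetical.

```python
# Hypothetical sketch of a MAGIC-style guideline-generation loop.
# Each agent below is a stub standing in for an LLM call.

def feedback_agent(failure, attempt):
    """Explain why the attempted SQL is wrong (stub)."""
    return f"expected `{failure['gold_sql']}`, got `{attempt}`"

def correction_agent(failure, attempt, feedback):
    """Revise the SQL given the feedback (stub: returns the gold SQL)."""
    return failure["gold_sql"]  # a real agent would edit `attempt` via an LLM

def manager_agent(failure, feedback):
    """Distill a successful correction into a reusable guideline rule (stub)."""
    return f"If the error resembles '{feedback}', re-check the schema."

def build_guideline(failures, max_rounds=3):
    """Iterate over training-set failures; each solved failure yields a rule."""
    guideline = []
    for failure in failures:
        attempt = failure["wrong_sql"]
        for _ in range(max_rounds):
            fb = feedback_agent(failure, attempt)
            attempt = correction_agent(failure, attempt, fb)
            if attempt == failure["gold_sql"]:  # correction succeeded
                guideline.append(manager_agent(failure, fb))
                break
    return guideline

failures = [{"wrong_sql": "SELECT nam FROM users",
             "gold_sql": "SELECT name FROM users"}]
print(build_guideline(failures))
```

The accumulated `guideline` would then be injected into the self-correction prompt at inference time in place of a hand-written one.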