Despite the growing use of large language models (LLMs) to provide feedback, little research has explored how to achieve high-quality feedback. This case study introduces an evaluation framework for assessing different zero-shot prompt engineering methods. We systematically varied the prompts and analyzed the resulting feedback on programming errors in R. The results suggest that prompts prescribing a stepwise procedure increase the precision of the feedback, while omitting explicit specifications about which of the provided data to analyze improves error identification.
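For illustration, a minimal sketch of how two such prompt variants might be constructed in R (the prompt wording and the `ask_llm()` helper are assumptions for this sketch, not the study's actual prompts or tooling):

```r
# Hypothetical helper that sends a prompt to an LLM and returns its reply;
# ask_llm() is a placeholder, not an API used in the study.
ask_llm <- function(prompt) {
  stop("placeholder: wire up your LLM client here")
}

# Example erroneous student code to get feedback on (assumed for illustration)
student_code <- 'mean(x, na.rm = TURE)  # typo: TURE instead of TRUE'

# Variant A: stepwise procedure, no specification of which data to analyze
prompt_stepwise <- paste(
  "You are a tutor giving feedback on R code.",
  "Proceed step by step:",
  "1. Read the code.",
  "2. Identify any errors.",
  "3. Explain each error and how to fix it.",
  "Code:", student_code,
  sep = "\n"
)

# Variant B: no stepwise instructions, with an explicit data specification
prompt_plain <- paste(
  "Give feedback on the R code below.",
  "Analyze only the function call on the first line.",
  "Code:", student_code,
  sep = "\n"
)

feedback <- ask_llm(prompt_stepwise)
```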