Many knowledge workers use spreadsheets in business, accounting, and finance. However, the lack of systematic documentation methods for spreadsheets hinders automation, collaboration, and knowledge transfer, and risks the loss of crucial institutional knowledge. This paper introduces Spreadsheet Operations Documentation (SOD), an AI task that involves generating human-readable explanations of spreadsheet operations. Many previous studies have used Large Language Models (LLMs) to generate spreadsheet manipulation code; translating that code into natural language, as SOD requires, is less explored. To address this, we present a benchmark of 111 spreadsheet manipulation code snippets, each paired with a corresponding natural language summary. We evaluate five LLMs (GPT-4o, GPT-4o-mini, LLaMA-3.3-70B, Mixtral-8x7B, and Gemma2-9B) using the BLEU, GLEU, ROUGE-L, and METEOR metrics. Our findings suggest that LLMs can generate accurate spreadsheet documentation, making SOD a feasible prerequisite step toward enhancing reproducibility, maintainability, and collaborative workflows in spreadsheets, although challenges remain to be addressed.
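To make the evaluation setup concrete, the following is a minimal sketch of how one of the listed metrics, ROUGE-L, can score a generated summary against a reference summary. It is an illustration under stated assumptions, not the evaluation code used for the benchmark: the lowercasing and whitespace tokenization, the function names `lcs_length` and `rouge_l`, the default `beta`, and the two example summaries are all hypothetical choices made here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-score from LCS-based precision and recall.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (recall + b2 * precision)


# Hypothetical reference summary and model output for one code snippet.
reference = "Sum the values in column B and write the total to cell C1"
generated = "Write the sum of column B to cell C1"
score = rouge_l(generated, reference)
print(f"ROUGE-L F-score: {score:.3f}")
```

Because ROUGE-L rewards only in-order token overlap, a summary that is correct but worded differently from the reference can still receive a low score.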