In this work, we explore idiomatic language processing with Large Language Models (LLMs). We introduce the Idiomatic language Test Suite (IdioTS), a new dataset of difficult examples designed by language experts to assess how well LLMs process figurative language at the sentence level. We propose a comprehensive evaluation methodology based on an idiom detection task, in which LLMs are prompted to detect an idiomatic expression in a given English sentence. We present a thorough automatic and manual evaluation of the results, together with an extensive error analysis.