Symbols (or, more broadly, non-natural-language textual representations) such as numerical sequences, molecular formulas, and table delimiters are widespread and play important roles in tasks such as abstract reasoning, chemical property prediction, and table question answering. Despite the impressive natural language comprehension of large language models (LLMs), their ability to reason over symbols remains inadequate, which could be attributed to the difference between symbol representations and general natural language. We propose symbol-to-language (S2L), a tuning-free method that enables LLMs to solve symbol-related problems using information expressed in natural language. Specifically, S2L first converts the symbols involved into language-based representations, either by prompting LLMs or by leveraging external tools. These language-based representations are then integrated into the original problem via direct substitution or concatenation, serving as useful input information for LLMs. We evaluate S2L using both API-based (GPT-4, ChatGPT) and open-source (OpenChat) models on eight symbol-related tasks, ranging from symbol-only abstract reasoning to sentiment analysis in social media. Experimental results show that S2L consistently leads to superior performance. For example, applying S2L to GPT-4 yields significant average improvements of +21.9% and +9.5% on subtasks in 1D-ARC and Dyck language, respectively. Code and data are available at https://github.com/THUNLP-MT/symbol2language.
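The two S2L steps described above (symbol-to-language conversion, then integration by substitution or concatenation) can be illustrated with a minimal sketch. It is not the released implementation: the bracket-to-word table, the function names `brackets_to_language` and `s2l_prompt`, and the prompt wording are all hypothetical, and a simple rule-based converter stands in for the LLM prompt or external tool that would perform the conversion in practice.

```python
from typing import Callable

# Hypothetical symbol-to-word table for bracket sequences (an assumption for
# illustration, not the mapping used in the paper).
BRACKET_WORDS = {
    "(": "open round bracket",
    ")": "close round bracket",
    "[": "open square bracket",
    "]": "close square bracket",
}


def brackets_to_language(symbols: str) -> str:
    """Rule-based converter standing in for an LLM prompt or external tool."""
    words = [BRACKET_WORDS[ch] for ch in symbols if ch in BRACKET_WORDS]
    return ", ".join(words)


def s2l_prompt(
    question: str,
    symbols: str,
    convert: Callable[[str], str],
    mode: str = "concat",
) -> str:
    """Build an LLM input that carries a language-based view of the symbols.

    mode="substitute": replace the symbols in the question with the description.
    mode="concat": keep the original question and append the description.
    """
    description = convert(symbols)
    if mode == "substitute":
        return question.replace(symbols, description)
    if mode == "concat":
        return f"{question}\nDescription of the symbols: {description}"
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    question = "Complete the sequence: ( ["
    print(s2l_prompt(question, "( [", brackets_to_language, mode="substitute"))
    print(s2l_prompt(question, "( [", brackets_to_language, mode="concat"))
```

Substitution hands the model only the language-based form, whereas concatenation keeps the original symbols alongside their description; which one is preferable presumably depends on the task.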