In recent years, large language models (LLMs) have shown an impressive ability to perform arithmetic and symbolic reasoning tasks. However, we find that LLMs (e.g., ChatGPT) perform poorly on reasoning that requires multiple rounds of dialogue, especially when solving situation puzzles. Specifically, after several rounds of Q&A, LLMs tend to ask overly detailed questions that focus on a single aspect, or to repeat the same or similar questions. To help LLMs escape this dilemma, we propose a novel external reformulation methodology, in which the situation puzzle is reformulated after several rounds of Q&A or whenever the LLM makes an incorrect guess. Experiments show that our method outperforms directly using LLMs to solve situation puzzles in terms of win rate and the number of question/guess attempts, highlighting the potential of strategic problem reformulation to enhance the reasoning capabilities of LLMs in complex interactive scenarios.
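To make the reformulation trigger concrete, the following is a minimal Python sketch of the loop described above. All names and interfaces (solve_puzzle, llm_ask, host_answer, reformulate, is_correct_guess, rounds_per_block) are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of an external-reformulation loop for situation puzzles (assumed interfaces).
def solve_puzzle(puzzle, llm_ask, host_answer, reformulate, is_correct_guess,
                 rounds_per_block=5, max_rounds=50):
    """llm_ask(puzzle, history) -> {"type": "question"|"guess", "text": str}
    host_answer(question) -> "yes" / "no" / "irrelevant"
    reformulate(puzzle, history) -> restated puzzle text
    is_correct_guess(guess) -> bool"""
    history = []
    current_puzzle = puzzle
    for round_idx in range(1, max_rounds + 1):
        move = llm_ask(current_puzzle, history)  # LLM asks a question or makes a guess
        if move["type"] == "guess":
            if is_correct_guess(move["text"]):
                return {"solved": True, "rounds": round_idx, "history": history}
            # Incorrect guess: trigger an external reformulation of the puzzle.
            current_puzzle = reformulate(current_puzzle, history)
            history.append(("guess", move["text"], "incorrect"))
        else:
            answer = host_answer(move["text"])
            history.append(("question", move["text"], answer))
            # Periodic reformulation after a fixed number of Q&A rounds,
            # intended to keep the LLM from fixating on narrow or repeated questions.
            if round_idx % rounds_per_block == 0:
                current_puzzle = reformulate(current_puzzle, history)
    return {"solved": False, "rounds": max_rounds, "history": history}
```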