Although Large Language Models (LLMs) have demonstrated proficiency in complex reasoning tasks, their performance in dynamic, interactive, and competitive scenarios, such as business strategy and stock market analysis, remains underexplored. To bridge this gap, we formally explore the dynamic reasoning capabilities of LLMs for decision-making in rapidly evolving environments. We introduce two game theory-based pilot challenges that mirror the complexities of real-world dynamic decision-making. Both challenges are well-defined, enabling clear, controllable, and precise evaluation of LLMs' dynamic reasoning abilities. Through extensive experiments, we find that existing reasoning methods tend to falter in dynamic settings that require k-level thinking, a key concept not tackled in previous work. To remedy this, we propose a novel reasoning approach for LLMs, named "K-Level Reasoning". This approach adopts the perspective of rivals and recursively applies k-level thinking to the available historical information, which significantly improves the accuracy of predicting rivals' subsequent moves and informs more strategic decision-making. This research not only establishes a robust quantitative benchmark for assessing dynamic reasoning but also markedly enhances the proficiency of LLMs in dynamic contexts.
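To make the recursion behind k-level thinking concrete, the following is a minimal sketch, not the method proposed in this work. It assumes a hypothetical "guess p times the average" game, which the abstract does not name as one of its challenges. It also assumes a large population, so that a player's own guess has a negligible effect on the average. The function name `level_k_guess` and the parameters `p` and `level0_guess` are illustrative assumptions.

```python
def level_k_guess(k: int, p: float = 0.8, level0_guess: float = 50.0) -> float:
    """Return the guess of a level-k player in a 'guess p times the average' game.

    Assumption (illustrative only): a level-0 player makes a fixed naive guess,
    and a level-k player believes every rival reasons at level k-1.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        # Base case: no modelling of rivals at all.
        return level0_guess
    # Recursive step: take the rivals' perspective one level down.
    # If every rival is level k-1, their average equals their common guess.
    predicted_rival_average = level_k_guess(k - 1, p, level0_guess)
    # Best response to that prediction: target p times the predicted average.
    return p * predicted_rival_average


if __name__ == "__main__":
    for level in range(4):
        print(level, round(level_k_guess(level), 2))
```

Under these assumptions each additional level of reasoning scales the guess by `p`, so deeper thinkers move progressively closer to zero. In a repeated setting, the fixed `level0_guess` could instead be derived from rivals' observed past moves, which is one way to read "based on available historical information"; that substitution is likewise an assumption rather than something the abstract specifies.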