The emergence of large language models (LLMs) has substantially influenced natural language processing, with strong results across a wide range of tasks. In this study, we employ ``Introspective Tips'' to help LLMs self-optimize their decision-making. By introspectively examining trajectories, the LLM refines its policy by generating succinct and valuable tips. Our method improves the agent's performance in both few-shot and zero-shot settings by considering three essential scenarios: learning from the agent's past experiences, integrating expert demonstrations, and generalizing across diverse games. Importantly, we achieve these improvements without fine-tuning the LLM's parameters; instead, we adjust the prompt to generalize insights from these three scenarios. Our framework both supports and highlights the advantage of employing LLMs for in-context decision-making. Experiments on over 100 games in TextWorld illustrate the superior performance of our approach.
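The prompt-only loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Trajectory`, `summarize_tips`, `build_prompt`, and `fake_llm`, as well as the prompt wording, are hypothetical, and the LLM is assumed to be any callable that maps a prompt string to a text reply. The sketch only shows the general idea that tips are distilled from trajectories and then prepended to the acting prompt, with no parameter updates.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: the LLM is any callable mapping a prompt string to a text reply.
LLM = Callable[[str], str]


@dataclass
class Trajectory:
    """One played episode: a textual log of steps and whether it succeeded."""
    steps: List[str]
    success: bool


def summarize_tips(llm: LLM, trajectories: List[Trajectory],
                   old_tips: List[str]) -> List[str]:
    """Ask the LLM to reflect on trajectories and return one tip per line."""
    lines = ["Review the trajectories below and write short tips for future games."]
    if old_tips:
        lines.append("Previous tips:")
        lines += [f"- {tip}" for tip in old_tips]
    for i, traj in enumerate(trajectories, 1):
        outcome = "success" if traj.success else "failure"
        lines.append(f"Trajectory {i} ({outcome}):")
        lines += traj.steps
    reply = llm("\n".join(lines))
    # Drop blank lines and leading bullet markers from the reply.
    return [ln.lstrip("- ").strip() for ln in reply.splitlines() if ln.strip()]


def build_prompt(tips: List[str], observation: str) -> str:
    """Prepend the current tips to the acting prompt; no weights are changed."""
    parts: List[str] = []
    if tips:
        parts.append("Tips:")
        parts += [f"- {tip}" for tip in tips]
    parts.append(f"Observation: {observation}")
    parts.append("Action:")
    return "\n".join(parts)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    return "- Check the recipe first.\n- Open the fridge before taking food."


failed = Trajectory(steps=["> take apple", "You can't see any apple."], success=False)
tips = summarize_tips(fake_llm, [failed], old_tips=[])
prompt = build_prompt(tips, "You are in the kitchen.")
```

In this sketch the three scenarios differ only in which trajectories are passed to `summarize_tips`: the agent's own past episodes, expert demonstrations, or episodes collected across several games.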