We propose the Zero-Error Horizon (ZEH) for trustworthy LLMs: the maximum range of inputs over which a model solves a task without a single error. While ZEH itself is simple, we demonstrate that evaluating the ZEH of state-of-the-art LLMs yields rich insights. For example, evaluating the ZEH of GPT-5.2 reveals that it cannot even compute the parity of a bit string as short as 11000, nor determine whether the parentheses in ((((()))))) are balanced. This is surprising given GPT-5.2's otherwise excellent capabilities, and the fact that LLMs err on such simple problems is an important caution for applying them in safety-critical domains. Applying ZEH to Qwen2.5 and analyzing the results in detail, we find that although ZEH correlates with accuracy, the two metrics behave differently in detail, and ZEH offers clues about the emergence of algorithmic capabilities. Finally, while computing ZEH is computationally expensive, we discuss how to mitigate this cost, achieving up to an order-of-magnitude speedup with tree structures and online softmax.
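As a sanity check, both toy tasks from the abstract have trivial exact algorithms. The sketch below (plain Python; function names are illustrative, not from the paper) shows what a zero-error solver looks like for each.

```python
# Minimal reference solvers for the two toy tasks in the abstract.
# Names and structure are illustrative, not taken from the paper.

def parity(bits: str) -> int:
    """Parity of a bit string: 1 if the number of 1s is odd, else 0."""
    return bits.count("1") % 2

def balanced(s: str) -> bool:
    """True iff the parentheses in s are balanced."""
    depth = 0
    for c in s:
        depth += 1 if c == "(" else -1
        if depth < 0:          # a ')' with no matching '('
            return False
    return depth == 0

print(parity("11000"))               # -> 0 (two 1s, so even parity)
print(balanced("(" * 5 + ")" * 6))   # the abstract's string: -> False
```

Both solvers run in linear time and constant extra space, which is what makes errors on such inputs notable.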
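The abstract names online softmax as one ingredient of the speedup. Independent of this paper, online softmax is a known single-pass reformulation that maintains a running maximum and a rescaled running sum; the sketch below is a generic illustration of that technique, not the paper's implementation.

```python
import math

def online_softmax(xs):
    """Softmax in one streaming pass: keep the running maximum m and the
    running sum s of exp(x - m), rescaling s whenever m increases.
    A generic sketch of the known technique, not this paper's code."""
    m, s = float("-inf"), 0.0
    for x in xs:
        if x > m:
            s *= math.exp(m - x)   # rescale the old sum to the new maximum
            m = x
        s += math.exp(x - m)
    return [math.exp(x - m) / s for x in xs]

print(online_softmax([1.0, 2.0, 3.0]))  # ~ [0.090, 0.245, 0.665]
```

The single pass avoids the separate max-finding and normalization sweeps of the textbook formulation while remaining numerically stable, which is why it is attractive when softmax must be evaluated many times.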