By reading documentation provided in context, tool-using language models can dynamically extend their capabilities with external tools. The cost is that lengthy documentation must be fed in every time the model uses a tool, occupying the input window and slowing down decoding. Given the progress in general-purpose compression, soft context compression is a suitable approach to alleviating this problem. However, when compressing tool documentation, existing methods suffer from two weaknesses: loss of key information (specifically, tool/parameter name errors) and difficulty in adjusting the length of the compressed sequence to the documentation length. To address these problems, we propose two strategies for compressing tool documentation into concise yet precise summary sequences for tool-using language models. 1) A selective compression strategy mitigates key information loss by deliberately retaining key information as raw text tokens. 2) A block compression strategy divides tool documentation into short chunks and then employs a fixed-length compression model on each chunk, achieving variable-length compression overall and allowing flexible adjustment of the compression ratio. Results on API-Bank and APIBench show that our approach reaches performance comparable to the upper-bound baseline at up to a 16x compression ratio.
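The two strategies above can be sketched in plain Python. This is a hedged illustration, not the paper's implementation: the chunk size, slots-per-chunk count, and the identifier-like regex used to pick "key" tokens are all illustrative assumptions, and a real system would replace the length arithmetic with a learned fixed-length compressor applied per chunk.

```python
import math
import re

# Illustrative hyperparameters (assumptions, not the paper's values).
CHUNK_TOKENS = 64      # assumed chunk length fed to the compressor
SLOTS_PER_CHUNK = 4    # assumed fixed-length output per chunk (16x ratio)

def select_key_tokens(doc_tokens, key_pattern=r"^[A-Za-z_][A-Za-z0-9_]*$"):
    """Selective compression (sketch): split tokens into identifier-like
    ones (candidate tool/parameter names) kept as raw text, and the rest,
    which would be handed to the soft compressor."""
    keep = [t for t in doc_tokens if re.match(key_pattern, t)]
    rest = [t for t in doc_tokens if not re.match(key_pattern, t)]
    return keep, rest

def block_compressed_length(num_doc_tokens):
    """Block compression (sketch): chunking the documentation and
    compressing each chunk to a fixed number of slots makes the total
    compressed length scale with the documentation length."""
    n_chunks = math.ceil(num_doc_tokens / CHUNK_TOKENS)
    return n_chunks * SLOTS_PER_CHUNK
```

For example, a 64-token document compresses to one 4-slot block, while a 130-token document spans three chunks and compresses to 12 slots, so longer tool docs automatically receive longer summary sequences at the same per-chunk ratio.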