In this work, we propose a metric called Number of Thoughts (NofT) to estimate task difficulty prior to prompting and to support Large Language Models (LLMs) in production contexts. By setting thresholds on the number of thoughts, this metric can discern the difficulty of prompts and support more effective prompt routing. A 2% decrease in latency is achieved when routing prompts from the MathInstruct dataset through quantized, distilled versions of DeepSeek with 1.7 billion, 7 billion, and 14 billion parameters. Moreover, the metric can be used to detect adversarial prompts used in prompt injection attacks: a classifier informed by the Number of Thoughts achieves 95% accuracy in adversarial prompt detection. Our experiments and datasets are available on our GitHub page: https://github.com/rymarinelli/Number_Of_Thoughts/tree/main.