Given limited and costly computational infrastructure, resource efficiency is a key requirement for large language models (LLMs). Efficient LLMs increase service capacity for providers and reduce latency and API costs for users. Recently emerging resource-consumption threats induce excessive generation, degrading model efficiency and harming both service availability and economic sustainability. This survey presents a systematic review of resource-consumption threats to LLMs. We establish a unified view of this emerging area by clarifying its scope and examining the problem along the full pipeline, from threat induction through mechanism understanding to mitigation, thereby providing a clearer foundation for characterizing and countering such threats.