Because computational infrastructure is limited and costly, resource efficiency is a key requirement for large language models (LLMs). Efficient LLMs increase service capacity for providers and reduce latency and API costs for users. Recent resource consumption threats induce excessive generation, which degrades model efficiency and harms both service availability and economic sustainability. This survey presents a systematic review of resource consumption threats against LLMs. We establish a unified view of this emerging area by clarifying its scope and examining the problem along the full pipeline, from threat induction to mechanism understanding and mitigation. Our goal is to clarify the problem landscape and thereby provide a clearer foundation for characterizing and mitigating these threats.