Carbon-Taxed Transformers: A Green Compression Pipeline for Overgrown Language Models

The accelerating adoption of Large Language Models (LLMs) in software engineering (SE) has brought with it a silent crisis: unsustainable computational cost. While these models demonstrate remarkable capabilities across SE tasks, they are unmanageably large, slow to deploy, memory-intensive, and carbon-heavy. This threatens not only the scalability and accessibility of AI-powered SE, but also its long-term environmental sustainability. The research challenge is clear: we must go beyond accuracy and treat efficiency and environmental cost as first-class design constraints. To meet this challenge, we introduce Carbon-Taxed Transformers (CTT), a systematic, principled compression pipeline for multiple model architectures, inspired by economic carbon taxation. Drawing on the concept of carbon pricing, CTT operationalizes a computational carbon tax that penalizes architectural inefficiencies and rewards deployment-ready compression. We evaluate CTT on three core SE tasks (code clone detection, code summarization, and code generation) with models spanning encoder-only, encoder-decoder, and decoder-only architectures. Our results show that, at inference time, CTT delivers: (1) up to 49x memory reduction; (2) time reductions of up to 8-10x for clone detection, up to 3x for summarization, and 4-7x for generation; and (3) up to 81% reduction in CO2 emissions; while (4) retaining around 98% accuracy on clone detection, around 89% on summarization, and up to 91% (textual metrics) and 68% (pass@1) on generation. Two ablation studies show that both the ordering of the pipeline and the contribution of each individual component are essential, providing empirical justification for CTT's design and effectiveness. This work establishes a viable path toward responsible AI in SE through aggressive yet performance-preserving compression.
