Artificial Intelligence-Generated Content (AIGC) has made significant strides, with high-resolution text-to-image (T2I) generation becoming increasingly critical for improving users' Quality of Experience (QoE). Although resource-constrained edge computing adequately supports fast low-resolution T2I generation, achieving high-resolution output still poses the challenge of preserving image fidelity without incurring prohibitive latency. To address this, we first investigate the performance of super-resolution (SR) methods for image enhancement, confirming a fundamental trade-off: lightweight learning-based SR struggles to recover fine details, while diffusion-based SR achieves higher fidelity at substantial computational cost. Motivated by these observations, we propose an end-edge collaborative generation-enhancement framework. Upon receiving a T2I generation task, the system first generates a low-resolution image at the edge side based on adaptively selected denoising steps and super-resolution scales; the image is then partitioned into patches and processed by a region-aware hybrid SR policy. This policy applies a diffusion-based SR model to foreground patches for detail recovery and a lightweight learning-based SR model to background patches for efficient upscaling, ultimately stitching the enhanced patches into the high-resolution image. Experiments show that our system reduces service latency by 33% compared with baselines while maintaining competitive image quality.
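The region-aware hybrid SR policy can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two SR models are replaced by placeholder upscalers, and `is_foreground` uses a hypothetical variance-based saliency proxy in place of whatever region detector the framework actually employs.

```python
import numpy as np

def lightweight_sr(patch, scale):
    # Placeholder for the lightweight learning-based SR model
    # (nearest-neighbor upscaling stands in for the real network).
    return np.repeat(np.repeat(patch, scale, axis=0), scale, axis=1)

def diffusion_sr(patch, scale):
    # Placeholder for the diffusion-based SR model; in the real system
    # this would be far more expensive but recover finer detail.
    return np.repeat(np.repeat(patch, scale, axis=0), scale, axis=1)

def is_foreground(patch, threshold=0.05):
    # Hypothetical saliency proxy: treat detail-rich (high-variance)
    # patches as foreground. The paper's actual criterion may differ.
    return patch.var() > threshold

def hybrid_sr(image, patch_size=64, scale=4):
    """Partition a low-resolution image into patches, route each patch
    to the diffusion-based or lightweight SR path, and stitch the
    enhanced patches back into one high-resolution image."""
    h, w = image.shape[:2]
    out = np.zeros((h * scale, w * scale) + image.shape[2:],
                   dtype=image.dtype)
    for y in range(0, h, patch_size):
        for x in range(0, w, patch_size):
            patch = image[y:y + patch_size, x:x + patch_size]
            sr = diffusion_sr if is_foreground(patch) else lightweight_sr
            out[y * scale:(y + patch.shape[0]) * scale,
                x * scale:(x + patch.shape[1]) * scale] = sr(patch, scale)
    return out
```

Routing per patch rather than per image is what lets the expensive diffusion model run only where detail recovery matters, which is the source of the latency savings the abstract reports.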