This paper presents a novel method for improving the reliability of image classification models deployed on hardware subject to transient errors. We investigate initializing the classification layer with enriched text embeddings, derived from GPT-3 with per-class question prompts and the pretrained CLIP text encoder. Across various architectures, our approach achieves a $5.5\times$ average increase in hardware reliability (up to $14\times$) in the most critical layer, with a minimal accuracy drop ($0.3\%$ on average) compared to baseline PyTorch models. Furthermore, our method integrates with any image classification backbone, is evaluated across various network architectures, reduces parameter and FLOPs overhead, and follows a consistent training recipe. This research offers a practical and efficient way to strengthen image classification models against hardware failures, with potential implications for future studies in this domain. Our code and models are released at https://github.com/TalalWasim/TextGuidedResilience.
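To make the central idea concrete, the following is a minimal, framework-free sketch of initializing a classification layer from per-class text embeddings. It is not the released implementation: the random vectors stand in for CLIP text embeddings of GPT-3 class descriptions, and the L2 normalization, the cosine-similarity logits, and the helper names (`l2_normalize`, `init_classifier_from_text`, `classify`) are assumptions made purely for illustration.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def init_classifier_from_text(class_embeddings):
    """Build a weight matrix with one row per class from text embeddings.

    Assumption: each row is the unit-normalized text embedding of that class.
    """
    return [l2_normalize(e) for e in class_embeddings]


def classify(weights, image_feature):
    """Return (predicted class index, logits) using cosine similarity."""
    feat = l2_normalize(image_feature)
    logits = [sum(w * f for w, f in zip(row, feat)) for row in weights]
    return logits.index(max(logits)), logits


# Hypothetical stand-ins for CLIP text embeddings (3 classes, 8 dimensions).
rng = random.Random(0)
EMBED_DIM = 8
text_embeddings = [
    [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(3)
]
weights = init_classifier_from_text(text_embeddings)

# A hypothetical image feature lying close to the text embedding of class 1.
image_feature = [x + 0.01 * rng.gauss(0.0, 1.0) for x in text_embeddings[1]]
predicted, logits = classify(weights, image_feature)
```

In a real pipeline the rows of `weights` would come from the text encoder and would be loaded into the final linear layer of the backbone before training. Whether that layer is then fine-tuned or kept fixed is not stated in the abstract, so the sketch leaves that choice open.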