Manufacturing wafers is an intricate task involving thousands of steps. Defect Pattern Recognition (DPR) of wafer maps is crucial for determining the root cause of production defects, which in turn can provide insight for improving yield in wafer foundries. During manufacturing, defects may appear on a wafer individually or in various combinations. Identifying multiple defects on a wafer is generally harder than identifying a single defect. Recently, deep learning methods have gained significant traction in mixed-type DPR. However, the complexity of the defects calls for large, complex models, which are very difficult to run on the low-memory embedded devices typically used in fabrication labs. Another common issue is the lack of labeled data for training complex networks. In this work, we propose an unsupervised training routine that distills the knowledge of complex pre-trained models into lightweight, deployment-ready models. We empirically show that this type of training compresses the model without sacrificing accuracy, even though the compressed model is up to 10 times smaller than the teacher model. The compressed model also outperforms contemporary state-of-the-art models.
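The idea of distilling a pre-trained teacher into a smaller student without labels can be illustrated with a minimal sketch. It assumes the classic soft-target formulation, in which the student is trained to match the teacher's temperature-softened output distribution through a KL-divergence term; the abstract does not state which loss the proposed routine uses, so the function names `log_softmax` and `distillation_loss`, the `temperature` parameter, and the example logits below are all hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs; no ground-truth label is used.

    Assumption: soft-target distillation with a T^2 scaling factor, chosen
    here for illustration only and not taken from the paper.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher soft targets
    log_q = log_softmax(student_logits, temperature)  # student predictions
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


# Hypothetical logits for one unlabeled wafer map over three defect classes.
teacher = [2.0, 0.0, -1.0]
far_student = [0.0, 0.0, 0.0]
near_student = [1.9, 0.1, -1.0]

loss_far = distillation_loss(teacher, far_student)
loss_near = distillation_loss(teacher, near_student)
```

Because the only training signal is the teacher's output, the loss can be evaluated on wafer maps that have no annotation, and it shrinks as the student's predictions approach the teacher's.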