Scalable Concept Extraction in Industry 4.0

Industry 4.0 leverages digital technologies and machine learning techniques to connect and optimize manufacturing processes. Central to this idea is the ability to transform raw data into human-understandable knowledge for reliable data-driven decision-making. Convolutional Neural Networks (CNNs) have been instrumental in processing image data, yet their ``black box'' nature complicates the understanding of their prediction process. In this context, recent advances in the field of eXplainable Artificial Intelligence (XAI) have proposed the extraction and localization of concepts, that is, of the visual cues that intervene in the prediction process of CNNs. This paper tackles the application of concept extraction (CE) methods to Industry 4.0 scenarios. To this end, we modify a recently developed technique, ``Extracting Concepts with Local Aggregated Descriptors'' (ECLAD), to improve its scalability. Specifically, we propose a novel procedure for calculating concept importance that uses a wrapper function designed for CNNs and aims to decrease the number of times each image needs to be evaluated. Subsequently, we demonstrate the potential of CE methods by applying them to three representative industrial use cases in the context of quality control for material design (tailored textiles), manufacturing (carbon fiber reinforcement), and maintenance (photovoltaic module inspection). In these examples, CE successfully extracted and localized concepts directly related to each task. That is, the visual cues related to each concept coincided with those that human experts would use to perform the task themselves, even when the visual cues were entangled between multiple classes. Through empirical results, we show that CE can be applied to understand CNNs in an industrial context, yielding useful insights that can be related to domain knowledge.
