AI-driven computer vision applications require an extensive database to ensure predictable behavior and performance. Such predictability is especially important in industrial applications for gaining user trust. However, such a database is not readily available in industrial settings, and acquiring it is not trivial either. Active learning methods can be applied to ramp up data within a project deployment, iteratively growing the database and thus the application's predictability. Unfortunately, we observe that this often leads to a loss of user trust in the application, which is difficult to regain once lost. The result is a "chicken-and-egg" dilemma in which neither the database nor the application is developed. In this work, we review state-of-the-art methods and approaches to further boost the database in the initial active data ramp-up phase. Here, we focus on recent advancements in GenAI-based data generation and augmentation methods and review their adaptability to an industrial computer vision classification use case. Although we observe a potential for automatic data ramp-up, we also see a domain mismatch between the source (training environment) and the target (industrial use case), regarding both the context defined in natural language and the object characteristics.