Simulation of microstructures and machine learning

From arXiv. Preprint of: K. Schladitz, C. Redenbach, T. Barisin, C. Jung, N. Jeziorski, L. Bosnar, J. Fulir, P. Gospodnetić: Simulation of Microstructures and Machine Learning, published in Continuum Models and Discrete Systems by F. Willot, J. Dirrenberger, S. Forest, D. Jeulin, A.V. Cherkaev (eds), 2024, Springer Cham. The final version is available at https://doi.org/10.1007/978-3-031-58665-1

Machine learning offers attractive solutions to challenging image processing tasks. Tedious development and parametrization of algorithmic solutions can be replaced by training a convolutional neural network or a random forest with a high potential to generalize. However, machine learning methods rely on huge amounts of representative image data along with a ground truth, usually obtained by manual annotation. Thus, limited availability of training data is a critical bottleneck. We discuss two use cases: optical quality control in industrial production and segmenting crack structures in 3D images of concrete. For optical quality control, all defect types have to be trained but are typically not evenly represented in the training data. Additionally, manual annotation is costly and often inconsistent. In the second case, segmenting crack systems in 3D images of concrete, manual annotation is nearly impossible. Synthetic images, generated from realizations of stochastic geometry models, offer an elegant way out. A wide variety of structure types can be generated. Within-structure variation is naturally captured by the stochastic nature of the models, and the ground truth comes for free. Many new questions arise, in particular which characteristics of the real image data have to be matched, and to what degree of fidelity.
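To make the idea of "ground truth for free" concrete, here is a minimal sketch that samples a simple stochastic geometry model and renders a synthetic image together with its label mask. It uses a 2D Boolean model of discs as a stand-in; this is an assumption for illustration, not the crack or defect models used by the authors. The function name `boolean_model_2d`, all parameter values, and the gray levels and Gaussian noise used to mimic imaging are hypothetical.

```python
import numpy as np

def boolean_model_2d(size=128, intensity=0.002, radius=5.0,
                     noise_sigma=0.1, seed=0):
    """Sample a Boolean model of discs and render a noisy grayscale image.

    Returns (image, mask): the mask is the exact ground truth, obtained
    directly from the model realization without any manual annotation.
    All defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Poisson number of germs in the window (edge effects ignored for brevity).
    n_germs = rng.poisson(intensity * size * size)
    centers = rng.uniform(0, size, size=(n_germs, 2))

    yy, xx = np.mgrid[0:size, 0:size]
    mask = np.zeros((size, size), dtype=bool)
    for cy, cx in centers:
        # Union of discs: a pixel is foreground if any disc covers it.
        mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2

    # Crude imaging model: dark structure on a brighter background plus noise.
    image = np.where(mask, 0.3, 0.7) + rng.normal(0.0, noise_sigma, mask.shape)
    return image.astype(np.float32), mask

image, mask = boolean_model_2d()
```

Varying `seed`, `intensity`, and `radius` yields arbitrarily many labeled training pairs. How closely the rendering step has to imitate the real imaging process is exactly the kind of fidelity question the abstract raises.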

