Accurate seismic velocity estimation is vital to understanding Earth's subsurface structures, assessing natural resources, and evaluating seismic hazards. Machine learning-based inversion algorithms have shown promising performance in regional (i.e., for exploration) and global velocity estimation, but their effectiveness hinges on access to large and diverse training datasets whose distributions cover the target solutions. Improving the precision and reliability of velocity estimation further requires incorporating prior information, such as geological classes, well logs, and subsurface structures, yet current statistical and neural network-based methods are not flexible enough to handle such multi-modal information. To address both challenges, we propose using conditional generative diffusion models for seismic velocity synthesis, which readily incorporate these priors. This approach generates seismic velocities that closely match the expected target distribution, yielding datasets informed by both expert knowledge and measured data to support the training of data-driven geophysical methods. We demonstrate the flexibility and effectiveness of our method by training diffusion models on the OpenFWI dataset under various conditions, including class labels, well logs, reflectivity images, and combinations of these priors. The performance of the approach under out-of-distribution conditions further underscores its generalization ability, showing its potential to provide tailored priors for velocity inverse problems and to create specific training datasets for machine learning-based geophysical applications.