Effective verification and validation techniques for modern scientific machine learning workflows are challenging to devise. Statistical methods are abundant and easily deployed, but often rely on speculative assumptions about the data and methods involved. Error bounds for classical interpolation techniques can provide mathematically rigorous estimates of accuracy, but are often difficult or impractical to compute. In this work, we present a best-of-both-worlds approach to verifiable scientific machine learning by demonstrating that (1) multiple standard interpolation techniques have informative error bounds that can be computed or estimated efficiently; (2) comparative performance among distinct interpolants can aid in validation goals; and (3) deploying interpolation methods on latent spaces generated by deep learning techniques provides some interpretability for black-box models. We present a detailed case study of our approach for predicting lift-drag ratios from airfoil images. Code developed for this work is available in a public GitHub repository.