Dimensional analysis (DA) takes account of fundamental physical dimensions such as length and mass when modelling scientific and engineering systems. It goes back at least a century to Buckingham's Pi theorem, which characterizes a scientifically meaningful model in terms of a limited number of dimensionless variables. Statisticians, however, have only relatively recently exploited the methodology for the design and analysis of experiments, and of computer experiments in particular. The basic idea is to build models in terms of new dimensionless quantities derived from the original input and output variables. In principle, a scientifically valid formulation has the potential to improve prediction accuracy, but implementing DA is far from straightforward: there can be a combinatorial number of possible models satisfying the conditions of the theory. We describe empirical approaches for finding effective derived variables and demonstrate improvements in prediction accuracy. Because DA's dimensionless quantities typically compare the original variables with one another rather than use their absolute magnitudes, a DA-based statistical model is less dependent on the choice of experimental ranges in the training data. Hence, we are also able to illustrate sustained accuracy gains even when extrapolating substantially outside the training data.
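As a minimal sketch of how dimensionless quantities can be derived from the original variables, the exponents of each Pi group can be read off the null space of a dimension matrix. The simple pendulum used here (period T, length L, mass m, gravitational acceleration g) is an illustrative assumption and not an example from this work, and the helper names `nullspace` and `format_group` are hypothetical.

```python
from fractions import Fraction


def nullspace(matrix):
    """Return a basis of the rational null space of `matrix` (a list of rows)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []  # pivot column of each reduced row, in order
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: column c is a free variable
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        # Clear column c in every other row (Gauss-Jordan elimination).
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        # Set one free exponent to 1 and solve for the pivot exponents.
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -rows[row_idx][f]
        basis.append(vec)
    return basis


def format_group(names, exponents):
    """Render a Pi group as a product of powers, skipping zero exponents."""
    return " * ".join(f"{n}^{e}" for n, e in zip(names, exponents) if e != 0)


# Assumed example: columns are the variables T, L, m, g;
# rows are the base dimensions mass, length, time.
names = ["T", "L", "m", "g"]
dimension_matrix = [
    [0, 0, 1, 0],   # mass
    [0, 1, 0, 1],   # length
    [1, 0, 0, -2],  # time
]

pi_groups = nullspace(dimension_matrix)
for group in pi_groups:
    print(format_group(names, group))
```

With four variables and three independent base dimensions, the null space has dimension one, so this sketch yields a single group in which the mass does not appear. When the null space has dimension two or more, any invertible recombination of the basis vectors gives another valid set of dimensionless variables, which is one source of the many candidate models mentioned above.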