There are several reasons to divide a model's parameters into one group of primary parameters and one group of nuisance parameters, for example to avoid overfitting to a single data set, or because numerical estimates of some parameters are already available from another source. However, the values of the nuisance parameters are themselves uncertain, and this uncertainty inevitably affects the model's reliability. This paper examines how to calculate the uncertainty of the primary parameters of interest in the presence of nuisance parameters. We illustrate a general procedure on two distinct model forms: 1) a GARCH time series model with a univariate nuisance parameter and 2) feed-forward neural network models with multiple hidden layers and multivariate nuisance parameters. Leveraging an existing theoretical framework for nuisance parameter uncertainty, we show how to modify the confidence regions for the primary parameters so that they account for the uncertainty introduced by the nuisance parameters. Furthermore, our study validates the practical effectiveness of the adjusted confidence regions. Such an adjustment helps data scientists produce results that more honestly reflect the overall uncertainty.
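To make the idea of an adjusted confidence region concrete, the following is a minimal sketch on a toy linear model, not the GARCH or neural network setting studied here, and not necessarily the theoretical framework the paper relies on. It assumes a model y = theta*x + nu*z + noise, where the nuisance parameter nu is supplied by an independent external source with a known variance, and it uses a standard first-order (delta-method) correction. The function name `plug_in_interval` and all of its arguments are hypothetical.

```python
import numpy as np


def plug_in_interval(x, z, y, nu_hat, nu_var, noise_var, zcrit=1.96):
    """Toy illustration for y = theta*x + nu*z + noise (an assumed model).

    nu_hat, nu_var: external estimate of the nuisance parameter and its
    variance, assumed independent of the data (x, z, y).
    Returns (theta_hat, naive_interval, adjusted_interval).
    """
    x, z, y = (np.asarray(a, dtype=float) for a in (x, z, y))
    sxx = float(x @ x)

    # Least-squares estimate of theta with nu fixed at its plug-in value.
    theta_hat = float(x @ (y - nu_hat * z)) / sxx

    # Variance of theta_hat if nu_hat were treated as exactly known.
    naive_var = noise_var / sxx

    # Sensitivity of theta_hat to the plug-in value: d(theta_hat)/d(nu_hat).
    sensitivity = -float(x @ z) / sxx

    # First-order correction: add the nuisance uncertainty propagated
    # through the sensitivity.
    adjusted_var = naive_var + sensitivity**2 * nu_var

    def interval(var):
        half_width = zcrit * np.sqrt(var)
        return (theta_hat - half_width, theta_hat + half_width)

    return theta_hat, interval(naive_var), interval(adjusted_var)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    z = 0.5 * x + rng.normal(size=200)  # z correlated with x
    y = 2.0 * x + 1.0 * z + rng.normal(size=200)
    theta_hat, naive, adjusted = plug_in_interval(
        x, z, y, nu_hat=1.1, nu_var=0.04, noise_var=1.0
    )
    print("theta_hat:", theta_hat)
    print("naive interval:   ", naive)
    print("adjusted interval:", adjusted)
```

In this sketch the adjusted interval is never narrower than the naive one, and the two coincide only when x and z are orthogonal or the nuisance variance is zero. In the multivariate case the same idea would replace the scalar sensitivity with a Jacobian and the scalar variance with a covariance matrix.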