The Influence of Nuisance Parameter Uncertainty on Statistical Inference in Practical Data Science Models

It is sometimes useful to divide a model's parameters into a group of primary parameters and a group of nuisance parameters, for example to avoid overtraining on a single data set or because numerical estimates for some parameters were obtained from another source. Uncertainty in the nuisance parameter values is unavoidable, however, and it affects the model's reliability. This paper examines how to calculate uncertainty for the primary parameters of interest in the presence of nuisance parameters. We illustrate a general procedure on two distinct model forms: 1) the GARCH time series model with a univariate nuisance parameter and 2) feed-forward neural network models with multiple hidden layers and multivariate nuisance parameters. Leveraging an existing theoretical framework for nuisance parameter uncertainty, we show how to modify the confidence regions for the primary parameters to account for the uncertainty introduced by the nuisance parameters. Our study also validates the practical effectiveness of the adjusted confidence regions. Such an adjustment helps data scientists produce results that more honestly reflect the overall uncertainty.
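As a rough illustration of why the adjustment matters, the sketch below widens a confidence interval for one primary parameter when a plugged-in nuisance value has its own variance. It is not the paper's procedure and does not involve GARCH or neural network models. It assumes a toy linear model y = theta * x + nu * z + noise with known noise variance, a nuisance estimate that is independent of the data noise, and a normal critical value of 1.96. All names (`adjusted_interval`, `nu_hat`, `var_nu`, `sigma2`) are hypothetical.

```python
import math


def adjusted_interval(x, z, y, nu_hat, var_nu, sigma2, zcrit=1.96):
    """Toy example (assumed model): y = theta * x + nu * z + noise.

    theta is the primary parameter, estimated by least squares with the
    nuisance parameter fixed at nu_hat. Returns the estimate together with
    the naive and the adjusted interval half-widths.
    """
    sxx = sum(xi * xi for xi in x)
    sxz = sum(xi * zi for xi, zi in zip(x, z))

    # Least-squares estimate of theta with nu held at its plugged-in value.
    theta_hat = sum(xi * (yi - nu_hat * zi) for xi, yi, zi in zip(x, y, z)) / sxx

    # Naive variance: treats nu_hat as if it were known exactly.
    naive_var = sigma2 / sxx

    # Sensitivity of theta_hat to the plugged-in nuisance value.
    sensitivity = -sxz / sxx

    # Adjusted variance: adds the nuisance uncertainty propagated through
    # the sensitivity (assumes nu_hat is independent of the data noise).
    adjusted_var = naive_var + sensitivity ** 2 * var_nu

    return theta_hat, zcrit * math.sqrt(naive_var), zcrit * math.sqrt(adjusted_var)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    z = [1.0, 1.0, 1.0]
    y = [2.5, 4.5, 6.5]
    est, naive_hw, adj_hw = adjusted_interval(x, z, y, nu_hat=0.5, var_nu=0.04, sigma2=1.0)
    print(f"theta_hat = {est:.3f}")
    print(f"naive half-width    = {naive_hw:.3f}")
    print(f"adjusted half-width = {adj_hw:.3f}")
```

In this toy setting the adjusted interval is never narrower than the naive one, and the two coincide only when the nuisance variance is zero or when x and z are orthogonal.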

