The preimage or inverse image of a predefined subset of the range of a deterministic function, called the inverse set for short, is the set in the domain whose image equals that predefined subset. To quantify the uncertainty in estimating such a set, one can construct data-dependent inner and outer confidence sets that serve as subsets and supersets, respectively, of the true inverse set. Existing methods require strict assumptions and focus on dense functional data. In this work, we generalize the estimation of inverse sets to a wider range of data types by rigorously proving that inverting pre-constructed simultaneous confidence intervals (SCIs) yields confidence sets at multiple levels simultaneously, with the desired confidence holding non-asymptotically. We provide a valid non-parametric bootstrap algorithm and open-source code for constructing confidence sets on dense functional data and multiple regression data. The method is illustrated in two distinct applications: identifying regions in North America experiencing rising temperatures using dense functional data, and evaluating the impact of statin usage and COVID-19 on the clinical outcomes of hospitalized patients using logistic regression data.
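To make the inversion step concrete, the following is a minimal sketch and not the released code. It assumes the inverse set has the form {s : mu(s) >= c} for a threshold c, and that lower and upper SCI bounds are already available at a finite set of points. The function name `invert_sci`, the point estimates, and the constant half-width of 0.5 are hypothetical and chosen only for illustration. If the bounds cover mu at every point simultaneously, then for every threshold the inner set is contained in the true inverse set, which is in turn contained in the outer set.

```python
from typing import Dict, List, Sequence, Tuple


def invert_sci(
    lower: Sequence[float],
    upper: Sequence[float],
    thresholds: Sequence[float],
) -> Dict[float, Tuple[List[bool], List[bool]]]:
    """Invert simultaneous confidence bounds into inner and outer sets.

    Assumed target (illustrative): the inverse set {s : mu(s) >= c}.
    Returns, for each threshold c, a pair of boolean masks (inner, outer).
    """
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have the same length")
    if any(lo > up for lo, up in zip(lower, upper)):
        raise ValueError("each lower bound must not exceed its upper bound")

    sets: Dict[float, Tuple[List[bool], List[bool]]] = {}
    for c in thresholds:
        # Inner set: points whose lower bound already clears the threshold.
        inner = [lo >= c for lo in lower]
        # Outer set: points whose upper bound still reaches the threshold.
        outer = [up >= c for up in upper]
        sets[c] = (inner, outer)
    return sets


# Hypothetical point estimates with an assumed constant SCI half-width.
mu_hat = [0.2, 0.9, 1.6, 2.4]
half_width = 0.5
lower = [m - half_width for m in mu_hat]
upper = [m + half_width for m in mu_hat]

sets = invert_sci(lower, upper, thresholds=[1.0, 2.0])
for c, (inner, outer) in sets.items():
    print(c, inner, outer)
```

Because the same pair of bounds is reused for every threshold, no extra multiplicity adjustment is needed across thresholds in this sketch. The simultaneous coverage of the intervals is what carries over to all the resulting sets at once.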