Let $X$ be a $p$-variate random vector and $\widetilde{X}$ a knockoff copy of $X$ (in the sense of \cite{CFJL18}). A new approach for constructing $\widetilde{X}$ (henceforth, NA) has been introduced in \cite{JSPI}. NA has essentially three advantages: (i) building $\widetilde{X}$ is straightforward; (ii) the joint distribution of $(X,\widetilde{X})$ can be written in closed form; (iii) $\widetilde{X}$ is often optimal under various criteria. However, for NA to apply, $X_1,\ldots,X_p$ must be conditionally independent given some random element $Z$. Our first result is that any probability measure $\mu$ on $\mathbb{R}^p$ can be approximated by a probability measure $\mu_0$ of the form $$\mu_0\bigl(A_1\times\cdots\times A_p\bigr)=E\Bigl\{\prod_{i=1}^p P(X_i\in A_i\mid Z)\Bigr\}.$$ The approximation is in total variation distance when $\mu$ is absolutely continuous, and an explicit formula for $\mu_0$ is provided. If $X\sim\mu_0$, then $X_1,\ldots,X_p$ are conditionally independent given $Z$. Hence, with a negligible error, one can assume $X\sim\mu_0$ and build $\widetilde{X}$ through NA. Our second result is a characterization of the knockoffs $\widetilde{X}$ obtained via NA. It is shown that $\widetilde{X}$ is of this type if and only if the pair $(X,\widetilde{X})$ can be extended to an infinite sequence satisfying certain invariance conditions. The basic tool for proving this fact is de Finetti's theorem for partially exchangeable sequences. In addition to these results, an explicit formula for the conditional distribution of $\widetilde{X}$ given $X$ is obtained in a few cases. In one such case, it is assumed that $X_i\in\{0,1\}$ for all $i$.
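
As a minimal illustration of the mixture form of $\mu_0$ (a toy case chosen here for concreteness, not taken from \cite{JSPI}), take $p=2$, let $Z\in\{0,1\}$ with $P(Z=0)=P(Z=1)=1/2$, and suppose that, given $Z=z$, $X_1$ and $X_2$ are independent Bernoulli variables with parameter $\theta_z\in[0,1]$; the symbols $\theta_0,\theta_1$ are assumptions of this sketch. Then $$\mu_0\bigl(\{1\}\times\{1\}\bigr)=E\bigl\{P(X_1=1\mid Z)\,P(X_2=1\mid Z)\bigr\}=\frac{\theta_0^2+\theta_1^2}{2},$$ while $P(X_1=1)=P(X_2=1)=(\theta_0+\theta_1)/2$, so that $$\mu_0\bigl(\{1\}\times\{1\}\bigr)-P(X_1=1)\,P(X_2=1)=\frac{(\theta_0-\theta_1)^2}{4}.$$ Hence, in this toy case, $X_1$ and $X_2$ are conditionally independent given $Z$ but are dependent unless $\theta_0=\theta_1$.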