Joint Probability Estimation of Many Binary Outcomes via Localized Adversarial Lasso

In this work we consider estimating the probability of many (possibly dependent) binary outcomes, a problem at the core of many applications, e.g., multi-level treatments in causal inference, demand for bundles of products, etc. Without further conditions, the probability distribution of an M-dimensional binary vector is characterized by a number of coefficients that is exponential in M, which can lead to a high-dimensional problem even in the absence of covariates. Understanding the (in)dependence structure allows us to substantially improve the estimation, as it yields an effective factorization of the probability distribution. To estimate the probability distribution of an M-dimensional binary vector, we leverage a Bahadur representation that connects the sparsity of its coefficients with independence across the components. We propose regularized and adversarial regularized estimators that adapt to the dependence structure, so that rates of convergence depend on this intrinsic (lower) dimension. These estimators are needed to handle several challenges within this setting, including estimating nuisance parameters, estimating covariates, and nonseparable moment conditions. Our main results consider the presence of (low-dimensional) covariates, for which we propose a locally penalized estimator. We provide pointwise rates of convergence, addressing several issues in the theoretical analysis as we strive for a computationally tractable formulation. We apply our results to the estimation of causal effects with multiple binary treatments and show how our estimators can improve finite-sample performance compared with non-adaptive estimators that try to estimate all the probabilities directly. We also provide simulations that are consistent with our theoretical findings.
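
To illustrate the link between sparsity and independence mentioned above, the following is a minimal sketch of the classical Bahadur representation for a known joint distribution without covariates. It is not the estimator proposed in this work. The function names (`bahadur_coefficients`, `reconstruct`) and the toy distribution are assumptions made for illustration, and the sketch assumes every marginal probability lies strictly between 0 and 1. The representation writes the joint probability as the product of the marginals times one plus a sum, over subsets S with at least two components, of a coefficient r_S multiplied by the product of the standardized outcomes in S, where r_S is the expectation of that product.

```python
import itertools
import numpy as np


def bahadur_coefficients(pmf, M):
    """Return marginals p_j = P(Y_j = 1) and Bahadur coefficients r_S.

    pmf maps each tuple in {0,1}^M to its probability (hypothetical input format).
    Assumes 0 < p_j < 1 for every component j.
    """
    ys = np.array(list(itertools.product([0, 1], repeat=M)))
    probs = np.array([pmf[tuple(y)] for y in ys])
    p = probs @ ys                           # marginal probabilities
    z = (ys - p) / np.sqrt(p * (1 - p))      # standardized outcomes
    coefs = {}
    for k in range(2, M + 1):
        for S in itertools.combinations(range(M), k):
            # r_S = E[ prod_{j in S} Z_j ]
            coefs[S] = float(probs @ np.prod(z[:, list(S)], axis=1))
    return p, coefs


def reconstruct(p, coefs, M):
    """Rebuild the joint pmf from the marginals and the Bahadur coefficients."""
    ys = np.array(list(itertools.product([0, 1], repeat=M)))
    z = (ys - p) / np.sqrt(p * (1 - p))
    indep = np.prod(np.where(ys == 1, p, 1 - p), axis=1)  # product of marginals
    correction = np.ones(len(ys))
    for S, r in coefs.items():
        correction += r * np.prod(z[:, list(S)], axis=1)
    return {tuple(int(v) for v in y): float(q) for y, q in zip(ys, indep * correction)}


# Toy example (assumed values): Y1 and Y2 are dependent, Y3 is independent of both.
M = 3
pair = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
pmf = {(a, b, c): pair[(a, b)] * (0.3 if c == 1 else 0.7)
       for (a, b) in pair for c in (0, 1)}

p, coefs = bahadur_coefficients(pmf, M)
recon = reconstruct(p, coefs, M)
nonzero = {S: r for S, r in coefs.items() if abs(r) > 1e-12}
print("marginals:", p)
print("nonzero coefficients:", nonzero)
```

In this toy case the full distribution has 2^3 - 1 = 7 free parameters, but only the coefficient for the pair (Y1, Y2) is nonzero (equal to 0.6 up to floating-point error), because Y3 is independent of the other components. This is the kind of sparsity that a penalized estimator can exploit.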
