We consider the problem of estimating assortment probabilities, which arises in operations management applications such as product bundling and advertising. Existing approaches typically treat each assortment as a category and apply multinomial models to estimate the choice probabilities. While computationally convenient, these methods do not exploit independence structures in the joint distribution and may therefore be statistically inefficient when the total number of items is large. Using the representation of Bahadur (1959), we relate the sparsity of the generalized correlation coefficients to the independence structure of the binary components. We formulate the problem as estimating a high-dimensional vector of generalized correlation coefficients, together with low- or moderate-dimensional nuisance parameters corresponding to the marginal probabilities. We develop a regularized adversarial estimator that attains the optimal rate under standard regularity conditions while remaining computationally feasible. The framework extends naturally to settings with covariates. We apply the proposed estimators to causal inference with multiple binary treatments and show substantial finite-sample improvements over non-adaptive methods. Numerical studies corroborate the theoretical results.
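To make the link between sparsity and independence concrete, the following minimal sketch computes the Bahadur (1959) generalized correlation coefficients of a small, fully known joint distribution over binary components and then rebuilds the distribution from them. It is an illustration of the representation only, not the regularized adversarial estimator proposed here; the function names `bahadur_coefficients` and `bahadur_pmf` and the toy distribution are assumptions introduced for this example.

```python
import itertools
import numpy as np

def bahadur_coefficients(pmf):
    """Return marginals p_j and coefficients r_S = E[prod_{j in S} Z_j], |S| >= 2.

    pmf maps each binary tuple of length d to its probability.
    Z_j = (X_j - p_j) / sqrt(p_j (1 - p_j)) is the standardized component.
    """
    d = len(next(iter(pmf)))
    states = np.array(list(pmf.keys()), dtype=float)   # shape (n_states, d)
    probs = np.array(list(pmf.values()), dtype=float)  # shape (n_states,)
    p = probs @ states                                  # marginals P(X_j = 1)
    z = (states - p) / np.sqrt(p * (1 - p))
    coeffs = {}
    for k in range(2, d + 1):
        for S in itertools.combinations(range(d), k):
            coeffs[S] = float(probs @ np.prod(z[:, list(S)], axis=1))
    return p, coeffs

def bahadur_pmf(x, p, coeffs):
    """Evaluate P(X = x) as the independence pmf times (1 + sum_S r_S prod_{j in S} z_j)."""
    x = np.asarray(x, dtype=float)
    z = (x - p) / np.sqrt(p * (1 - p))
    independent = np.prod(p ** x * (1 - p) ** (1 - x))
    correction = 1.0 + sum(r * np.prod(z[list(S)]) for S, r in coeffs.items())
    return float(independent * correction)

# Toy distribution (an assumption for illustration): items 0 and 1 are
# positively dependent, item 2 is independent of both with P(X_2 = 1) = 0.3.
pair = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
pmf = {(a, b, c): q * (0.3 if c else 0.7)
       for (a, b), q in pair.items() for c in (0, 1)}

p, coeffs = bahadur_coefficients(pmf)
for S, r in coeffs.items():
    print(S, round(r, 6))
```

In this toy case three of the four coefficients vanish (up to floating-point error) because item 2 is independent of the others; only the pair (0, 1) carries a nonzero coefficient, approximately 0.6. A multinomial model over all 2^3 assortments would not exploit this structure, which is the inefficiency the sparse formulation targets.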