Deep learning models can extract predictive and actionable information from complex inputs. The richer the inputs, the better these models usually perform. However, models that leverage rich inputs (e.g., multiple modalities) can be difficult to deploy widely, because some inputs may be missing at inference time. Popular solutions to this problem include marginalization, imputation, and training multiple models. Marginalization can yield calibrated predictions, but it is computationally costly and therefore feasible only for low-dimensional inputs. Imputation may result in inaccurate predictions because it employs point estimates for missing variables, and it does not work well for high-dimensional inputs (e.g., images). Training multiple models, each of which takes a different subset of inputs, can work well but requires knowing the missing-input patterns in advance; training and retaining multiple models can also be costly. We propose an efficient way to learn both the conditional distribution given the full inputs and the marginal distributions. Our method, Knockout, randomly replaces input features with appropriate placeholder values during training. We provide a theoretical justification of Knockout and show that it can be viewed as an implicit marginalization strategy. We evaluate Knockout on a wide range of simulations and real-world datasets and show that it offers strong empirical performance.
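The training-time replacement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `knockout` and `fill_missing`, the independent per-feature knockout probability `p`, and the placeholder value `-1.0` (chosen here to lie outside the data range) are all assumptions made for illustration.

```python
import numpy as np


def knockout(x, placeholder, p, rng):
    """Randomly replace features of a batch with placeholder values.

    x           : (batch, n_features) float array of complete inputs
    placeholder : scalar or (n_features,) array of placeholder values
                  (assumed to be distinguishable from real data)
    p           : probability of replacing each feature, applied
                  independently per entry (an assumption of this sketch)
    rng         : numpy.random.Generator
    """
    mask = rng.random(x.shape) < p          # True where a feature is knocked out
    return np.where(mask, placeholder, x), mask


def fill_missing(x, placeholder):
    """At inference, insert the same placeholder wherever an input is NaN."""
    return np.where(np.isnan(x), placeholder, x)


rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(4, 3))      # toy batch with values in [0, 1)

# Training: the model would be fit on x_train instead of x.
x_train, mask = knockout(x, placeholder=-1.0, p=0.3, rng=rng)

# Inference: a sample whose second feature is unavailable.
x_test = fill_missing(np.array([[0.2, np.nan, 0.7]]), placeholder=-1.0)
```

Under these assumptions, a single model sees both complete and partially replaced inputs during training, so at inference a missing input is handled by passing the same placeholder rather than by imputing a value or switching to a separate model.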