Neural processes (NPs) combine the representation power of parametric deep neural networks with the reliable uncertainty estimation of non-parametric Gaussian processes. Although recent NPs have shown success in both regression and classification, how to adapt NPs to multimodal data has not been carefully studied. We propose the first member of the NP family for multimodal uncertainty estimation, namely Multimodal Neural Processes. In a holistic and principled way, we develop a dynamic context memory updated by the classification error, a multimodal Bayesian aggregation mechanism to aggregate multimodal representations, and a new attention mechanism for calibrated predictions. In extensive empirical evaluation, our method achieves state-of-the-art multimodal uncertainty estimation performance, showing robustness against noisy samples and reliability in out-of-domain detection.