Many learning problems involve multiple patterns and levels of uncertainty that vary with the covariates. Advances in Deep Learning (DL) have addressed these issues by learning highly nonlinear input-output dependencies. However, model interpretability and Uncertainty Quantification (UQ) have often lagged behind. In this context, we introduce the Competitive/Collaborative Fusion of Experts (CoCoAFusE), a novel Bayesian Covariates-Dependent Modeling technique. CoCoAFusE builds on the philosophy behind Mixtures of Experts (MoEs), blending predictions from several simple sub-models (or "experts") to achieve high expressiveness while retaining a substantial degree of local interpretability. Our formulation extends that of a classical Mixture of Experts by considering the fusion of the experts' distributions in addition to their more usual mixing (i.e., superposition). Through this additional feature, CoCoAFusE better accommodates different scenarios for the intermediate behavior between generating mechanisms, resulting in tighter credible bounds on the response variable. Indeed, resorting only to mixing, as in classical MoEs, may lead to multimodality artifacts, especially over smooth transitions. CoCoAFusE, by contrast, can avoid these artifacts even under the same structure and priors for the experts, leading to greater expressiveness and flexibility in modeling. The approach is showcased extensively on a suite of motivating numerical examples and a collection of real-data ones, demonstrating its efficacy in tackling complex regression problems where uncertainty is a key quantity of interest.
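The contrast between mixing and fusing can be illustrated with a toy sketch. It is not the CoCoAFusE formulation: it assumes two Gaussian experts and a fixed gate weight `w`, and the "fused" density is a hypothetical stand-in, namely a single Gaussian whose mean and standard deviation are gate-weighted averages of the experts' parameters. All function names and numerical values are illustrative assumptions.

```python
import numpy as np


def gaussian_pdf(y, mean, std):
    """Density of a univariate Gaussian evaluated on the array y."""
    return np.exp(-0.5 * ((y - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def mixed_pdf(y, w, means, stds):
    """Classical mixing: gate-weighted superposition of the expert densities."""
    return (w * gaussian_pdf(y, means[0], stds[0])
            + (1.0 - w) * gaussian_pdf(y, means[1], stds[1]))


def fused_pdf(y, w, means, stds):
    """Illustrative fusion (an assumption, not the paper's definition):
    one Gaussian whose parameters are gate-weighted averages."""
    mean = w * means[0] + (1.0 - w) * means[1]
    std = w * stds[0] + (1.0 - w) * stds[1]
    return gaussian_pdf(y, mean, std)


def count_modes(density):
    """Count strict local maxima of a density sampled on a grid."""
    interior = density[1:-1]
    return int(np.sum((interior > density[:-2]) & (interior > density[2:])))


def central_interval_width(y, density, level=0.9):
    """Width of the central interval at the given level, from a grid CDF."""
    cdf = np.cumsum(density) * (y[1] - y[0])
    cdf /= cdf[-1]  # renormalise to absorb discretisation error
    lower = y[np.searchsorted(cdf, (1.0 - level) / 2.0)]
    upper = y[np.searchsorted(cdf, 1.0 - (1.0 - level) / 2.0)]
    return float(upper - lower)


# Hypothetical setting: midway through a transition between two experts.
y_grid = np.linspace(-6.0, 6.0, 2401)
w = 0.5
means, stds = (-2.0, 2.0), (0.5, 0.5)

mixed = mixed_pdf(y_grid, w, means, stds)
fused = fused_pdf(y_grid, w, means, stds)

mixed_modes, fused_modes = count_modes(mixed), count_modes(fused)
mixed_width = central_interval_width(y_grid, mixed)
fused_width = central_interval_width(y_grid, fused)

print(f"mixing: {mixed_modes} modes, 90% interval width {mixed_width:.2f}")
print(f"fusion: {fused_modes} modes, 90% interval width {fused_width:.2f}")
```

In this setting the mixed density is bimodal and its central 90% interval spans both modes, whereas the fused density is unimodal with a narrower interval. This is the kind of artifact over smooth transitions that the abstract attributes to relying on mixing alone.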