We solve the neural network robustness problem by augmenting the softmax function's existing awareness of output Magnitude (i.e., of the decision boundary) with Similarity-awareness (i.e., of correctly predicted depth-matches into training) and awareness of the Distance to the training distribution. The resulting sdm activation function provides a strong signal of the relative epistemic (reducible) predictive uncertainty. We use this novel behavior to further address the complementary HCI problem of mapping the output to human-interpretable summary statistics over relevant partitions of a held-out calibration set. Estimates of prediction-conditional uncertainty are obtained via a parsimonious learned transform over the class-conditional empirical CDFs of the output of a final-layer sdm activation function. For decision-making, and as an intrinsic model check, estimates of class-conditional accuracy are obtained by further partitioning the high-probability regions of this calibrated output into class-conditional, region-specific CDFs. The uncertainty estimates from sdm calibration are remarkably robust to test-time distribution shifts and out-of-distribution inputs; incorporate awareness of the effective sample size; provide estimates of the uncertainty arising from the learning and data-splitting processes; and are well-suited to selective classification and to conditional branching for additional test-time compute based on the predictive uncertainty, as in selective LLM generation, routing, and composition over multiple models and retrieval. Finally, we construct sdm networks: LLMs with uncertainty-aware verification and interpretability-by-exemplar as intrinsic properties. We provide open-source software implementing these results.
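To make the two central operations concrete, the following is a minimal conceptual sketch, not the paper's actual formulation: a hypothetical stand-in for an sdm-style activation (rescaling the softmax logits by assumed similarity and distance signals `q` and `d`, both taken here to lie in [0, 1]), and the mapping of a calibrated output to class-conditional empirical CDF values over a held-out calibration set. All function names and the exact form of the rescaling are illustrative assumptions.

```python
import math
from bisect import bisect_right

def softmax(zs):
    # Numerically stable softmax over a list of logits.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def sdm_like_output(logits, q, d):
    """Hypothetical stand-in for an sdm-style activation: scale the logits by
    a similarity signal q (agreement with correctly predicted depth-matches
    into training) and a distance-to-training-distribution signal d before the
    softmax. The paper's actual formulation differs; this only illustrates
    combining the three signals (Similarity, Distance, Magnitude)."""
    return softmax([q * d * z for z in logits])

def class_conditional_cdfs(cal_outputs, cal_labels, n_classes):
    """Per true class, the sorted calibration-set outputs define an
    empirical CDF (one list of sorted values per class)."""
    cdfs = [[] for _ in range(n_classes)]
    for p, y in zip(cal_outputs, cal_labels):
        cdfs[y].append(p)
    return [sorted(vals) for vals in cdfs]

def ecdf_value(cdfs, predicted_class, p):
    """Fraction of calibration outputs for the predicted class at or below p:
    a simple prediction-conditional summary statistic over the calibration
    partition for that class."""
    vals = cdfs[predicted_class]
    if not vals:
        return 0.0  # no effective calibration sample for this class
    return bisect_right(vals, p) / len(vals)
```

Note that when `q` or `d` collapse toward 0 (low similarity, or far from the training distribution), the sketched activation flattens toward the uniform distribution, i.e., toward maximal predictive uncertainty; the empirical-CDF lookup then places such outputs low in the class-conditional distribution of calibration outputs.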