Unveiling Ontological Commitment in Multi-Modal Foundation Models

From arXiv: Qualitative Reasoning Workshop 2024 (QR2024), co-located with ECAI2024, camera-ready submission; first two authors contributed equally; 10 pages, 4 figures, 3 tables

Ontological commitment, i.e., the concepts, relations, and assumptions a model uses, is a cornerstone of qualitative reasoning (QR) models. The state of the art for processing raw inputs, though, is deep neural networks (DNNs), nowadays often based on multimodal foundation models. These automatically learn rich representations of concepts and the corresponding reasoning. Unfortunately, the learned qualitative knowledge is opaque, preventing easy inspection, validation, or adaptation against available QR models. So far, it is possible to associate pre-defined concepts with latent representations of DNNs, but extractable relations are mostly limited to semantic similarity. As a next step towards QR for validation and verification of DNNs, we propose a method that extracts the learned superclass hierarchy from a multimodal DNN for a given set of leaf concepts. Under the hood, we (1) obtain leaf concept embeddings using the DNN's textual input modality; (2) apply hierarchical clustering to them, exploiting the fact that DNNs encode semantic similarity via vector distances; and (3) label the parent concepts thus obtained by searching available ontologies from QR. An initial evaluation study shows that meaningful ontological class hierarchies can be extracted from state-of-the-art foundation models. Furthermore, we demonstrate how to validate and verify a DNN's learned representations against given ontologies. Lastly, we discuss potential future applications in the context of QR.
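The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hand-written vectors in `TOY_EMBEDDINGS` stand in for embeddings that would come from a foundation model's text encoder, `TOY_ONTOLOGY` is a hypothetical four-leaf ontology, and the choices of average-linkage clustering over cosine distance and of labeling a parent with the lowest superclass shared by its leaves are assumptions made for illustration only.

```python
import math
from itertools import combinations

# Assumption: hand-written stand-ins for text-encoder embeddings of leaf concepts.
TOY_EMBEDDINGS = {
    "cat":   [0.9, 0.1, 0.0],
    "dog":   [0.8, 0.2, 0.0],
    "car":   [0.0, 0.1, 0.9],
    "truck": [0.1, 0.0, 0.9],
}

# Assumption: hypothetical toy ontology mapping each concept to its direct superclass.
TOY_ONTOLOGY = {
    "cat": "animal", "dog": "animal",
    "car": "vehicle", "truck": "vehicle",
    "animal": "entity", "vehicle": "entity",
}


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def ancestors(concept, ontology):
    """Superclasses of a concept, nearest first."""
    chain = []
    while concept in ontology:
        concept = ontology[concept]
        chain.append(concept)
    return chain


def label_parent(leaves, ontology):
    """Lowest superclass shared by all leaves, or None if there is none."""
    chains = [ancestors(leaf, ontology) for leaf in leaves]
    for candidate in chains[0]:
        if all(candidate in chain for chain in chains[1:]):
            return candidate
    return None


def extract_hierarchy(embeddings, ontology):
    """Average-linkage agglomerative clustering with ontology-based parent labels."""
    # Each cluster is (tuple of leaf names, subtree built so far).
    clusters = [((name,), name) for name in sorted(embeddings)]

    def linkage(a, b):
        dists = [cosine_distance(embeddings[x], embeddings[y]) for x in a for y in b]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Step 2: merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]][0], clusters[p[1]][0]),
        )
        leaves = clusters[i][0] + clusters[j][0]
        # Step 3: label the new parent by searching the ontology.
        node = {
            "label": label_parent(leaves, ontology),
            "children": [clusters[i][1], clusters[j][1]],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((leaves, node))
    return clusters[0][1]


hierarchy = extract_hierarchy(TOY_EMBEDDINGS, TOY_ONTOLOGY)
print(hierarchy)
```

In this toy setting the extracted tree can be compared node by node with the reference ontology, which is one simple way to read the validation idea mentioned in the abstract: a parent whose label is `None`, or whose leaves belong to unrelated branches of the ontology, would flag a mismatch between the learned similarity structure and the given ontology.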

