A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data

Neural-symbolic approaches to machine learning combine the advantages of both connectionist and symbolic methods. Typically, these models employ a first module based on a neural architecture to extract features from complex data. Then, these features are processed as symbols by a symbolic engine that provides reasoning, concept structures, composability, better generalization and out-of-distribution learning, among other possibilities. However, neural approaches to the grounding of symbols in sensory data, albeit powerful, still require heavy training and tedious labeling for the most part. This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex spatial sensory data. The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept. Following his suggestion, the model extracts atomic features from raw data by computing elemental sequential comparisons in a stream of multivariate numerical values. Higher-level constructs are built from these features by subjecting them to further comparisons in a recursive process. At any stage in the recursion, a concept structure may be obtained from these constructs and features by means of Formal Concept Analysis. Results show that the model is able to produce fairly rich yet human-readable conceptual representations without training. Additionally, the concept structures obtained through the model (i) present high composability, which potentially enables the generation of 'unseen' concepts, (ii) allow formal reasoning, and (iii) have inherent abilities for generalization and out-of-distribution learning. Consequently, this method may offer an interesting angle to current neural-symbolic research. Future work is required to develop a training methodology so that the model can be tested against a larger dataset.
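
The pipeline outlined in the abstract (sequential comparisons, recursive comparisons of the resulting features, then Formal Concept Analysis) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the symbol names (`up`, `down`, `same`, `kept`, `changed`), the tolerance `eps`, the choice of time steps as objects, and all function names are assumptions made for illustration only. The concept enumeration is the standard naive intersection-closure procedure for small formal contexts.

```python
def compare(a, b, eps=1e-9):
    """Elemental comparison of two consecutive values (symbols are assumed)."""
    if b - a > eps:
        return "up"
    if a - b > eps:
        return "down"
    return "same"


def difference_features(stream):
    """First-level features: per-variable comparison of consecutive samples."""
    return [tuple(compare(a, b) for a, b in zip(prev, cur))
            for prev, cur in zip(stream, stream[1:])]


def second_order(features):
    """Recursive step: compare consecutive feature tuples with each other."""
    return [tuple("kept" if a == b else "changed" for a, b in zip(prev, cur))
            for prev, cur in zip(features, features[1:])]


def formal_context(features, names):
    """Objects are time steps; attributes are 'variable:symbol' strings."""
    return {t: frozenset(f"{n}:{s}" for n, s in zip(names, row))
            for t, row in enumerate(features)}


def formal_concepts(context):
    """Naive FCA: intents are the full attribute set plus all intersections
    of object intents; each extent is the set of objects having the intent."""
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {attrs & intent for intent in intents}
    return [(frozenset(o for o, a in context.items() if intent <= a), intent)
            for intent in intents]


# Hypothetical two-variable stream (values invented for the example).
stream = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.5), (2.0, 0.0)]
features = difference_features(stream)
higher = second_order(features)
concepts = formal_concepts(formal_context(features, ("x", "y")))

for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

In this toy run, the concept whose intent is `{"x:up"}` groups time steps 0 and 1, which gives a rough sense of how shared differences could yield human-readable concepts without any training. The second-order features could be fed into `formal_context` in the same way to obtain a concept structure at the next recursion level.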

