We introduce SymbolicAI, a versatile and modular framework employing a logic-based approach to concept learning and flow management in generative processes. SymbolicAI enables the seamless integration of generative models with a diverse range of solvers by treating large language models (LLMs) as semantic parsers that execute tasks based on both natural and formal language instructions, thus bridging the gap between symbolic reasoning and generative AI. We leverage probabilistic programming principles to tackle complex tasks, and draw on differentiable and classical programming paradigms for their respective strengths. The framework introduces a set of polymorphic, compositional, and self-referential operations for multi-modal data that connect multi-step generative processes and align their outputs with user objectives in complex workflows. As a result, we can transition between various foundation models with in-context learning capabilities and specialized, fine-tuned models or solvers proficient in addressing specific problems. Through these operations, which are based on in-context learning, our framework enables the creation and evaluation of explainable computational graphs. Finally, we introduce a quality measure and its empirical score for evaluating these computational graphs, and propose a benchmark that compares various state-of-the-art LLMs across a set of complex workflows. We refer to the empirical score as the "Vector Embedding for Relational Trajectory Evaluation through Cross-similarity", or VERTEX score for short. The framework codebase and benchmark are linked below.