We introduce SymbolicAI, a versatile and modular framework employing a logic-based approach to concept learning and flow management in generative processes. SymbolicAI enables the seamless integration of generative models with a diverse range of solvers by treating large language models (LLMs) as semantic parsers that execute tasks based on both natural and formal language instructions, thus bridging the gap between symbolic reasoning and generative AI. We leverage probabilistic programming principles to tackle complex tasks, and draw on differentiable and classical programming paradigms for their respective strengths. The framework introduces a set of polymorphic, compositional, and self-referential operations for data stream manipulation, aligning LLM outputs with user objectives. As a result, we can transition between various foundation models with zero- and few-shot learning capabilities and specialized, fine-tuned models or solvers proficient in addressing specific problems. In turn, the framework facilitates the creation and evaluation of explainable computational graphs. We conclude by introducing a quality measure and its empirical score for evaluating these computational graphs, and by proposing a benchmark that compares various state-of-the-art LLMs across a set of complex workflows. We refer to the empirical score as the "Vector Embedding for Relational Trajectory Evaluation through Cross-similarity", or VERTEX score for short. The framework codebase and benchmark are linked below.
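The abstract names the VERTEX score but does not define it here. As a rough intuition for what a "cross-similarity" between two embedded trajectories could look like, the following is a minimal sketch that averages pairwise cosine similarity between the node embeddings of a generated trajectory and those of a reference trajectory. This is an illustrative assumption, not the VERTEX definition: the function names `cosine` and `cross_similarity`, the choice of cosine similarity, and the plain averaging are all hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cross_similarity(generated: Sequence[Vector], reference: Sequence[Vector]) -> float:
    """Mean pairwise cosine similarity between two embedded trajectories.

    Hypothetical stand-in for a trajectory-level similarity; the actual
    VERTEX score is defined in the paper body, not in this sketch.
    """
    if not generated or not reference:
        raise ValueError("both trajectories must contain at least one embedding")
    total = sum(cosine(g, r) for g in generated for r in reference)
    return total / (len(generated) * len(reference))


if __name__ == "__main__":
    # Toy 2-D "embeddings" standing in for the outputs of graph nodes.
    generated = [[1.0, 0.0], [0.0, 1.0]]
    reference = [[1.0, 0.0]]
    print(cross_similarity(generated, reference))
```

In practice the vectors would come from an embedding model applied to each node's output in the computational graph; the toy 2-D vectors above only serve to keep the example self-contained.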