This paper reports the first brain-inspired large language model (BriLLM). It is a non-Transformer, non-GPT generative language model that departs from the input-output paradigm of traditional machine learning. The model is based on the Signal Fully-connected flowing (SiFu) mechanism, defined over a directed graph that serves as the neural network, and it is interpretable at every node of the graph, whereas traditional machine learning models offer only limited interpretability at their input and output ends. In the language modeling setting, each token is defined as a node in the graph. A randomly shaped or user-defined signal propagates between nodes along paths of "least resistance," and the next token to be predicted or generated is the target node of the signal flow. As a language model, BriLLM theoretically supports $n$-grams of unbounded length, since the model size is independent of the input and prediction lengths. The model's working signal flow also opens the possibility of recall activation and innate multi-modal support, similar to the cognitive patterns of the human brain. We have released the first BriLLM version in Chinese, with a 4,000-token vocabulary, 32-dimensional node width, the ability to predict sequences up to 16 tokens long, and language modeling performance comparable to GPT-1. More computing power will help us explore the possibilities outlined above.
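To make the signal-flow idea concrete, below is a minimal toy sketch in Python/NumPy. It is not the released implementation: the vocabulary size, the energy-based "least resistance" selection rule, and all names (`edges`, `next_token`) are illustrative assumptions; only the tokens-as-nodes structure, the random initial signal, and the 32-dimensional node width are taken from the abstract.

```python
import numpy as np

# Toy sketch of SiFu-style signal flow (illustrative, not the released code).
# Tokens are nodes of a fully connected directed graph; each edge (u -> v)
# carries its own weight matrix. A signal vector flows from node to node,
# and the next token is the node whose edge passes the signal with the
# greatest energy -- one plausible reading of "least resistance".

VOCAB = 10   # toy vocabulary; the released model uses 4000 tokens
DIM = 32     # node width reported in the abstract

rng = np.random.default_rng(0)
# One DIM x DIM weight matrix per directed edge.
edges = rng.normal(scale=DIM ** -0.5, size=(VOCAB, VOCAB, DIM, DIM))

def next_token(signal: np.ndarray, current: int) -> tuple[int, np.ndarray]:
    """Propagate the signal out of `current` and pick the target node
    whose edge transmits it with the highest energy (L2 norm)."""
    outputs = np.einsum("vij,j->vi", edges[current], signal)  # (VOCAB, DIM)
    energies = np.linalg.norm(outputs, axis=-1)               # (VOCAB,)
    target = int(energies.argmax())
    return target, np.tanh(outputs[target])  # squashed signal flows onward

# Generate a short sequence from a "randomly shaped" initial signal,
# as the abstract describes.
signal, node = rng.normal(size=DIM), 0
sequence = [node]
for _ in range(5):
    node, signal = next_token(signal, node)
    sequence.append(node)
print(sequence)
```

Note how the model size here scales with the vocabulary (one matrix per edge) but not with the sequence length, which is what lets the abstract claim support for $n$-grams of unbounded length.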