Linear Codes for Hyperdimensional Computing

Hyperdimensional Computing (HDC) is an emerging computational paradigm for representing compositional information as high-dimensional vectors, and it shows promise in applications ranging from machine learning to neuromorphic computing. One of the long-standing challenges in HDC is factoring a compositional representation into its constituent factors, also known as the recovery problem. In this paper we take a novel approach to the recovery problem and propose the use of random linear codes. These codes are subspaces over the Boolean field, and are a well-studied topic in information theory with various applications in digital communication. We begin by showing that hyperdimensional encoding using random linear codes retains favorable properties of the prevalent (ordinary) random codes, and hence HD representations using the two methods have comparable information storage capabilities. We proceed to show that random linear codes offer a rich subcode structure that can be used to form key-value stores, which encapsulate most use cases of HDC. Most importantly, we show that under the framework we develop, random linear codes admit simple recovery algorithms to factor compositional representations, whether bundled or bound. For bundled representations, recovery relies on constructing certain linear equation systems over the Boolean field, the solution to which reduces the search space dramatically and strictly outperforms exhaustive search in many cases. For bound representations, recovery employs the subspace structure of these codes to achieve provably correct factorization. Both methods are strictly faster than the state-of-the-art resonator networks, often by an order of magnitude. We implemented our techniques in Python using a benchmark software library, and demonstrated promising experimental results.
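
To make the role of the subspace structure concrete, the following is a minimal sketch of how a bound representation might be factored when the codebooks are subcodes of one random linear code. It is an illustration under stated assumptions, not the algorithm or the benchmark library used in the paper: binding is taken to be XOR of binary codewords, the two codebooks are assumed to be generated by disjoint row blocks `G1` and `G2` of a single full-rank random generator matrix `G`, and the names `encode`, `solve_gf2`, and the dimensions `n`, `k1`, `k2` are hypothetical. Under these assumptions the bound vector is itself a codeword of `G`, so factoring reduces to solving one linear system over the Boolean field.

```python
import numpy as np


def encode(m, G):
    """Map a binary message m to the codeword m @ G over GF(2)."""
    return (m @ G) % 2


def solve_gf2(A, b):
    """Solve x @ A = b over GF(2) by Gauss-Jordan elimination.

    Assumes the rows of A are linearly independent (hypothetical helper,
    written for this sketch only).
    """
    # Work with the equivalent system A.T @ x = b as an augmented matrix.
    M = np.concatenate([A.T % 2, (b % 2).reshape(-1, 1)], axis=1).astype(np.uint8)
    k = M.shape[1] - 1
    row = 0
    for col in range(k):
        candidates = np.nonzero(M[row:, col])[0]
        if candidates.size == 0:
            raise ValueError("rows of A are linearly dependent")
        p = row + candidates[0]
        M[[row, p]] = M[[p, row]]        # move the pivot row into place
        mask = M[:, col].astype(bool)
        mask[row] = False
        M[mask] ^= M[row]                # clear this column in all other rows
        row += 1
    # Rows below the pivots are zero on the left; a nonzero right-hand side
    # there means b is not a codeword.
    if M[k:, -1].any():
        raise ValueError("b is not in the row space of A")
    return M[:k, -1].copy()


rng = np.random.default_rng(0)
n, k1, k2 = 256, 4, 4                    # illustrative sizes, not from the paper

# One random generator matrix, split into two subcodes (assumed full rank,
# which holds with overwhelming probability when n is much larger than k1 + k2).
G = rng.integers(0, 2, size=(k1 + k2, n), dtype=np.uint8)
G1, G2 = G[:k1], G[k1:]

m1 = rng.integers(0, 2, size=k1, dtype=np.uint8)
m2 = rng.integers(0, 2, size=k2, dtype=np.uint8)

# Binding as XOR: the result equals [m1, m2] @ G, a codeword of the full code.
bound = encode(m1, G1) ^ encode(m2, G2)

x = solve_gf2(G, bound)
rec1, rec2 = x[:k1], x[k1:]
print(np.array_equal(rec1, m1), np.array_equal(rec2, m2))
```

In this setting the cost is that of Gaussian elimination on an `n`-by-`(k1 + k2)` system, with no search over the `2**k1 * 2**k2` candidate pairs. That is the kind of saving the subspace structure makes possible in this simplified, noise-free case.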
