High-quality, multi-channel neural recording is indispensable for neuroscience research and clinical applications. Large-scale brain recordings often produce vast amounts of data that must be transmitted wirelessly for offline analysis and decoding, especially in brain-computer interfaces (BCIs) that use high-density intracortical recordings with hundreds or thousands of electrodes. However, transmitting raw neural data is challenging because of limited communication bandwidth and the excessive heating that results. To address this challenge, we propose a neural signal compression scheme based on convolutional autoencoders (CAEs), which achieves a compression ratio of up to 150 for local field potentials (LFPs). The CAE encoder is implemented on RAMAN, an energy-efficient tinyML accelerator designed for edge computing, and deployed on an Efinix Ti60 FPGA, where it uses 37.3k LUTs and 8.6k registers. RAMAN exploits sparsity in activations and weights through zero skipping, gating, and weight compression. In addition, we apply hardware-software co-optimization by pruning the CAE encoder parameters with a hardware-aware balanced stochastic pruning strategy, which resolves workload imbalance and eliminates indexing overhead, reducing parameter storage requirements by up to 32.4%. Using the proposed compact depthwise separable convolutional autoencoder (DS-CAE) model, the compressed neural data from RAMAN is reconstructed offline with signal-to-noise-and-distortion ratios (SNDR) of 22.6 dB and 27.4 dB and R² scores of 0.81 and 0.94, respectively, on two monkey neural recordings.
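The reconstruction metrics quoted above can be illustrated with a minimal NumPy sketch. The definitions used here are assumptions, not necessarily the authors' exact formulas: SNDR is taken as the ratio of signal power to reconstruction-error power in dB, R² is the usual coefficient of determination, and the compression ratio is taken as input bits over latent bits. The function names and the bit-width parameters are hypothetical.

```python
import numpy as np


def sndr_db(x, x_hat):
    """SNDR in dB, assumed here to be 10*log10(signal power / error power)."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    error = x - x_hat
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(error ** 2))


def r2_score(x, x_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    ss_res = np.sum((x - x_hat) ** 2)
    ss_tot = np.sum((x - x.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def compression_ratio(n_input, n_latent, input_bits=16, latent_bits=16):
    """Assumed definition: total input bits divided by total latent bits.

    The default bit widths are placeholders, not values from the paper.
    """
    return (n_input * input_bits) / (n_latent * latent_bits)


if __name__ == "__main__":
    # Toy signal and a slightly perturbed "reconstruction" (synthetic data).
    x = np.array([1.0, 2.0, 3.0, 4.0])
    x_hat = x + 0.1 * np.array([1.0, -1.0, 1.0, -1.0])
    print(f"SNDR = {sndr_db(x, x_hat):.2f} dB, R2 = {r2_score(x, x_hat):.3f}")
```

In practice `x` would be a segment of recorded LFP and `x_hat` the decoder output for that segment; whether the metrics are computed per channel, per segment, or over the whole recording is a choice this sketch leaves open.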