Neurons in the brain communicate information via discrete, point-like events called spikes. The timing of spikes is thought to carry rich information, but it is not clear how to leverage this in digital systems. We demonstrate that event-based encoding is efficient for audio compression. To build this event-based representation, we use a deep binary auto-encoder. Under high sparsity pressure, the model enters a regime where the binary event matrix is stored more efficiently with sparse matrix storage algorithms. We test this on the large MAESTRO dataset of piano recordings against vector-quantized auto-encoders. Not only does our "Spiking Music compression" algorithm achieve a competitive compression/reconstruction trade-off, but selectivity and synchrony between encoded events and piano key strikes also emerge without supervision in the sparse regime.
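To make the storage argument concrete, the following minimal sketch compares two ways of counting the bits needed for a binary event matrix: one bit per entry (dense), or one flat index per event (a simple coordinate-list scheme). It is an illustration only, not the storage algorithm used in this work; the matrix shape, the event rates, and the function names `dense_bits` and `sparse_bits` are assumptions chosen for the example.

```python
import math
import numpy as np


def dense_bits(shape):
    """Bits to store a binary matrix densely: one bit per entry."""
    rows, cols = shape
    return rows * cols


def sparse_bits(events):
    """Bits to store only the positions of the nonzero entries.

    Each event is stored as a flat index into the matrix, which costs
    ceil(log2(rows * cols)) bits per event (illustrative scheme).
    """
    rows, cols = events.shape
    bits_per_index = math.ceil(math.log2(rows * cols))
    return int(events.sum()) * bits_per_index


rng = np.random.default_rng(0)
shape = (64, 1024)  # hypothetical: 64 event channels x 1024 time steps

for rate in (0.5, 0.1, 0.01):
    # Random binary events at the given rate (stand-in for encoder output).
    events = rng.random(shape) < rate
    print(f"rate={rate}: dense={dense_bits(shape)} bits, "
          f"sparse={sparse_bits(events)} bits")
```

With this shape each index costs 16 bits, so under this particular scheme the index list only becomes smaller than the dense bitmap once fewer than about one entry in sixteen is active. This matches the intuition that sparse storage pays off only in the sparse regime.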