This paper is dedicated to lossless data compression with probability estimation performed by neural networks. First, we propose a probability estimation architecture based on a chain of neural predictors, where each unit of the chain is a neural network with the minimum number of weights sufficient for efficient compression of data generated by a Markov source of a given order. We show that this architecture minimizes the total number of weights involved in probability estimation according to the statistical properties of the input data. Second, to improve compression efficiency, we introduce an information inheritance mechanism, in which the probability estimate produced by a lower-order unit is passed to the next higher-order unit. Experimental results show that the proposed lossless data compressor, equipped with the chained probability estimation architecture, achieves compression ratios close to those of the state-of-the-art PAC compressor, while outperforming PAC by a factor of 1.2 to 6.3 in encoding throughput and by a factor of 2.8 to 12.3 in decoding throughput on a consumer GPU.
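The chained estimation with information inheritance described above can be sketched roughly as follows. This is an illustrative toy only, not the paper's implementation: the binary alphabet, the chain depth, the single-linear-layer units, and all names are my own assumptions, and the weights are untrained. Each unit conditions on a context of its own Markov order and additionally receives the probability estimate inherited from the unit one order below.

```python
import numpy as np

ALPHABET = 2     # assumption: binary source symbols
MAX_ORDER = 2    # assumption: chain of units for Markov orders 0..2
RNG = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

class ChainUnit:
    """Toy predictor for one Markov order: a single linear layer over the
    one-hot encoded context plus the inherited lower-order estimate."""
    def __init__(self, order):
        self.order = order
        # input: one-hot of `order` context symbols + inherited probabilities
        in_dim = order * ALPHABET + ALPHABET
        self.W = RNG.normal(0.0, 0.1, (ALPHABET, in_dim))
        self.b = np.zeros(ALPHABET)

    def predict(self, context, inherited):
        x = np.zeros(self.order * ALPHABET)
        recent = context[-self.order:] if self.order else []
        for i, sym in enumerate(recent):
            x[i * ALPHABET + sym] = 1.0           # one-hot context encoding
        x = np.concatenate([x, inherited])        # information inheritance
        return softmax(self.W @ x + self.b)

UNITS = [ChainUnit(k) for k in range(MAX_ORDER + 1)]

def chained_estimate(context):
    """Run the chain: each unit refines the estimate of the unit below it.
    Units whose order exceeds the available context are skipped."""
    p = np.full(ALPHABET, 1.0 / ALPHABET)  # uniform prior before any unit
    for order in range(MAX_ORDER + 1):
        if len(context) < order:
            break
        p = UNITS[order].predict(context, p)
    return p

p = chained_estimate([0, 1, 1])
```

The returned `p` is the final (highest-order) estimate, which would then drive an arithmetic coder; in a real compressor each unit would be trained online and the inherited estimate lets a higher-order unit fall back on lower-order statistics when its own context is rarely seen.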