The quality and quantity of training data greatly influence the performance of deep learning models. In the context of error correction, it is essential to generate high-quality samples that are neither excessively noisy nor entirely correct, but lie close to the decision boundary of the decoding region. To accomplish this, this paper uses a restricted version of a recent result on an Importance Sampling (IS) distribution for fast performance evaluation of linear codes. The IS distribution is applied over a segmented observation space and integrated with active learning. This combination allows samples to be generated iteratively from the shells whose acquisition functions, defined as the error probabilities conditioned on each shell, fall within a specific range. By sampling according to the proposed IS distribution, significant improvements are demonstrated in the performance of BCH(63,36) and BCH(63,45) codes with cycle-reduced parity-check matrices. The proposed IS-based active Weighted Belief Propagation (WBP) decoder shows improvements of up to 0.4 dB in the waterfall region and up to 1.9 dB in the error-floor region of the BER curve over the conventional WBP decoder. The approach can be readily adapted to generate efficient samples for training any other deep learning-based decoder.
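The shell-selection idea described above can be illustrated with a minimal sketch. It is not the paper's IS distribution or decoder: the Hamming(7,4) code, the hard-decision syndrome decoder, the uniform draw of the radius inside each shell, the probe count, and the acceptance range `[lo, hi]` are all assumptions chosen to keep the example small. The observation space is segmented into shells by noise norm, the conditional error probability of each shell is estimated by probing, and only shells whose estimate falls inside the range are kept as sources of training samples.

```python
import numpy as np

# Hamming(7,4) parity-check matrix: an assumed stand-in for the BCH codes.
# Column j (1-based) is the binary expansion of j, least significant bit first.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]], dtype=np.uint8)

def decode_hard(y):
    """Hard-decision syndrome decoder (assumed stand-in for a WBP decoder)."""
    bits = (y < 0).astype(np.uint8)           # BPSK mapping: +1 -> 0, -1 -> 1
    s = (H @ bits) % 2
    pos = int(s[0]) + 2 * int(s[1]) + 4 * int(s[2])  # 1-based error position
    if pos:
        bits[pos - 1] ^= 1                    # correct the single flagged bit
    return bits

def sample_shell(rng, n, r_lo, r_hi, size):
    """Noise vectors with uniform direction and norm in [r_lo, r_hi].

    Drawing the radius uniformly is a simplification made for this sketch.
    """
    g = rng.standard_normal((size, n))
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    r = rng.uniform(r_lo, r_hi, size=(size, 1))
    return g * r

def shell_error_rates(edges, probe=2000, seed=0):
    """Estimate the error probability conditioned on each shell."""
    rng = np.random.default_rng(seed)
    n = H.shape[1]
    rates = []
    for r_lo, r_hi in zip(edges[:-1], edges[1:]):
        # All-zero codeword is sent, so every transmitted symbol is +1.
        y = 1.0 + sample_shell(rng, n, r_lo, r_hi, probe)
        errors = [decode_hard(v).any() for v in y]
        rates.append(float(np.mean(errors)))
    return rates

def select_shells(edges, rates, lo, hi):
    """Keep shells whose acquisition value lies in [lo, hi]."""
    shells = zip(edges[:-1], edges[1:], rates)
    return [(a, b, p) for a, b, p in shells if lo <= p <= hi]

if __name__ == "__main__":
    edges = np.linspace(0.0, 4.0, 9)          # eight shells of width 0.5
    rates = shell_error_rates(edges)
    for a, b, p in select_shells(edges, rates, lo=0.05, hi=0.5):
        print(f"shell [{a:.1f}, {b:.1f}): estimated P(error | shell) = {p:.3f}")
```

In an active-learning loop, training batches would then be drawn only from the selected shells and the estimates refreshed as the decoder's weights change. Shells close to the transmitted codeword are skipped because they are always decoded correctly, and very distant shells are skipped because they are almost always decoded incorrectly.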