The quality and quantity of training data strongly influence the performance of deep learning models. For error correction, it is essential to generate high-quality samples that are neither excessively noisy nor trivially correct, but lie close to the decision boundary of the decoding region. To this end, this paper uses a restricted version of a recent result on the Importance Sampling (IS) distribution for fast performance evaluation of linear codes. The IS distribution is applied over a segmented observation space and integrated with active learning. This combination allows samples to be generated iteratively from those shells whose acquisition functions, defined as the error probabilities conditioned on each shell, fall within a specific range. Sampling according to the proposed IS distribution yields significant performance improvements for BCH(63,36) and BCH(63,45) codes with cycle-reduced parity-check matrices. The proposed IS-based-active Weighted Belief Propagation (WBP) decoder improves on the conventional WBP by up to 0.4 dB in the waterfall region and up to 1.9 dB in the error-floor region of the BER curve. The approach can be readily adapted to generate efficient samples for training any other deep learning-based decoder.
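The shell-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a toy repetition code with a majority-vote decoder in place of BCH codes and WBP, draws noise uniformly on each shell rather than from the paper's IS distribution, and runs a single selection pass instead of an iterative active-learning loop. All function names and parameters (`sample_on_shell`, `select_shells`, `p_lo`, `p_hi`, and so on) are hypothetical.

```python
import numpy as np


def sample_on_shell(rng, n, radius, count):
    """Draw `count` noise vectors uniformly from the sphere of given radius in R^n."""
    g = rng.standard_normal((count, n))
    return radius * g / np.linalg.norm(g, axis=1, keepdims=True)


def toy_decoder_fails(received):
    """Stand-in decoder (assumption): soft majority vote for a repetition code.

    The all-zero codeword is sent as +1 per symbol (BPSK), so decoding
    fails when the sum of the received values is negative.
    """
    return received.sum(axis=1) < 0


def select_shells(rng, n, radii, p_lo, p_hi, trials=2000):
    """Keep shells whose estimated conditional error probability is in [p_lo, p_hi].

    The Monte Carlo estimate plays the role of the acquisition function.
    """
    tx = np.ones(n)  # BPSK image of the all-zero codeword
    selected = []
    for r in radii:
        noise = sample_on_shell(rng, n, r, trials)
        p_err = float(toy_decoder_fails(tx + noise).mean())
        if p_lo <= p_err <= p_hi:
            selected.append((r, p_err))
    return selected


def draw_training_batch(rng, n, selected, per_shell):
    """Generate received vectors from every selected shell."""
    if not selected:
        return np.empty((0, n))
    tx = np.ones(n)
    return np.vstack(
        [tx + sample_on_shell(rng, n, r, per_shell) for r, _ in selected]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shells = select_shells(rng, n=15, radii=[1.0, 3.0, 15.0, 30.0],
                           p_lo=0.05, p_hi=0.4)
    batch = draw_training_batch(rng, 15, shells, per_shell=8)
    print(shells, batch.shape)
```

In this toy setting, shells with radius below the square root of the block length can never cause a decoding failure, so they are discarded as "entirely correct", while the thresholds `p_lo` and `p_hi` keep only shells near the decision boundary.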