We propose a Semantic Ordered Statistics Decoder (sem-OSD), a soft decoder for short linear block codes carrying byte-streamed sources such as natural-language text. Sem-OSD injects a byte-level language-model (LM) prior into ordered statistics decoding (OSD) through a fused bit-level score that combines channel reliability with the LM prior, and uses this score both for most-reliable-basis (MRB) selection and for codeword candidate scoring. Sem-OSD enumerates two complementary test-error-pattern (TEP) families: a bit-flip family that flips up to $m$ bits, and an LM-driven family of up to $\omega$ byte substitutions that reaches error patterns the bit-flip family cannot. The LM prior is computed by a byte-level Transformer fine-tuned for byte-level denoising. Simulation results show that, on the AWGN channel, sem-OSD achieves block error rates (BLERs) below the finite-blocklength normal-approximation bound for uniform sources on both the binary BCH$(127,64)$ code and the shortened RS$(16,8)$ code over GF(256), exceeding Fossorier's OSD by a $1.5$ dB coding gain. On a Gilbert--Elliott burst-error channel, sem-OSD provides $4$ dB and $1$ dB more coding gain than Berlekamp--Massey decoding and OSD, respectively.
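To make the fused-score idea concrete, the following is a minimal, purely illustrative sketch of fusing channel reliability with an LM prior for MRB-style position ranking. The linear combination with weight `lam`, and the function and variable names, are assumptions for illustration only, not the paper's actual formulation:

```python
def fused_bit_scores(llrs, lm_bit_logprobs, lam=0.5):
    """Fuse channel reliability |LLR| with a per-bit LM log-prior.

    llrs: channel log-likelihood ratios, one per bit.
    lm_bit_logprobs: LM log-probabilities of the hard decision per bit
                     (hypothetical interface; the real prior is byte-level).
    lam: fusion weight between channel and LM evidence (assumed form).
    """
    return [lam * abs(l) + (1.0 - lam) * p
            for l, p in zip(llrs, lm_bit_logprobs)]

def most_reliable_positions(scores, k):
    """Return the k positions with the highest fused score,
    i.e., the candidates for the most-reliable basis (MRB)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
```

For example, with LLRs `[2.0, -0.5, 3.0]` and LM log-priors `[-0.1, -2.0, -0.3]`, position 1 has both weak channel evidence and a low LM prior, so it is excluded from a size-2 MRB.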