This paper presents a semantic-enhanced receiver framework for transmitting natural language sentences over noisy wireless channels using multiple short block codes. After ASCII encoding, the sentence is divided into segments, each of which is independently encoded with a short block code and transmitted over an AWGN channel. At the receiver, the segments are decoded in parallel, and a semantic error correction (SEC) model then reconstructs corrupted segments using language-model context. We further propose semantic list decoding (SLD), which generates multiple candidate reconstructions and selects the best one via weighted Hamming distance, and a semantic confidence-guided HARQ (SHARQ) mechanism, which replaces CRC-based error detection with a confidence score and thereby enables selective segment retransmission without CRC overhead. All modules are designed and trained using bidirectional and auto-regressive transformers (BART). Simulation results demonstrate that the proposed scheme significantly outperforms both conventional capacity-approaching short codes and long codes at the same rate. Specifically, SEC provides a BLER gain of approximately 0.4 dB over plain short-code transmission, and SLD extends this gain to 0.8 dB. Compared to transmitting the entire sentence as a single long 5G LDPC codeword, our approach significantly improves semantic fidelity and reduces decoding latency by up to 90\%. SHARQ provides an additional 1.5 dB gain over conventional HARQ.
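The segmentation and candidate-selection steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions: uncoded BPSK stands in for the short block code, the weights in the weighted Hamming distance are taken to be the magnitudes of the received samples (the abstract does not define the weights), the candidate list is hand-written rather than produced by a BART model, and all function names are hypothetical. Candidates are assumed to have the same length as the transmitted sentence.

```python
import random

def text_to_bits(text):
    """ASCII-encode text, 8 bits per character, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("ascii")
            for shift in range(7, -1, -1)]

def split_segments(bits, seg_len):
    """Split the bit stream into fixed-length segments, zero-padding the last one."""
    padded = bits + [0] * (-len(bits) % seg_len)
    return [padded[i:i + seg_len] for i in range(0, len(padded), seg_len)]

def awgn_bpsk(bits, sigma, rng):
    """Map 0 -> +1 and 1 -> -1, then add Gaussian noise with standard deviation sigma.

    Assumption: uncoded BPSK replaces the short block code of the paper.
    """
    return [(1.0 - 2.0 * b) + rng.gauss(0.0, sigma) for b in bits]

def weighted_hamming(candidate_bits, received):
    """Sum |y| over positions where the candidate disagrees with the hard decision on y.

    Assumption: reliability weights are the received-sample magnitudes.
    """
    return sum(abs(y) for b, y in zip(candidate_bits, received)
               if b != (1 if y < 0 else 0))

def select_candidate(candidates, received):
    """Return the candidate sentence whose bits are closest to the channel observation."""
    return min(candidates,
               key=lambda s: weighted_hamming(text_to_bits(s), received))

if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "the cat sat on the mat"
    # Each segment is sent over the channel independently.
    segments = split_segments(text_to_bits(sentence), 32)
    received = [y for seg in segments for y in awgn_bpsk(seg, 0.5, rng)]
    # Hypothetical candidate list; in the paper these would come from the SEC model.
    candidates = ["the cat sat on the mat",
                  "the car sat on the mat",
                  "the cat set on the mat"]
    # Padding samples at the end of `received` are ignored by zip().
    print(select_candidate(candidates, received))
```

Weighting each disagreement by the sample magnitude means that contradicting a reliable channel observation costs more than contradicting a sample near the decision boundary, which is one plausible reading of why a weighted distance, rather than a plain Hamming distance, is used to rank candidates.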