This paper presents a semantic-enhanced receiver framework for transmitting natural language sentences over noisy wireless channels using multiple short block codes. After ASCII encoding, the sentence is divided into segments, each of which is independently encoded with a short block code and transmitted over an AWGN channel. At the receiver, the segments are decoded in parallel and then passed to a semantic error correction (SEC) model, which reconstructs corrupted segments using language model context. We further propose semantic list decoding (SLD), which generates multiple candidate reconstructions and selects the best one via a weighted Hamming distance, and a semantic confidence-guided HARQ (SHARQ) mechanism, which replaces CRC-based error detection with a confidence score and thereby enables selective segment retransmission without CRC overhead. All modules are designed and trained using bidirectional and auto-regressive transformers (BART). Simulation results demonstrate that the proposed scheme significantly outperforms both conventional capacity-approaching short codes and long codes at the same rate. Specifically, SEC provides a block error rate (BLER) gain of approximately 0.4 dB over plain short-code transmission, and SLD extends this gain to 0.8 dB. Compared with transmitting the entire sentence as a single long 5G LDPC codeword, our approach significantly improves semantic fidelity and reduces decoding latency by up to 90%. SHARQ provides an additional 1.5 dB gain over conventional HARQ.
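To make the segmentation and candidate-selection steps concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes 8 bits per ASCII character, zero-padding of the final segment, and per-bit weights that reflect channel reliability (for example, LLR magnitudes); the abstract does not specify how the weights are defined. The function names `ascii_bits`, `split_segments`, `weighted_hamming`, and `select_candidate` are hypothetical, and the candidate list stands in for the output of the BART-based model.

```python
from typing import List, Sequence


def ascii_bits(sentence: str) -> List[int]:
    """Map each character to 8 bits (MSB first) using its ASCII code."""
    data = sentence.encode("ascii")
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def split_segments(bits: Sequence[int], k: int) -> List[List[int]]:
    """Split the bit stream into k-bit segments (assumed zero-padding at the end)."""
    padded = list(bits) + [0] * (-len(bits) % k)
    return [padded[i:i + k] for i in range(0, len(padded), k)]


def weighted_hamming(candidate: Sequence[int],
                     received: Sequence[int],
                     weights: Sequence[float]) -> float:
    """Sum the weights of the positions where candidate and received bits differ."""
    return sum(w for c, r, w in zip(candidate, received, weights) if c != r)


def select_candidate(candidates: Sequence[str],
                     received: Sequence[int],
                     weights: Sequence[float]) -> str:
    """Return the candidate sentence closest to the received hard decisions.

    Assumes every candidate has the same length as the transmitted sentence.
    """
    return min(
        candidates,
        key=lambda text: weighted_hamming(ascii_bits(text), received, weights),
    )


if __name__ == "__main__":
    # Hard decisions spell "cap": one bit of the last character of "cat" was flipped.
    received = ascii_bits("cap")
    weights = [1.0] * len(received)
    weights[21] = 0.1  # assumed unreliable bit (small weight)
    weights[22] = 2.0  # assumed reliable bit (large weight)
    print(select_candidate(["car", "cat"], received, weights))
```

In this toy case both candidates differ from the received bits in exactly one position, so an unweighted Hamming distance cannot separate them. The weights break the tie in favor of the candidate whose disagreement falls on the less reliable bit.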