A Theory for Semantic Communications

Semantic communication, one of the potential key technologies for sixth-generation (6G) communications, has attracted research interest from both academia and industry. However, it is still in its infancy and faces many challenges, such as how to define semantic information and how to measure semantic communication. To address these challenges, we investigate unified semantic information measures and a semantic channel coding theorem. Specifically, to address the shortcoming that existing semantic entropy definitions apply only to specific tasks, we propose a universal semantic entropy, defined as the uncertainty in the semantic interpretation of random variable symbols in the context of knowledge bases. The proposed universal semantic entropy depends not only on the probability distribution but also on the specific values of the symbols and on the background knowledge base. Under specific conditions, the proposed definition reduces to existing semantic entropy definitions and to Shannon entropy. Moreover, since semantic symbols can be transmitted accurately in a semantic communication system even at a non-zero bit error rate, we conjecture that the bit rate of semantic communication may exceed the Shannon channel capacity. Furthermore, we propose a semantic channel coding theorem and prove both its achievability and its converse. Since the well-known Fano's inequality cannot be applied directly to semantic communications, we derive and prove a semantic Fano's inequality and use it to prove the converse. To the best of our knowledge, this is the first theoretical proof that the transmission rate of semantic communication can exceed the Shannon channel capacity, which provides a theoretical basis for semantic communication research.
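
As a rough illustration of why an entropy that depends on a knowledge base can differ from Shannon entropy, the sketch below compares the entropy of a symbol distribution with the entropy of the meanings those symbols map to. It is a toy model under stated assumptions and not the universal semantic entropy defined in this paper: the knowledge base is assumed to be a deterministic symbol-to-meaning map, and the names `shannon_entropy`, `meaning_distribution`, `synonym_kb`, and `identity_kb` are hypothetical.

```python
import math
from collections import defaultdict


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a dict {outcome: probability}."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def meaning_distribution(symbol_probs, knowledge_base):
    """Push a symbol distribution through a symbol -> meaning map.

    Assumption: the knowledge base is a deterministic dictionary that
    assigns exactly one meaning to each symbol.
    """
    meaning_probs = defaultdict(float)
    for symbol, p in symbol_probs.items():
        meaning_probs[knowledge_base[symbol]] += p
    return dict(meaning_probs)


# Four equiprobable symbols (hypothetical toy source).
symbol_probs = {"car": 0.25, "automobile": 0.25, "bus": 0.25, "coach": 0.25}

# Hypothetical knowledge base in which synonyms share one meaning.
synonym_kb = {"car": "CAR", "automobile": "CAR", "bus": "BUS", "coach": "BUS"}

# Knowledge base in which every symbol is its own meaning.
identity_kb = {symbol: symbol for symbol in symbol_probs}

h_symbols = shannon_entropy(symbol_probs)
h_meanings = shannon_entropy(meaning_distribution(symbol_probs, synonym_kb))
h_identity = shannon_entropy(meaning_distribution(symbol_probs, identity_kb))

print(h_symbols, h_meanings, h_identity)
```

In this toy setting the meaning-level entropy is lower than the symbol-level entropy when synonyms are merged, and the two coincide when the map is one-to-one, which mirrors the reduction to Shannon entropy mentioned above.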
