This paper studies joint lossy compression of data and semantics in the finite blocklength regime, where the data and semantic sources are correlated and the encoder observes only the data source. We first introduce an information-theoretic framework for analyzing the nonasymptotic fundamental limits of this problem. Within this framework, we derive nonasymptotic achievability bounds that hold for general sources and distortion measures. We then obtain a second-order achievability bound in the standard block coding setting by applying the two-dimensional Berry-Esseen theorem to these nonasymptotic bounds. Compared with first-order asymptotic bounds, our results account for finite blocklength and may therefore offer insights for the design of practical semantic communication systems.