A composite source, consisting of multiple subsources and a memoryless switch, outputs one symbol at a time from the subsource selected by the switch. When some data from an information source must be encoded more accurately than other data, the composite source model is suitable because it allows different distortion constraints to be placed on the subsources. In this context, we propose subsource-dependent fidelity criteria for composite sources and use them to formulate a rate-distortion problem. We solve the problem and obtain a single-letter expression for the rate-distortion function. Further rate-distortion analysis characterizes the performance of classify-then-compress (CTC) coding, which is frequently used in practice when subsource-dependent fidelity criteria are considered. Our analysis shows that CTC coding generally incurs a performance loss relative to optimal coding, even if the classification is perfect. We also identify the cause of this loss: CTC coding has to reproduce the class labels. Finally, we show that the performance loss is negligible for asymptotically small distortion if CTC coding is appropriately designed and some mild conditions are satisfied.
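As a minimal illustration of the composite source model described above, the following Python sketch draws symbols from subsources selected by a memoryless switch and then measures distortion separately for each subsource. Everything concrete here is an assumption made for illustration: the two toy subsources, the switch probabilities, the Hamming distortion measure, and the function names `composite_source` and `per_subsource_distortion` are hypothetical. In particular, averaging distortion over the symbols of each subsource is only one plausible reading of a subsource-dependent fidelity criterion, not necessarily the definition used in the analysis.

```python
import random
from collections import defaultdict


def composite_source(switch_probs, subsources, n, seed=0):
    """Draw n (label, symbol) pairs from a toy composite source.

    switch_probs: probability that the switch selects each subsource.
    subsources:   callables taking a random.Random and returning one symbol.
    """
    rng = random.Random(seed)
    indices = list(range(len(subsources)))
    samples = []
    for _ in range(n):
        # Memoryless switch: each selection is drawn independently of the past.
        k = rng.choices(indices, weights=switch_probs)[0]
        samples.append((k, subsources[k](rng)))
    return samples


def per_subsource_distortion(samples, reproductions):
    """Average Hamming distortion, computed separately for each subsource.

    Assumption: the distortion of subsource k is averaged only over the
    positions whose symbol was emitted by subsource k.
    """
    total = defaultdict(int)
    count = defaultdict(int)
    for (k, x), x_hat in zip(samples, reproductions):
        total[k] += int(x != x_hat)
        count[k] += 1
    return {k: total[k] / count[k] for k in count}


# Two hypothetical subsources over different alphabets.
subsources = [
    lambda rng: rng.choice("ab"),
    lambda rng: rng.choice("xyz"),
]
samples = composite_source([0.8, 0.2], subsources, n=10_000)

# A deliberately crude reproduction that always outputs "a".
crude = ["a"] * len(samples)
distortion = per_subsource_distortion(samples, crude)
```

With this crude reproduction, every symbol from the second subsource is reproduced incorrectly, while the first subsource is wrong only when it emits "b". A subsource-dependent criterion would bound the two entries of `distortion` separately instead of bounding their weighted average.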