An information-theoretic framework is introduced to analyze last-layer embeddings, focusing on learned representations for regression tasks. We define the representation-rate and derive limits on the reliability with which input-output information can be represented, a limit inherently determined by the entropy of the input source. We further define the representation capacity in a perturbed setting and the representation rate-distortion function for a compressed output. We derive the achievable capacity, the achievable representation-rate, and their converses. Finally, we combine these results in a unified setting.