Dependence among marginally constrained observations can break a finite-sample barrier. To formalize this phenomenon, we introduce the \emph{minimum list entropy coupling} $H(P\|Q_1,\dots,Q_m)$, the minimum conditional entropy $H(X|Y_1,\dots,Y_m)$ over all joint distributions with prescribed discrete marginals $X\sim P$ and $Y_i\sim Q_i$. Unlike classical formulations based on independent observations, our model allows $Y_1,\dots,Y_m$ to be arbitrarily dependent while keeping each marginal fixed. This enlarged coupling space reveals a sharp dichotomy: independent observations reduce residual uncertainty exponentially, whereas dependent observations can eliminate it exactly after finitely many samples. We characterize this zero-entropy regime through necessary and sufficient conditions and give concrete structural criteria under which it occurs. In particular, under mild support assumptions, zero entropy is achieved with $O(\log(1/P_{\min}))$ observations, where $P_{\min}$ is the minimum nonzero mass of $P$. We also develop a greedy algorithm with monotone approximation guarantees for computing $H(P\|Q_1,\dots,Q_m)$. Finally, we show that the same framework formalizes finite-sample limits in distribution-matching representation learning and randomness extraction, where zero entropy corresponds to exact recovery and exact extraction.
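To make the definition concrete, the following minimal sketch evaluates $H(X|Y_1,\dots,Y_m)$ for two feasible couplings of a toy instance. The instance is our own illustration rather than a result from the paper: $P$ is uniform on $\{0,1,2,3\}$ and $Q_1=Q_2$ are Bernoulli$(1/2)$. The names `conditional_entropy`, `marginal`, `bit_coupling`, and `product_coupling` are hypothetical helpers introduced only for this example. The code evaluates two specific couplings and does not compute the minimum over all couplings.

```python
import math
from collections import defaultdict

def conditional_entropy(joint):
    """Return H(X | Y_1..Y_m) in bits for a joint pmf {(x, (y_1..y_m)): prob}."""
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    h = 0.0
    for (_, y), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / p_y[y])
    return h

def marginal(joint, index=None):
    """Marginal pmf of X (index=None) or of Y_index (0-based)."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        key = x if index is None else y[index]
        out[key] += p
    return dict(out)

# Toy instance (an assumption for illustration): P uniform on {0,1,2,3},
# Q_1 = Q_2 = Bernoulli(1/2).

# Coupling A: Y_1 and Y_2 are the high and low bits of X, so X is a
# deterministic function of (Y_1, Y_2).
bit_coupling = {(x, (x >> 1, x & 1)): 0.25 for x in range(4)}

# Coupling B: X, Y_1, Y_2 mutually independent (also feasible, but uninformative).
product_coupling = {
    (x, (y1, y2)): 0.25 * 0.5 * 0.5
    for x in range(4) for y1 in (0, 1) for y2 in (0, 1)
}

for name, joint in [("bit", bit_coupling), ("product", product_coupling)]:
    print(name, "H(X|Y1,Y2) =", conditional_entropy(joint), "bits")
```

Both couplings have the prescribed marginals, yet the bit coupling leaves zero residual entropy while the product coupling leaves the full 2 bits. In this instance $m = 2 = \log_2(1/P_{\min})$ observations suffice, which is in line with the $O(\log(1/P_{\min}))$ scaling stated above.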