We introduce a class of paired binary matrices called admixed arrays, which arise in analyses of large-scale genetic data and can be viewed as weighted edge colorings of complete bipartite graphs. This combinatorial structure gives rise to two natural families of marginal constraints: a row-sum constraint and a paired column-sum constraint, the latter inducing an inequality among entries of the matrix pair. We study the enumeration of admixed arrays under these constraints in dense regimes. First, we obtain exact formulas for the sizes of the families defined by each constraint in isolation and derive a finite-size criterion characterizing when one constraint is more restrictive than the other. In the large-dimension limit, this comparison simplifies to an entropy inequality, yielding an information-theoretic interpretation and a quantifiable error bound in the semi-regular case. We then analyze the asymptotic enumeration of the doubly constrained family in a semi-regular setting. Using saddle-point approximation and probabilistic techniques, we derive a detailed asymptotic expansion for the logarithm of the count, isolating an explicit fourth-moment contribution and establishing quantitative control of the higher-order remainder. A consequence of this analysis is a phenomenon absent from classical binary and integer matrix models: in the regime $N=\Theta(P)$ with uniform margins and density bounded away from zero, the two constraint families obey the independence heuristic with a correction factor $1/\sqrt[4]{e}$ rather than the familiar $e^{\pm1/2}$. Numerical experiments corroborate the analytical approximations, and we implement and extend an algorithm of Miller and Harrison (2013) as open-source software to enumerate constrained admixed arrays.