In this paper, we lay the groundwork for comparing phylogenetic networks using edge contractions and expansions as edit operations, as originally proposed by Robinson and Foulds for comparing trees. We prove that these operations connect the space of all phylogenetic networks on the same set of leaves, even if we forbid contractions that create cycles. This allows us to define an operational distance on this space as the minimum number of contractions and expansions required to transform one network into another. We highlight the difference between this distance and the computation of a maximum common contraction of two networks. Because a maximum common contraction outlines a structure shared by both networks, which can provide valuable biological insights, we study the algorithmic aspects of the latter problem. We first prove that computing a maximum common contraction of two networks is NP-hard, even when the maximum degree, the size of the common contraction, or the number of leaves is bounded. We also provide lower bounds for the problem based on the Exponential-Time Hypothesis. Nonetheless, we provide a polynomial-time algorithm for weakly-galled networks, a generalization of galled trees.
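As an illustration of the edit operation discussed above, the following minimal sketch contracts an edge of a directed acyclic network and rejects contractions that would create a cycle. It is not the paper's formal definition: the edge-set representation, the function names `has_other_path` and `contract`, and the choice to merge the head into the tail are assumptions made for this example. It relies only on the standard fact that contracting an edge (u, v) of a DAG creates a cycle exactly when v can also be reached from u by some other path.

```python
from collections import defaultdict

def has_other_path(succ, u, v):
    """Return True if v is reachable from u by a path that avoids the edge (u, v)."""
    stack = [w for w in succ[u] if w != v]
    seen = set(stack)
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in succ.get(x, ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False

def contract(edges, u, v):
    """Contract edge (u, v) by merging v into u.

    Returns the new edge set, or None if the contraction would create a cycle.
    Assumption: the network is a simple DAG given as a set of (tail, head) pairs;
    parallel edges produced by the merge collapse into one.
    """
    if (u, v) not in edges:
        raise ValueError("(u, v) is not an edge of the network")
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
    if has_other_path(succ, u, v):
        return None  # forbidden: merging u and v would close a cycle
    rename = lambda x: u if x == v else x
    return {(rename(a), rename(b)) for a, b in edges if (a, b) != (u, v)}

# Hypothetical toy network: root r, tree nodes a and b, reticulation h, leaves x, y, z.
net = {("r", "a"), ("r", "b"), ("a", "h"), ("b", "h"),
       ("h", "x"), ("a", "y"), ("b", "z")}
step = contract(net, "a", "h")        # allowed: a reaches h only via the edge itself
blocked = contract(step, "r", "a")    # refused: r also reaches a through b
```

In the toy network, contracting (a, h) is allowed, while contracting (r, a) afterwards is refused because the path r, b, a would become a cycle. The sketch does not check whether an edge is incident to a leaf, nor any other constraint a formal definition may impose.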