This paper investigates when continuous optimization for directed acyclic graph (DAG) structure learning performs well, when it does not, and why, and suggests possible directions for making the search procedure more reliable. Reisach et al. (2021) suggested that the remarkable performance of several continuous structure learning approaches is driven primarily by a high agreement between the order of increasing marginal variances and the topological order, and demonstrated that these approaches perform poorly after data standardization. We analyze this phenomenon for continuous approaches under both the equal and the non-equal noise variances assumptions, and show that this explanation may not hold in either case by providing counterexamples, justifications, and possible alternative explanations. We further demonstrate that nonconvexity may be a main concern, especially for the non-equal noise variances formulation, and that recent advances in continuous structure learning fail to improve performance in this case. Our findings suggest that future work should take the non-equal noise variances formulation into account, both to handle more general settings and to enable a more comprehensive empirical evaluation. Lastly, we provide insights into other aspects of the search procedure, including thresholding and sparsity, and show that they play an important role in the final solutions.
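To make the variance-ordering phenomenon concrete, the following minimal sketch simulates a linear structural equation model on a chain graph with equal noise variances, then checks how often a parent has a smaller marginal variance than its child, before and after standardization. The chain structure, the edge weight of 1.5, the sample size, and the helper names `simulate_linear_sem` and `edge_variance_agreement` are illustrative assumptions, and the edge-wise agreement score is a simplified proxy rather than the measure used by Reisach et al. (2021).

```python
import numpy as np


def simulate_linear_sem(W, n, noise_std, rng):
    """Sample n rows from X = X W + N.

    W[i, j] is the weight of edge i -> j. W is assumed strictly upper
    triangular, so the index order 0..d-1 is a topological order.
    """
    d = W.shape[0]
    N = rng.normal(size=(n, d)) * noise_std
    # Solving X (I - W) = N gives X = N (I - W)^{-1}.
    return N @ np.linalg.inv(np.eye(d) - W)


def edge_variance_agreement(X, W):
    """Fraction of edges i -> j with Var(X_i) < Var(X_j) (simplified proxy)."""
    var = X.var(axis=0)
    parents, children = np.nonzero(W)
    return float(np.mean(var[parents] < var[children]))


rng = np.random.default_rng(0)
d = 5

# Hypothetical chain 0 -> 1 -> 2 -> 3 -> 4 with a fixed edge weight.
W = np.zeros((d, d))
for i in range(d - 1):
    W[i, i + 1] = 1.5

# Equal noise variances: every noise term has standard deviation 1.
X = simulate_linear_sem(W, n=5000, noise_std=np.ones(d), rng=rng)

# On the raw data, marginal variance grows along the chain.
raw_agreement = edge_variance_agreement(X, W)

# Standardization rescales every column to unit variance, so the
# variance ordering no longer carries information about the graph.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
std_variances = X_std.var(axis=0)

print("edge-wise variance agreement (raw):", raw_agreement)
print("marginal variances after standardization:", std_variances)
```

In this toy setting the raw data reveal the topological order through the variances alone, while the standardized data do not. The sketch only illustrates the phenomenon described by Reisach et al. (2021); it does not reproduce the counterexamples or the non-equal noise variances analysis discussed in this paper.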