We study the problem of learning a binary classifier on the vertices of a graph. In particular, we consider classifiers given by monophonic halfspaces, partitions of the vertices that are convex in a certain abstract sense. Monophonic halfspaces, and related notions such as geodesic halfspaces, have recently attracted interest, and several connections have been drawn between their properties (e.g., their VC dimension) and the structure of the underlying graph $G$. We prove several novel results for learning monophonic halfspaces in the supervised, online, and active settings. Our main result is that a monophonic halfspace can be learned with near-optimal passive sample complexity in time polynomial in $n = |V(G)|$. This requires us to devise a polynomial-time algorithm for consistent hypothesis checking, based on several structural insights on monophonic halfspaces and on a reduction to $2$-satisfiability. We prove similar results for the online and active settings. We also show that the concept class can be enumerated with delay $\operatorname{poly}(n)$, and that empirical risk minimization can be performed in time $2^{\omega(G)}\operatorname{poly}(n)$, where $\omega(G)$ is the clique number of $G$. These results answer open questions from the literature (Gonz\'alez et al., 2020), and show a contrast with geodesic halfspaces, for which some of these problems are NP-hard (Seiffarth et al., 2023).
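To make the notion of a monophonic halfspace concrete, the following is a minimal brute-force sketch, assuming the usual definition: a vertex set is monophonically convex if it contains every induced (chordless) path between any two of its vertices, and a halfspace is a set whose complement is convex as well. The function names and the adjacency-dictionary representation are illustrative assumptions. The check enumerates induced paths, so it can take exponential time and is only meant for tiny graphs; it is not the polynomial-time, $2$-satisfiability-based algorithm described above.

```python
from itertools import combinations


def induced_paths(adj, u, v):
    """Yield every induced (chordless) path from u to v as a list of vertices.

    adj is assumed to map each vertex to the set of its neighbours.
    """
    def extend(path):
        last = path[-1]
        if last == v:
            yield list(path)
            return
        for w in adj[last]:
            if w in path:
                continue
            # w may touch only the last vertex of the path, otherwise
            # the extended path would have a chord.
            if any(w in adj[x] for x in path[:-1]):
                continue
            yield from extend(path + [w])

    yield from extend([u])


def is_m_convex(adj, vertices):
    """Brute-force test: every induced path between two members stays inside."""
    s = set(vertices)
    for u, v in combinations(s, 2):
        for path in induced_paths(adj, u, v):
            if not set(path) <= s:
                return False
    return True


def is_monophonic_halfspace(adj, vertices):
    """A set is a halfspace if both it and its complement are m-convex."""
    h = set(vertices)
    return is_m_convex(adj, h) and is_m_convex(adj, set(adj) - h)


# Hypothetical toy graphs: a path on 4 vertices and a cycle on 5 vertices.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
```

On the path graph, splitting the vertices into `{0, 1}` and `{2, 3}` gives a halfspace. On the 5-cycle, `{0, 1}` is convex, but its complement `{2, 3, 4}` is not, because the induced path 2, 1, 0, 4 leaves it.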