We study the problem of learning a binary classifier on the vertices of a graph. In particular, we consider classifiers given by monophonic halfspaces, partitions of the vertices that are convex in a certain abstract sense. Monophonic halfspaces, and related notions such as geodesic halfspaces, have recently attracted interest, and several connections have been drawn between their properties (e.g., their VC dimension) and the structure of the underlying graph $G$. We prove several novel results for learning monophonic halfspaces in the supervised, online, and active settings. Our main result is that a monophonic halfspace can be learned with near-optimal passive sample complexity in time polynomial in $n = |V(G)|$. This requires us to devise a polynomial-time algorithm for consistent hypothesis checking, based on several structural insights on monophonic halfspaces and on a reduction to $2$-satisfiability. We prove similar results for the online and active settings. We also show that the concept class can be enumerated with delay $\operatorname{poly}(n)$, and that empirical risk minimization can be performed in time $2^{\omega(G)}\operatorname{poly}(n)$, where $\omega(G)$ is the clique number of $G$. These results answer open questions from the literature (González et al., 2020), and show a contrast with geodesic halfspaces, for which some of these problems are NP-hard (Seiffarth et al., 2023).
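To make the central notion concrete, the following is a minimal brute-force sketch, assuming the standard definition of monophonic convexity: a vertex set $S$ is convex if every induced (chordless) path between two vertices of $S$ stays inside $S$, and $S$ is a halfspace if both $S$ and $V(G) \setminus S$ are convex. The function names and the adjacency-dictionary representation are illustrative assumptions, not taken from the paper. The check enumerates induced paths and may take exponential time, so it is only suitable for tiny graphs; it is not the polynomial-time algorithm described above.

```python
from itertools import combinations


def induced_paths(adj, u, v):
    """Yield every induced (chordless) path from u to v as a list of vertices.

    `adj` maps each vertex to the set of its neighbours (hypothetical format).
    """
    def extend(path):
        last = path[-1]
        if last == v:
            yield list(path)
            return
        for w in adj[last]:
            if w in path:
                continue
            # w may be adjacent only to the current endpoint; adjacency to
            # any earlier vertex would create a chord.
            if any(w in adj[x] for x in path[:-1]):
                continue
            path.append(w)
            yield from extend(path)
            path.pop()

    yield from extend([u])


def is_m_convex(adj, s):
    """Return True if every induced path between two vertices of s stays in s."""
    s = set(s)
    for u, v in combinations(list(s), 2):
        for path in induced_paths(adj, u, v):
            if not s.issuperset(path):
                return False
    return True


def is_monophonic_halfspace(adj, s):
    """Return True if both s and its complement are monophonically convex."""
    s = set(s)
    return is_m_convex(adj, s) and is_m_convex(adj, set(adj) - s)
```

For instance, on the 4-cycle with vertices 0, 1, 2, 3 in cyclic order, the set {0, 1} passes this check, while {0} does not: its complement {1, 2, 3} contains the non-adjacent pair 1 and 3, and the induced path from 1 to 3 through 0 leaves the complement.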