Given two sets $R$ and $B$ of $n$ points in the plane, we present efficient algorithms to find a two-line linear classifier that best separates the "red" points in $R$ from the "blue" points in $B$ and is robust to outliers. More precisely, we find a region $\mathcal{W}_B$ bounded by two lines (that is, a halfplane, strip, wedge, or double wedge) that contains (most of) the blue points in $B$ and few red points. Our running times range from an optimal $O(n\log n)$ to roughly $O(n^3)$, depending on the type of region $\mathcal{W}_B$ and on whether we wish to minimize only red outliers, only blue outliers, or both.
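
To make the objective concrete, the following minimal sketch evaluates a single candidate region $\mathcal{W}_B$ by counting its outliers: red points inside the region and blue points outside it. It is only an illustration of the quantity being minimized, not one of the algorithms announced above. The line representation `(a, b, c)` for $ax + by + c = 0$, the closed-halfplane convention, and all function names are assumptions introduced here.

```python
from typing import Callable, Iterable, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]  # (a, b, c), representing a*x + b*y + c = 0 (assumed encoding)


def side(line: Line, p: Point) -> float:
    """Signed value of the line equation at p; the sign tells which side p lies on."""
    a, b, c = line
    x, y = p
    return a * x + b * y + c


def in_wedge(l1: Line, l2: Line, p: Point) -> bool:
    """Intersection of the two closed halfplanes side >= 0.

    With parallel lines this is a strip; with l1 == l2 it is a halfplane.
    """
    return side(l1, p) >= 0 and side(l2, p) >= 0


def in_double_wedge(l1: Line, l2: Line, p: Point) -> bool:
    """Two opposite wedges: p lies on the same side of both lines (or on a line)."""
    return side(l1, p) * side(l2, p) >= 0


def count_outliers(
    red: Iterable[Point],
    blue: Iterable[Point],
    l1: Line,
    l2: Line,
    contains: Callable[[Line, Line, Point], bool] = in_wedge,
) -> Tuple[int, int]:
    """Return (red points inside the region, blue points outside the region)."""
    red_out = sum(1 for p in red if contains(l1, l2, p))
    blue_out = sum(1 for p in blue if not contains(l1, l2, p))
    return red_out, blue_out


# Hypothetical toy input: the wedge is the quadrant x >= 0.5, y >= 0.5.
red_pts = [(0.0, 0.0), (3.0, 3.0)]
blue_pts = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0)]
l1 = (1.0, 0.0, -0.5)  # x - 0.5 = 0
l2 = (0.0, 1.0, -0.5)  # y - 0.5 = 0

wedge_result = count_outliers(red_pts, blue_pts, l1, l2, in_wedge)
double_result = count_outliers(red_pts, blue_pts, l1, l2, in_double_wedge)
```

Each evaluation takes linear time for fixed lines. Trying all candidate line pairs this way would be far slower than the running times stated above, so this helper is more suitable as a correctness check for a faster implementation.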