Given two sets $\mathit{R}$ and $\mathit{B}$ of at most $\mathit{n}$ points in the plane, we present efficient algorithms to find a two-line linear classifier that best separates the "red" points in $\mathit{R}$ from the "blue" points in $\mathit{B}$ and is robust to outliers. More precisely, we find a region $\mathit{W}_\mathit{B}$ bounded by two lines (that is, a halfplane, strip, wedge, or double wedge) that contains most of the blue points in $\mathit{B}$ and few of the red points in $\mathit{R}$. Our running times range from an optimal $O(n\log n)$ to $O(n^4)$, depending on the type of region $\mathit{W}_\mathit{B}$ and on whether we wish to minimize only red outliers, only blue outliers, or both.
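To make the objective concrete, the following minimal sketch counts the outliers of one fixed candidate region: red points inside $\mathit{W}_\mathit{B}$ and blue points outside it. It only evaluates a given pair of lines and is not one of the algorithms referred to above. The line representation `(a, b, c)`, the closed-region convention, and all function names are assumptions made for this illustration.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]  # assumed encoding: a*x + b*y + c = 0


def side(line: Line, p: Point) -> float:
    """Signed value of the line equation at p (>= 0 counts as the 'positive' side)."""
    a, b, c = line
    return a * p[0] + b * p[1] + c


def in_wedge(l1: Line, l2: Line, p: Point) -> bool:
    """Intersection of two closed halfplanes. With parallel lines this is a
    strip; with l1 == l2 it degenerates to a single halfplane."""
    return side(l1, p) >= 0 and side(l2, p) >= 0


def in_double_wedge(l1: Line, l2: Line, p: Point) -> bool:
    """Points lying on the same side of both lines (both >= 0 or both <= 0)."""
    return side(l1, p) * side(l2, p) >= 0


def count_outliers(
    contains: Callable[[Point], bool],
    red: Sequence[Point],
    blue: Sequence[Point],
) -> Tuple[int, int]:
    """Return (red outliers, blue outliers) for the region described by `contains`:
    red points inside the region and blue points outside it."""
    red_out = sum(1 for p in red if contains(p))
    blue_out = sum(1 for p in blue if not contains(p))
    return red_out, blue_out


# Hypothetical toy data: the two lines are the coordinate axes.
l1: Line = (1.0, 0.0, 0.0)  # x >= 0
l2: Line = (0.0, 1.0, 0.0)  # y >= 0
blue = [(1.0, 1.0), (2.0, 3.0), (-1.0, -2.0)]
red = [(-1.0, 2.0), (3.0, -1.0), (2.0, 2.0)]

wedge_result = count_outliers(lambda p: in_wedge(l1, l2, p), red, blue)
double_wedge_result = count_outliers(lambda p: in_double_wedge(l1, l2, p), red, blue)
print(wedge_result)         # (1, 1)
print(double_wedge_result)  # (1, 0)
```

On this toy input the double wedge captures the blue point in the third quadrant that the single wedge misses, so its blue-outlier count drops from 1 to 0 while the red-outlier count stays at 1. Evaluating one candidate this way takes linear time; the choice of which outlier count to minimize corresponds to the three variants mentioned above.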