Conventional object detection methods output rectangular bounding boxes, which cannot accurately describe actual object shapes. Instance segmentation methods output pixel-level labels, which are computationally expensive for real-time applications. A polygon representation is therefore needed to achieve precise shape alignment while retaining low computational cost. We develop a novel Deformable Polar Polygon Object Detection method (DPPD) to detect objects in polygon shapes. In particular, our network predicts, for each object, a sparse set of flexible vertices to construct the polygon, where each vertex is represented by an angle-distance pair in the polar coordinate system. To enable training, both ground-truth and predicted polygons are densely resampled to have the same number of vertices, with equally spaced ray points. The resampling operation is fully differentiable, allowing gradient back-propagation. Sparse polygon prediction ensures high-speed inference at runtime, while dense resampling allows the network to learn object shapes with high precision. The polygon detection head is built on top of an anchor-free and NMS-free network architecture. DPPD has been demonstrated successfully in various object detection tasks for autonomous driving, including the detection of traffic signs, crosswalks, vehicles, and pedestrians.
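The polar vertex representation and the dense resampling step can be illustrated with a small sketch. This is not the authors' implementation: the function names `polar_to_cartesian` and `resample_polar_polygon`, the assumption that vertices are sorted by angle around a center inside the object, and the ray-edge intersection formula used here are illustrative choices. For a ray at angle φ crossing the edge between vertices (θ₁, r₁) and (θ₂, r₂), the sketch uses r(φ) = r₁ r₂ sin(θ₂ − θ₁) / (r₁ sin(φ − θ₁) + r₂ sin(θ₂ − φ)), which is a smooth function of the vertex angles and distances within each edge, so the same arithmetic written in an automatic differentiation framework would admit gradients.

```python
import math

TWO_PI = 2.0 * math.pi


def polar_to_cartesian(vertices, center=(0.0, 0.0)):
    """Convert (angle, distance) vertices around `center` to (x, y) points."""
    cx, cy = center
    return [(cx + r * math.cos(t), cy + r * math.sin(t)) for t, r in vertices]


def resample_polar_polygon(vertices, num_rays):
    """Resample a sparse polar polygon at `num_rays` equally spaced ray angles.

    Assumptions of this sketch (not taken from the paper):
      * `vertices` is a list of (angle, distance) pairs sorted by angle,
        with angles in [0, 2*pi) and positive distances;
      * there are at least 3 vertices and the angular gap between
        consecutive vertices is smaller than pi.
    Returns a list of (ray_angle, distance) pairs of length `num_rays`.
    """
    n = len(vertices)
    dense = []
    for k in range(num_rays):
        phi = TWO_PI * k / num_rays
        # The edge crossed by the ray starts at the vertex closest to phi
        # when moving clockwise, i.e. the smallest counterclockwise offset.
        i = min(range(n), key=lambda j: (phi - vertices[j][0]) % TWO_PI)
        t1, r1 = vertices[i]
        t2, r2 = vertices[(i + 1) % n]
        span = (t2 - t1) % TWO_PI      # angular width of the edge
        offset = (phi - t1) % TWO_PI   # ray position inside the edge
        # Distance from the center to the ray-edge intersection point.
        denom = r1 * math.sin(offset) + r2 * math.sin(span - offset)
        dense.append((phi, r1 * r2 * math.sin(span) / denom))
    return dense


# Usage: a square with corners at (+-1, +-1), described by 4 sparse vertices.
square = [(TWO_PI * k / 8, math.sqrt(2.0)) for k in (1, 3, 5, 7)]
dense_square = resample_polar_polygon(square, 8)
# Distances alternate between about 1 (edge midpoints) and about sqrt(2) (corners).
print([round(r, 3) for _, r in dense_square])
```

Because the ground-truth and predicted polygons would both be resampled at the same ray angles, a training loss could compare them distance by distance, even when the two polygons have different numbers of sparse vertices.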