Object skeletons offer a concise representation of structural information, capturing essential aspects of posture and orientation that are crucial for autonomous driving applications. However, a unified architecture that simultaneously handles multiple instances and categories using only the input image remains elusive. In this paper, we introduce PoseDriver, a unified framework for bottom-up multi-category skeleton detection tailored to common objects in driving scenarios. We model each category as a distinct task to systematically address the challenges of multi-task learning. Specifically, we propose a novel approach for lane detection based on skeleton representations, achieving state-of-the-art performance on the OpenLane dataset. Moreover, we present a new dataset for bicycle skeleton detection and assess the transferability of our framework to novel categories. Experimental results validate the effectiveness of the proposed approach.