Object skeletons offer a concise representation of structural information, capturing essential aspects of posture and orientation that are crucial for autonomous driving applications. However, a unified architecture that simultaneously handles multiple instances and categories using only the input image remains elusive. In this paper, we introduce PoseDriver, a unified framework for bottom-up multi-category skeleton detection tailored to common objects in driving scenarios. We model each category as a distinct task to systematically address the challenges of multi-task learning. Specifically, we propose a novel approach for lane detection based on skeleton representations, achieving state-of-the-art performance on the OpenLane dataset. Moreover, we present a new dataset for bicycle skeleton detection and assess the transferability of our framework to novel categories. Experimental results validate the effectiveness of the proposed approach.