Bird's-eye-view (BEV) representations are tremendously useful for perception-related automated driving tasks. However, generating BEVs from surround-view fisheye camera images is challenging because of the strong distortions that such wide-angle lenses introduce. We take the first step in addressing this challenge and introduce a baseline, F2BEV, to generate discretized BEV height maps and BEV semantic segmentation maps from fisheye images. F2BEV consists of a distortion-aware spatial cross-attention module, which queries and consolidates spatial information from fisheye image features in a transformer-style architecture, followed by a task-specific head. We evaluate single-task and multi-task variants of F2BEV on our synthetic FB-SSEM dataset; all of them generate better BEV height and segmentation maps (in terms of IoU) than a state-of-the-art BEV generation method operating on undistorted fisheye images. We also demonstrate discretized height map generation from real-world fisheye images using F2BEV. Our dataset is publicly available at https://github.com/volvo-cars/FB-SSEM-dataset
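Since the comparison above is reported in terms of IoU, the following minimal sketch shows how intersection-over-union can be computed for a discretized BEV map whose cells hold integer class labels (semantic classes or height bins). It is an illustration under stated assumptions, not the evaluation code used for F2BEV: the function names `per_class_iou` and `mean_iou`, the nested-list grid layout, and the choice to skip classes absent from both maps are all hypothetical.

```python
from typing import Dict, Sequence

# Assumed layout: a BEV map is a grid (rows of cells) of integer class labels.
Grid = Sequence[Sequence[int]]


def per_class_iou(pred: Grid, target: Grid, num_classes: int) -> Dict[int, float]:
    """Return IoU for each class that occurs in the prediction or the target."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # The cell belongs to both the predicted and the true region.
                intersection[p] += 1
                union[p] += 1
            else:
                # The cell counts toward the union of both classes involved.
                union[p] += 1
                union[t] += 1
    # Assumption: classes absent from both maps are skipped, not scored as 0.
    return {c: intersection[c] / union[c] for c in range(num_classes) if union[c] > 0}


def mean_iou(pred: Grid, target: Grid, num_classes: int) -> float:
    """Average the per-class IoU values; NaN if no class is present."""
    ious = per_class_iou(pred, target, num_classes)
    return sum(ious.values()) / len(ious) if ious else float("nan")


# Toy 2x3 BEV grids with three classes.
pred = [[0, 0, 1],
        [2, 1, 1]]
target = [[0, 1, 1],
          [2, 2, 1]]
print(per_class_iou(pred, target, num_classes=3))
print(mean_iou(pred, target, num_classes=3))
```

In this toy case each class has an IoU of 0.5, so the mean is also 0.5. Whether the reported scores average over classes, over samples, or both is not stated in the abstract, so the averaging here should be read as one plausible convention.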