This study investigates the effectiveness of modern Deformable Convolutional Neural Networks (DCNNs) for semantic segmentation, particularly in autonomous driving scenarios that rely on fisheye images. While these images provide a wide field of view, they pose unique challenges for extracting spatial and geometric information because object attributes change dynamically across the distorted image. Our experiments segment the WoodScape fisheye image dataset into ten distinct classes, assessing the ability of Deformable Networks to capture intricate spatial relationships and improve segmentation accuracy. We also explore different loss functions to address class imbalance and compare conventional CNN architectures with their Deformable Convolution-based counterparts, including Vanilla U-Net and Residual U-Net architectures. The significant improvement in mIoU score obtained by integrating Deformable CNNs demonstrates their effectiveness in handling the geometric distortions of fisheye imagery, exceeding the performance of traditional CNN architectures and underscoring the role of deformable convolution in enhancing semantic segmentation for fisheye images.