Semantic segmentation is a fundamental perception task in autonomous driving, particularly for identifying drivable areas and lane markings to enable safe navigation. However, most state-of-the-art (SOTA) models are computationally intensive and unsuitable for real-time deployment on resource-constrained embedded devices. In this paper, we introduce TwinLiteNet+, an enhanced multi-task segmentation model designed for efficient real-time drivable area and lane segmentation. TwinLiteNet+ employs a hybrid encoder architecture that integrates stride-based dilated convolutions and depthwise separable dilated convolutions, balancing representational capacity and computational cost. To improve task-specific decoding, we propose two lightweight upsampling modules, the Upper Convolution Block (UCB) and the Upper Simple Block (USB), alongside a Partial Class Activation Attention (PCAA) mechanism that enhances segmentation precision. The model is available in four configurations, ranging from the ultra-compact TwinLiteNet+_{Nano} (34K parameters) to the high-performance TwinLiteNet+_{Large} (1.94M parameters). On the BDD100K dataset, TwinLiteNet+_{Large} achieves 92.9% mIoU for drivable area segmentation and 34.2% IoU for lane segmentation, surpassing existing SOTA models while requiring 11× fewer floating-point operations (FLOPs). Extensive evaluations on embedded devices demonstrate superior inference speed, quantization robustness (INT8/FP16), and energy efficiency, validating TwinLiteNet+ as a compelling solution for real-world autonomous driving systems. Code is available at https://github.com/chequanghuy/TwinLiteNetPlus.
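To illustrate why depthwise separable dilated convolutions can lower cost without shrinking the receptive field, the following minimal sketch counts weights for a standard convolution and for its depthwise separable counterpart. The channel counts and kernel size are hypothetical values chosen for illustration and are not taken from the TwinLiteNet+ configurations; the function names are likewise illustrative.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted).

    Dilation spreads the kernel taps apart but adds no weights.
    """
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Spatial extent covered by a k-tap kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer: 64 -> 64 channels, 3 x 3 kernel (not from the paper).
c_in, c_out, k = 64, 64, 3
standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights ({standard / separable:.1f}x fewer)")
for d in (1, 2, 4):
    print(f"dilation {d}: effective kernel {effective_kernel(k, d)}")
```

Under these assumed sizes the separable form needs roughly 8× fewer weights, while a dilation rate of 4 widens a 3 × 3 kernel to cover a 9 × 9 window at no extra parameter cost. This is only the general trade-off behind such layers, not a reproduction of the model's reported parameter or FLOP figures.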