TwinLiteNet+: An Enhanced Multi-Task Segmentation Model for Autonomous Driving

Semantic segmentation is a fundamental perception task in autonomous driving, particularly for identifying drivable areas and lane markings to enable safe navigation. However, most state-of-the-art (SOTA) models are computationally intensive and unsuitable for real-time deployment on resource-constrained embedded devices. In this paper, we introduce TwinLiteNet+, an enhanced multi-task segmentation model designed for efficient, real-time drivable area and lane segmentation. TwinLiteNet+ employs a hybrid encoder architecture that integrates stride-based dilated convolutions and depthwise separable dilated convolutions, balancing representational capacity and computational cost. To improve task-specific decoding, we propose two lightweight upsampling modules, the Upper Convolution Block (UCB) and the Upper Simple Block (USB), alongside a Partial Class Activation Attention (PCAA) mechanism that enhances segmentation precision. The model is available in four configurations, ranging from the ultra-compact TwinLiteNet+_{Nano} (34K parameters) to the high-performance TwinLiteNet+_{Large} (1.94M parameters). On the BDD100K dataset, TwinLiteNet+_{Large} achieves 92.9% mIoU for drivable area segmentation and 34.2% IoU for lane segmentation, surpassing existing SOTA models while requiring 11× fewer floating-point operations (FLOPs). Extensive evaluations on embedded devices demonstrate superior inference speed, robustness to quantization (INT8/FP16), and energy efficiency, validating TwinLiteNet+ as a compelling solution for real-world autonomous driving systems. Code is available at https://github.com/chequanghuy/TwinLiteNetPlus.
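To illustrate why depthwise separable dilated convolutions can lower cost, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise separable one, and shows how dilation enlarges the spatial extent of a kernel without adding weights. The function names, channel widths, and dilation rates are hypothetical values chosen for illustration; they are not taken from the TwinLiteNet+ configurations.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias omitted)."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k-tap kernel at the given dilation rate.
    Dilation inserts (dilation - 1) gaps between taps and adds no weights."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer shape, not a TwinLiteNet+ setting.
    c_in, c_out, k = 64, 64, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std} weights, depthwise separable: {sep} weights")
    for d in (1, 2, 4, 8):  # hypothetical dilation rates
        print(f"dilation {d}: effective kernel {effective_kernel_size(k, d)}")
```

Under these assumed shapes the separable variant uses roughly an eighth of the weights of the standard convolution, while a dilation rate of 4 lets a 3 × 3 kernel cover a 9 × 9 region. The actual savings in TwinLiteNet+ depend on its real layer widths and dilation rates, which the abstract does not specify.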
