Environmental perception in Automated Valet Parking (AVP) is challenging because of severe occlusions in parking garages. Although Collaborative Perception (CP) can broaden the field of view of connected vehicles, the limited bandwidth of vehicular communications restricts its application. In this work, we propose a BEV feature-based CP network architecture for infrastructure-assisted AVP systems. The model takes the roadside camera and LiDAR as optional inputs and adaptively fuses them with onboard sensors in a unified BEV representation. An autoencoder and downsampling reduce the channel and spatial dimensions, respectively, while sparsification and quantization further compress the feature map with little loss in data precision. Combined, these techniques compress a BEV feature map enough to fit within the feasible data rate of the NR-V2X network. On a synthetic AVP dataset, we observe that CP effectively improves perception performance, especially for pedestrians. Moreover, we demonstrate the advantage of infrastructure-assisted CP in two typical safety-critical AVP scenarios, where it increases the maximum safe cruising speed by up to 3 m/s in both scenarios.
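To make the compression steps concrete, the following is a minimal NumPy sketch of spatial downsampling, sparsification, and 8-bit quantization applied to a BEV feature map. It is an illustration under stated assumptions, not the architecture proposed in this work: the tensor shape, pooling factor, keep ratio, and all function names are hypothetical, the learned autoencoder for channel reduction is omitted, and average pooling and norm-based top-k selection stand in for whatever operators the actual model uses.

```python
import numpy as np

def downsample(feat, factor):
    """Average-pool a (C, H, W) feature map by an integer factor along H and W."""
    c, h, w = feat.shape
    assert h % factor == 0 and w % factor == 0
    return feat.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def sparsify(feat, keep_ratio):
    """Keep only the BEV cells with the largest channel-wise L2 norm."""
    c, h, w = feat.shape
    assert h * w <= 2**16, "cell indices are stored as uint16"
    energy = np.linalg.norm(feat, axis=0).ravel()      # one score per BEV cell
    k = max(1, int(round(keep_ratio * h * w)))
    idx = np.sort(np.argpartition(energy, -k)[-k:])    # flat indices of kept cells
    values = feat.reshape(c, -1)[:, idx]               # (C, k) surviving features
    return idx.astype(np.uint16), values

def quantize(values):
    """Uniform 8-bit quantization; returns codes plus the offset and step size."""
    lo, hi = float(values.min()), float(values.max())
    scale = (hi - lo) / 255 or 1.0                     # avoid a zero step size
    codes = np.round((values - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Invert quantize() up to half a quantization step."""
    return codes.astype(np.float32) * scale + lo

def reconstruct(idx, values, shape):
    """Scatter the kept cells back into a dense (C, H, W) map, zeros elsewhere."""
    c, h, w = shape
    dense = np.zeros((c, h * w), dtype=np.float32)
    dense[:, idx] = values
    return dense.reshape(c, h, w)

# Hypothetical BEV feature map: 8 channels on a 64 x 64 grid (assumed sizes).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 64, 64)).astype(np.float32)

small = downsample(feat, factor=2)                     # (8, 32, 32)
idx, vals = sparsify(small, keep_ratio=0.25)           # keep 256 of 1024 cells
codes, lo, scale = quantize(vals)

raw_bytes = feat.nbytes                                # uncompressed float32 map
payload_bytes = idx.nbytes + codes.nbytes + 8          # + two float32 scalars
restored = reconstruct(idx, dequantize(codes, lo, scale), small.shape)
max_err = float(np.max(np.abs(dequantize(codes, lo, scale) - vals)))

print(f"raw: {raw_bytes} B, payload: {payload_bytes} B")
print(f"max quantization error on kept cells: {max_err:.4f} (step {scale:.4f})")
```

In this toy setting the transmitted payload consists of the kept-cell indices, the 8-bit codes, and two scalars, so its size depends only on the grid size, the keep ratio, and the channel count. Whether such a payload fits an NR-V2X data rate depends on the message frequency and link budget, which this sketch does not model.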