Autonomous vehicles (AVs) take natural images and video as input and interpret the real world by inferring and overlaying digital elements, which supports proactive detection in the interest of safety. A crucial part of this process is accurate, real-time object recognition through automatic scene analysis. Traditional methods concentrate primarily on 2D object detection; 3D object detection, which places 3D bounding boxes in the three-dimensional environment, is also significant and can be notably enhanced using the augmented reality (AR) ecosystem. This study examines an AI model's ability to infer 3D bounding boxes during real-time scene analysis, evaluating the model's performance and processing time in the virtual domain before applying it to AVs. The work also employs a synthetic dataset of artificially generated images that mimic various environmental, lighting, and spatiotemporal states. The evaluation focuses on images of objects in diverse weather conditions, captured with varying camera settings. These variations make detection and recognition more challenging, and the outcomes of this work help achieve competitive results under most of the tested conditions.