The objective of augmented reality (AR) is to add digital content to natural images and videos, creating an interactive experience between the user and the environment. Scene analysis and object recognition are central to AR and must be performed both quickly and accurately. This study proposes an approach that combines oriented bounding boxes with a deep network for detection and recognition, with the aim of improving both detection performance and processing time. The approach is evaluated on two datasets: a real image dataset (the DOTA dataset) commonly used for computer vision tasks, and a synthetic dataset that simulates different environmental, lighting, and acquisition conditions. The evaluation focuses on small objects, which are difficult to detect and recognise. The results indicate that the proposed approach tends to produce higher Average Precision and accuracy for small objects in most of the tested conditions.
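To illustrate why an oriented bounding box can help with small objects, the following minimal sketch compares the area of an oriented box with the area of the axis-aligned box needed to enclose it. The parameterisation (centre, width, height, rotation angle in radians) and the function names `obb_corners`, `enclosing_aabb`, and `box_area` are assumptions made for this illustration; the abstract does not specify how the network encodes its boxes.

```python
import math


def obb_corners(cx, cy, w, h, angle_rad):
    """Return the four corners of an oriented box as (x, y) tuples.

    Assumed parameterisation: centre (cx, cy), size (w, h), and a
    counter-clockwise rotation of angle_rad about the centre.
    """
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    half_extents = [(-w / 2, -h / 2), (w / 2, -h / 2),
                    (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset, then translate it to the box centre.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c)
            for dx, dy in half_extents]


def enclosing_aabb(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def box_area(aabb):
    """Area of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = aabb
    return (x_max - x_min) * (y_max - y_min)


# A thin 4 x 1 object rotated by 45 degrees.
corners = obb_corners(0.0, 0.0, 4.0, 1.0, math.pi / 4)
oriented_area = 4.0 * 1.0
axis_aligned_area = box_area(enclosing_aabb(corners))
print(f"oriented: {oriented_area:.2f}, axis-aligned: {axis_aligned_area:.2f}")
```

In this example the axis-aligned box covers roughly three times the area of the oriented box, so most of its pixels are background. This is one plausible reason an oriented representation could benefit small, elongated objects, though the abstract itself does not state the mechanism.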