FishEye8K: A Benchmark and Dataset for Fisheye Camera Object Detection

Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Erkhembayar Ganbold, Jun-Wei Hsieh, Ming-Ching Chang, Ping-Yang Chen, Byambaa Dorj, Hamad Al Jassmi, Ganzorig Batnasan, Fady Alnajjar, Mohammed Abduljabbar, Fang-Pang Lin

From arXiv, CVPR Workshops 2023

With the advance of AI, road object detection has become a prominent topic in computer vision, with most work relying on perspective cameras. A fisheye lens provides omnidirectional wide coverage, so fewer cameras are needed to monitor a road intersection, but it introduces view distortion. To our knowledge, no open dataset exists for traffic surveillance with fisheye cameras. This paper introduces FishEye8K, an open benchmark dataset for road object detection that comprises 157K bounding boxes across five classes (Pedestrian, Bike, Car, Bus, and Truck). In addition, we present benchmark results for state-of-the-art (SoTA) models, including variants of YOLOv5, YOLOR, YOLOv7, and YOLOv8. The dataset comprises 8,000 images recorded in 22 videos using 18 fisheye cameras for traffic monitoring in Hsinchu, Taiwan, at resolutions of 1080$\times$1080 and 1280$\times$1280. Data annotation and validation were arduous and time-consuming because the ultra-wide panoramic, hemispherical fisheye images have large distortion and contain numerous road participants, particularly people riding scooters. To avoid bias, all frames from a given camera were assigned to either the training set or the test set, maintaining a ratio of about 70:30 for both the number of images and the number of bounding boxes in each class. Experimental results show that YOLOv8 and YOLOR perform best at input sizes of 640$\times$640 and 1280$\times$1280, respectively. The dataset will be available on GitHub in PASCAL VOC, MS COCO, and YOLO annotation formats. The FishEye8K benchmark is expected to contribute to fisheye video analytics and smart city applications.
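
The camera-level split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `split_by_camera`, the greedy assignment rule, and the per-camera image counts are all assumptions made for illustration, and the sketch balances image counts only, whereas the paper also keeps the per-class bounding-box ratio near 70:30.

```python
from typing import Dict, List, Tuple


def split_by_camera(
    images_per_camera: Dict[str, int], train_fraction: float = 0.7
) -> Tuple[List[str], List[str]]:
    """Assign whole cameras to train or test, aiming for train_fraction of images in train."""
    total = sum(images_per_camera.values())
    target = train_fraction * total
    train: List[str] = []
    test: List[str] = []
    train_count = 0
    # Visit the largest cameras first so that smaller ones can fine-tune the ratio.
    ordered = sorted(images_per_camera.items(), key=lambda kv: (-kv[1], kv[0]))
    for camera, count in ordered:
        # Put the camera in train only if that moves the train count closer to the target.
        if abs(train_count + count - target) < abs(train_count - target):
            train.append(camera)
            train_count += count
        else:
            test.append(camera)
    return train, test


# Made-up image counts for illustration only (not the real FishEye8K per-camera counts).
counts = {"cam01": 400, "cam02": 300, "cam03": 200, "cam04": 100}
train_cams, test_cams = split_by_camera(counts)
train_share = sum(counts[c] for c in train_cams) / sum(counts.values())
```

Because each camera lands in exactly one subset, no scene or viewpoint appears in both training and test data, which is the source of bias the abstract aims to avoid.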
