A Benchmark Grocery Dataset of Realworld Point Clouds From Single View

Fine-grained grocery object recognition is an important computer vision problem with broad applications in automatic checkout, in-store robotic navigation, and assistive technologies for the visually impaired. Existing grocery datasets consist mainly of 2D images, so models trained on them are limited to learning features from regular 2D grids. While portable 3D sensors such as Kinect have been commonly available, sensors such as LiDAR and TrueDepth have recently been integrated into mobile phones. Despite the availability of mobile 3D sensors, there are currently no dedicated large-scale, real-world benchmark 3D datasets for groceries. In addition, existing 3D datasets lack fine-grained grocery categories and have limited training samples. Furthermore, capturing an object by moving around it, rather than taking a single photo, makes data collection cumbersome. We therefore introduce a large-scale grocery dataset called 3DGrocery100. It comprises 100 classes, with a total of 87,898 3D point clouds created from 10,755 single-view RGB-D images. We benchmark our dataset on six recent state-of-the-art 3D point cloud classification models. We also benchmark the dataset on few-shot and continual learning point cloud classification tasks. Project Page: https://bigdatavision.org/3DGrocery100/.
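The abstract does not say how the point clouds are derived from the RGB-D images. As a rough illustration only, the sketch below shows the standard pinhole back-projection that turns one single-view depth map plus its color image into a colored point cloud. The function name `backproject_rgbd` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical choices for this example, not part of the 3DGrocery100 pipeline.

```python
from typing import List, Sequence, Tuple

# One colored point: x, y, z in metres, then r, g, b in 0..255.
ColoredPoint = Tuple[float, float, float, int, int, int]


def backproject_rgbd(
    depth: Sequence[Sequence[float]],
    rgb: Sequence[Sequence[Tuple[int, int, int]]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[ColoredPoint]:
    """Back-project a depth map into camera-frame 3D points.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), depth in metres, and 0 meaning "no reading".
    """
    points: List[ColoredPoint] = []
    for v, row in enumerate(depth):          # v: pixel row
        for u, z in enumerate(row):          # u: pixel column
            if z <= 0:
                continue                     # skip pixels without depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            r, g, b = rgb[v][u]
            points.append((x, y, z, r, g, b))
    return points


# Toy 2x2 frame with made-up values; one pixel has no depth reading.
depth = [[1.0, 0.0],
         [2.0, 2.0]]
rgb = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (255, 255, 255)]]
cloud = backproject_rgbd(depth, rgb, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Because only one viewpoint is used, the resulting cloud covers just the surface facing the camera, which is what distinguishes single-view capture from scanning around the object.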


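A point cloud obtained by laser measurement contains 3D coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains 3D coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains 3D coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of every sampled point on an object's surface have been acquired, the result is a set of points, which is called a "point cloud".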
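The three variants described above differ only in which per-point attributes are present. A minimal sketch of that idea follows, with hypothetical names (`CloudPoint`, `source_kind`) chosen for illustration rather than taken from any particular library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CloudPoint:
    """One sample on an object's surface."""
    xyz: Tuple[float, float, float]
    intensity: Optional[float] = None          # present for laser measurement
    rgb: Optional[Tuple[int, int, int]] = None  # present for photogrammetry


def source_kind(point: CloudPoint) -> str:
    """Name the acquisition type implied by the attributes a point carries."""
    if point.intensity is not None and point.rgb is not None:
        return "laser + photogrammetry"
    if point.intensity is not None:
        return "laser"
    if point.rgb is not None:
        return "photogrammetry"
    return "geometry only"


laser_point = CloudPoint((0.1, 0.2, 1.5), intensity=0.8)
photo_point = CloudPoint((0.1, 0.2, 1.5), rgb=(200, 120, 40))
fused_point = CloudPoint((0.1, 0.2, 1.5), intensity=0.8, rgb=(200, 120, 40))
```

A point cloud is then simply a collection of such points, for example a `list[CloudPoint]`.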
