MVImgNet is a large-scale dataset containing multi-view images of ~220k real-world objects across 238 classes. As a counterpart of ImageNet, it introduces 3D visual signals via multi-view shooting, building a soft bridge between 2D and 3D vision. This paper constructs the MVImgNet2.0 dataset, which expands MVImgNet to a total of ~520k objects and 515 categories, yielding a 3D dataset whose scale is more comparable to those in the 2D domain. Beyond the enlarged scale and category range, MVImgNet2.0 is of higher quality than MVImgNet owing to four new features: (i) most shots capture 360-degree views of the objects, supporting the learning of complete object reconstruction; (ii) the segmentation pipeline is improved to produce more accurate foreground object masks; (iii) a more powerful structure-from-motion method is adopted to estimate the camera pose of each frame with lower error; (iv) higher-quality dense point clouds are reconstructed via advanced methods for objects captured in 360-degree views, which can serve downstream applications. Extensive experiments confirm the value of the proposed MVImgNet2.0 in boosting the performance of large 3D reconstruction models. MVImgNet2.0 will be publicly released at luyues.github.io/mvimgnet2, including the multi-view images of all 520k objects, the reconstructed high-quality point clouds, and the data annotation code, hoping to inspire the broader vision community.