WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition

Although research has shown that camera- and inertial-based data are complementary, datasets that offer both egocentric video and inertial sensor data remain scarce. In this paper, we introduce WEAR, an outdoor sports dataset for both vision- and inertial-based human activity recognition (HAR). The dataset comprises data from 18 participants performing a total of 18 different workout activities, with untrimmed inertial (acceleration) and camera (egocentric video) data recorded at 10 different outdoor locations. Unlike previous egocentric datasets, WEAR provides a challenging prediction scenario marked by purposely introduced activity variations as well as an overall small information overlap across modalities. Benchmark results obtained using each modality separately show that the modalities offer complementary strengths and weaknesses in prediction performance. Further, in light of the recent success of temporal action localization models that follow the architecture design of the ActionFormer, we demonstrate their versatility by applying them in a plain fashion using vision, inertial, and combined (vision + inertial) features as input. Results demonstrate both the applicability of vision-based temporal action localization models to inertial data and the feasibility of fusing both modalities by means of simple concatenation, with the combined approach (vision + inertial features) producing the highest mean average precision and a close-to-best F1-score. The dataset and code to reproduce the experiments are publicly available via: https://mariusbock.github.io/wear/
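The fusion "by means of simple concatenation" mentioned in the abstract can be illustrated with a minimal sketch. It assumes that vision and inertial features have already been extracted per temporal window and are aligned one-to-one. The function name `fuse_by_concatenation`, the feature dimensions, and the toy values are all hypothetical and are not taken from the WEAR code base.

```python
from typing import List, Sequence


def fuse_by_concatenation(
    vision: Sequence[Sequence[float]],
    inertial: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate per-window vision and inertial feature vectors.

    Assumes both inputs describe the same temporal windows in the
    same order (an illustrative assumption, not the paper's pipeline).
    """
    if len(vision) != len(inertial):
        raise ValueError("both modalities must cover the same number of windows")
    # Each fused vector is the vision vector followed by the inertial vector.
    return [list(v) + list(i) for v, i in zip(vision, inertial)]


# Toy data: 3 windows, 4-dim vision features, 2-dim inertial features.
vision_feats = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
inertial_feats = [
    [9.8, 0.1],
    [9.7, 0.2],
    [9.6, 0.3],
]

fused = fuse_by_concatenation(vision_feats, inertial_feats)
print(len(fused), len(fused[0]))
```

In such a setup, the fused vectors would be passed to a temporal action localization model in place of single-modality features; the only change on the model side is the input feature dimension.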

