Pre-trained representations acquired via self-supervised learning can achieve high accuracy even on tasks with little training data. Unlike in the vision and natural language processing domains, pre-training for IMU-based applications is challenging because few public datasets have sufficient size and diversity to learn generalizable representations. To overcome this problem, we propose IMG2IMU, which adapts representations pre-trained on large-scale images to diverse IMU sensing tasks. We convert the sensor data into visually interpretable spectrograms so that the model can utilize the knowledge gained from vision. We further present a sensor-aware pre-training method for images that enables models to acquire knowledge that is particularly impactful for IMU sensing applications. This method applies contrastive learning with an augmentation set customized for the properties of sensor data. Our evaluation on four different IMU sensing tasks shows that IMG2IMU outperforms baselines pre-trained on sensor data by an average of 9.6 percentage points in F1-score, illustrating that vision knowledge can be usefully incorporated into IMU sensing applications where only limited training data is available.
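As a rough illustration of the spectrogram conversion step, the sketch below turns one window of 3-axis IMU data into a 3-channel, image-like array that a vision model could consume. It is not the authors' implementation. The function name `imu_to_spectrogram_image`, the FFT size, the hop length, the Hann taper, the log scaling, the per-axis normalization, and the choice to map each sensor axis to one image channel are all assumptions made for this example.

```python
import numpy as np


def imu_to_spectrogram_image(window, n_fft=64, hop=16):
    """Convert a (num_samples, num_axes) IMU window into a
    (num_axes, freq_bins, frames) array scaled to [0, 1].

    Hypothetical helper: n_fft, hop, and the scaling are illustrative
    choices, not values taken from the paper.
    """
    window = np.asarray(window, dtype=np.float64)
    if window.ndim != 2:
        raise ValueError("expected an array of shape (num_samples, num_axes)")
    num_samples, num_axes = window.shape
    if num_samples < n_fft:
        raise ValueError("window is shorter than n_fft")

    taper = np.hanning(n_fft)
    starts = range(0, num_samples - n_fft + 1, hop)
    channels = []
    for axis in range(num_axes):
        # Remove the constant offset (for example gravity on an accelerometer axis).
        signal = window[:, axis] - window[:, axis].mean()
        # Slice overlapping frames and apply the taper to each one.
        frames = np.stack([signal[s:s + n_fft] * taper for s in starts])
        # Magnitude spectrum per frame, transposed to (freq_bins, frames).
        magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
        # Log scaling compresses the dynamic range, as is common for spectrograms.
        log_mag = np.log1p(magnitude)
        span = log_mag.max() - log_mag.min()
        if span > 0:
            channels.append((log_mag - log_mag.min()) / span)
        else:
            channels.append(np.zeros_like(log_mag))
    return np.stack(channels)


# Synthetic example: 256 samples at an assumed 50 Hz sampling rate.
fs = 50
t = np.arange(256) / fs
imu = np.stack(
    [np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 10 * t), np.zeros_like(t)],
    axis=1,
)
image = imu_to_spectrogram_image(imu)
print(image.shape)  # (3, 33, 13)
```

Stacking the three axes as channels mirrors the layout of an RGB image, which is one plausible way to reuse an image encoder without changing its input layer. Other layouts, such as tiling the per-axis spectrograms side by side, would work with the same helper.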