The next generation of machine learning systems must be adept at perceiving and interacting with the physical world through a diverse array of sensory channels. In what is commonly referred to as the `Internet of Things (IoT)' ecosystem, sensory data from motion, thermal, geolocation, depth, wireless signals, video, and audio are increasingly used to model the states of physical environments and the humans inside them. Despite the potential for understanding human wellbeing, controlling physical devices, and interconnecting smart cities, the community has few benchmarks for building machine learning systems for IoT. Existing efforts are often specialized to a single sensory modality or prediction task, which makes it difficult to study and train large-scale models across many IoT sensors and tasks. To accelerate the development of new machine learning technologies for IoT, this paper proposes MultiIoT, the most expansive and unified IoT benchmark to date, encompassing over 1.15 million samples from 12 modalities and 8 real-world tasks. MultiIoT introduces unique challenges involving (1) generalizable learning from many sensory modalities, (2) multimodal interactions across long temporal ranges, (3) extreme heterogeneity due to unique structure and noise topologies in real-world sensors, and (4) complexity during training and inference. We evaluate a comprehensive set of models on MultiIoT, including modality- and task-specific methods, multisensory and multitask supervised models, and large multisensory foundation models. Our results highlight opportunities for ML to make a significant impact in IoT, but also show that many challenges persist in scalable learning from heterogeneous, long-range, and imperfect sensory modalities. We release all code and data to accelerate future research in machine learning for IoT.