With the rapid evolution of the Internet of Things, many real-world applications use heterogeneously connected sensors to capture time-series information. Edge-based machine learning (ML) methods are often employed to analyze the locally collected data. However, a fundamental issue across data-driven ML approaches is distribution shift, which occurs when a model is deployed on a data distribution different from the one it was trained on and can substantially degrade model performance. Additionally, increasingly sophisticated deep neural networks (DNNs) have been proposed to capture spatial and temporal dependencies in multi-sensor time-series data, but they require intensive computational resources beyond the capacity of today's edge devices. While brain-inspired hyperdimensional computing (HDC) has been introduced as a lightweight solution for edge-based learning, existing HDC methods are also vulnerable to distribution shift. In this paper, we propose DOMINO, a novel HDC learning framework that addresses the distribution shift problem in noisy multi-sensor time-series data. DOMINO leverages efficient, parallel matrix operations in high-dimensional space to dynamically identify and filter out domain-variant dimensions. Our evaluation on a wide range of multi-sensor time-series classification tasks shows that DOMINO achieves on average 2.04% higher accuracy than state-of-the-art (SOTA) DNN-based domain generalization techniques, and delivers 7.83x faster training and 26.94x faster inference. More importantly, DOMINO performs notably better when learning from partially labeled and highly imbalanced data, and provides 10.93x higher robustness against hardware noise than SOTA DNNs.
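To make the idea of filtering domain-variant dimensions concrete, the following is a minimal NumPy sketch, not DOMINO's actual procedure. It assumes a random bipolar projection as the HDC encoder and scores each dimension by how much the class-conditional mean hypervector varies across domains; both choices, and all function names (`make_projection`, `encode`, `domain_variant_scores`, `keep_mask`, `train_prototypes`, `predict`), are hypothetical and introduced only for illustration.

```python
import numpy as np


def make_projection(n_features, dim, rng):
    """Random bipolar (+1/-1) projection matrix (assumed encoder, not DOMINO's)."""
    return rng.choice([-1.0, 1.0], size=(n_features, dim))


def encode(X, proj):
    """Map each sample (row of X) to a bipolar hypervector of length dim."""
    return np.where(X @ proj >= 0, 1.0, -1.0)


def domain_variant_scores(H, labels, domains):
    """Per-dimension score: variance across domains of the class-conditional
    mean hypervector, averaged over classes (an assumed criterion)."""
    scores = np.zeros(H.shape[1])
    n_used = 0
    for c in np.unique(labels):
        means = [
            H[(labels == c) & (domains == d)].mean(axis=0)
            for d in np.unique(domains)
            if np.any((labels == c) & (domains == d))
        ]
        if len(means) < 2:
            continue  # class seen in fewer than two domains: no variance signal
        scores += np.stack(means).var(axis=0)
        n_used += 1
    return scores / max(n_used, 1)


def keep_mask(scores, drop_fraction):
    """Boolean mask that drops the highest-scoring fraction of dimensions."""
    n_drop = int(round(drop_fraction * scores.size))
    mask = np.ones(scores.size, dtype=bool)
    if n_drop > 0:
        mask[np.argsort(scores)[-n_drop:]] = False
    return mask


def train_prototypes(H, labels, mask):
    """Bundle (sum) hypervectors per class, zeroing the filtered dimensions."""
    classes = np.unique(labels)
    protos = np.stack([H[labels == c].sum(axis=0) for c in classes]) * mask
    return classes, protos


def predict(H, classes, protos, mask):
    """Assign each sample to the class prototype with highest cosine similarity.
    Masked bipolar queries all share one norm, so only prototypes are normalized."""
    norms = np.linalg.norm(protos, axis=1) + 1e-12
    sims = ((H * mask) @ protos.T) / norms
    return classes[np.argmax(sims, axis=1)]
```

In this sketch, every step is a dense matrix operation or a per-dimension reduction, which is the property the abstract highlights as suited to edge hardware. The `drop_fraction` parameter is a free choice here; the source does not state how many dimensions are filtered or how often the filtering is repeated.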