Heart rate prediction is vital for personalized health monitoring and fitness, but real-world deployment frequently faces a critical challenge: data heterogeneity. We characterize it along two key dimensions: source heterogeneity, which arises from a fragmented device market with varying feature sets, and user heterogeneity, which reflects distinct physiological patterns across individuals and activities. Existing methods either discard device-specific information or fail to model user-specific differences, which limits their real-world performance. To address this, we propose a framework that learns latent representations agnostic to both forms of heterogeneity, enabling downstream predictors to work consistently under heterogeneous data patterns. Specifically, we introduce a random feature dropout strategy to handle source heterogeneity, making the model robust to varying feature sets. To handle user heterogeneity, we employ a time-aware attention module to capture long-term physiological traits and use a contrastive learning objective to build a discriminative representation space. To reflect the heterogeneous nature of real-world data, we created and publicly released a new benchmark dataset, ParroTao. Evaluations on both ParroTao and the public FitRec dataset show that our model outperforms existing baselines by 17% and 15%, respectively. Furthermore, analysis of the learned representations demonstrates their strong discriminative power, and a downstream application task confirms the practical value of our model.
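The abstract does not say how the random feature dropout is implemented, so the following is only a minimal sketch of one plausible reading: during training, whole input feature channels are randomly masked per sample, so that the encoder sees many different feature subsets and mimics devices that report different sensors. The function name `random_feature_dropout`, the `(batch, time, features)` layout, the zero-fill convention, and the `keep_at_least` parameter are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def random_feature_dropout(x, drop_prob, rng, keep_at_least=1):
    """Zero out whole feature channels at random, independently per sample.

    Hypothetical sketch, not the paper's implementation.
    x: array of shape (batch, time, features).
    Returns the masked array and a boolean keep-mask of shape (batch, features).
    """
    batch, _, n_feat = x.shape
    # A feature channel is kept when its random draw is at least drop_prob.
    keep = rng.random((batch, n_feat)) >= drop_prob
    # Guard against dropping everything: force a minimum number of kept channels.
    for i in range(batch):
        if keep[i].sum() < keep_at_least:
            idx = rng.choice(n_feat, size=keep_at_least, replace=False)
            keep[i, idx] = True
    # Broadcast the per-sample channel mask across the time axis.
    mask = keep[:, None, :].astype(x.dtype)
    return x * mask, keep


# Usage: 4 workouts, 10 time steps, 5 sensor channels (all values illustrative).
rng = np.random.default_rng(0)
x = np.ones((4, 10, 5), dtype=np.float32)
masked, keep = random_feature_dropout(x, drop_prob=0.5, rng=rng)
```

Returning the keep-mask alongside the masked input lets a model distinguish "feature absent" from "feature equal to zero", which is one common design choice when feature sets differ across sources; whether the paper's framework does this is not stated in the abstract.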