We investigate sequence analysis for behavior modeling and argue that sequential context is often more informative than aggregate features for understanding human behavior. We show how common problems in fields such as healthcare, finance, and e-commerce can be framed as sequence modeling tasks, and we address two challenges: constructing coherent sequences from fragmented data and disentangling complex behavior patterns. We present a sequence modeling framework based on ensembles of Hidden Markov Models (HMMs), which are lightweight, interpretable, and efficient. Our ensemble-based scoring method enables robust comparison across sequences of different lengths and improves performance when data are imbalanced or scarce. The framework scales to real-world scenarios, is compatible with downstream feature-based modeling, and applies in both supervised and unsupervised learning settings. We demonstrate the effectiveness of our method with results on a longitudinal human behavior dataset.