The proliferation of mobile devices has led to the collection of large amounts of population data, creating a need to use this rich, multidimensional data in practical applications. In response, we integrated functional data analysis (FDA) and factor analysis to address the challenge of predicting hourly population changes across districts in Tokyo. Specifically, by assuming a Gaussian process, we avoided the large number of covariance matrix parameters that a multivariate normal distribution would require. In addition, the data exhibited both temporal dependence and spatial dependence across districts. To capture these characteristics, we introduced a Bayesian factor model that models the time series of a small number of common factors and expresses the spatial structure through factor loading matrices. Furthermore, the factor loading matrices were made identifiable and sparse to ensure the interpretability of the model. We also proposed a Bayesian shrinkage method as a systematic approach to factor selection. Through numerical experiments and data analysis, we investigated the predictive accuracy and interpretability of the proposed method. We concluded that the flexibility of the method allows additional time series features to be incorporated, thereby improving its accuracy.
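The factor structure described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the model proposed in this work: it assumes AR(1) dynamics for the common factors, a lower-triangular loading matrix with a positive diagonal as one common identifiability constraint, and a hard threshold as a crude stand-in for Bayesian shrinkage. The function names `simulate_factor_model` and `forecast_next`, and all parameter values, are hypothetical.

```python
import numpy as np


def simulate_factor_model(n_districts=6, n_factors=2, n_times=48,
                          phi=0.8, noise_sd=0.1, seed=0):
    """Simulate observations = factors @ loadings.T + noise (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Assumed identifiability constraint: zero entries above the main
    # diagonal and strictly positive diagonal entries.
    loadings = np.tril(rng.normal(size=(n_districts, n_factors)))
    idx = np.arange(n_factors)
    loadings[idx, idx] = np.abs(loadings[idx, idx]) + 0.5

    # Crude sparsity: zero out small off-diagonal loadings. This hard
    # threshold only mimics the effect of shrinkage; it is not a prior.
    small = np.abs(loadings) < 0.3
    small[idx, idx] = False
    loadings[small] = 0.0

    # Assumed AR(1) dynamics for each common factor (unit stationary variance).
    factors = np.zeros((n_times, n_factors))
    factors[0] = rng.normal(size=n_factors)
    innovation_sd = np.sqrt(1.0 - phi ** 2)
    for t in range(1, n_times):
        factors[t] = phi * factors[t - 1] + rng.normal(
            scale=innovation_sd, size=n_factors)

    # Each district's series is a sparse linear combination of the factors.
    noise = rng.normal(scale=noise_sd, size=(n_times, n_districts))
    observations = factors @ loadings.T + noise
    return loadings, factors, observations


def forecast_next(loadings, last_factor, phi):
    """One-step-ahead mean forecast under the assumed AR(1) factor dynamics."""
    return loadings @ (phi * last_factor)


loadings, factors, observations = simulate_factor_model()
next_hour = forecast_next(loadings, factors[-1], phi=0.8)
print(observations.shape, next_hour.shape)
```

In this sketch the spatial structure lives entirely in `loadings` (which districts respond to which factor), while the temporal structure lives in the low-dimensional `factors`, so only a few time series need to be modeled regardless of the number of districts.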