The features in many prediction models naturally take the form of a hierarchy. The lowest levels represent individuals or events. These units group naturally into locations, time intervals, or other aggregates, often at multiple levels. Levels of grouping may intersect and join, much as relational database tables do. Hierarchical models not only represent this structure of the data but also allow each predictive feature to be assigned to its proper level. Such models lend themselves to hierarchical Bayes solution methods, which ``share'' the results of inference between groups by generalizing over two extremes: an individual model for each group, and a single model that aggregates all groups into one. In this paper we present our work in progress on applying a hierarchical Bayesian model to forecast purchases throughout the day at store franchises, with groupings over locations and days of the week. We demonstrate the approach using the \textsf{stan} package on individual sales transaction data collected over the course of a year. We show how this approach resolves the dilemma of having limited data, and hence modest accuracy, for each day and location, while scaling to a large number of locations with improved accuracy.
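The way a hierarchical Bayes model ``shares'' inference between groups can be illustrated with the textbook normal-normal case. This is only a sketch under stated assumptions, not the model used in this work: the symbols $\bar{y}_j$ (observed mean of group $j$), $n_j$ (its number of observations), $\sigma^2$ (within-group variance), and $\mu$, $\tau^2$ (population mean and between-group variance) are introduced here for illustration, and $\sigma^2$, $\mu$, $\tau^2$ are treated as known. With $y_{ij} \sim \mathcal{N}(\theta_j, \sigma^2)$ and $\theta_j \sim \mathcal{N}(\mu, \tau^2)$, the posterior mean of each group parameter is a precision-weighted average:
\[
  \hat{\theta}_j
  = \frac{\dfrac{n_j}{\sigma^2}\,\bar{y}_j + \dfrac{1}{\tau^2}\,\mu}
         {\dfrac{n_j}{\sigma^2} + \dfrac{1}{\tau^2}} .
\]
As $\tau^2 \to \infty$ the estimate tends to $\bar{y}_j$, the individual model for each group; as $\tau^2 \to 0$ it tends to $\mu$, the single aggregated model. Groups with few observations (small $n_j$) are pulled more strongly toward $\mu$, which is how sparse day-and-location cells can borrow strength from the rest of the data.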