RUMBoost: Gradient Boosted Random Utility Models

This paper introduces the RUMBoost model, a novel discrete choice modelling approach that combines the interpretability and behavioural robustness of Random Utility Models (RUMs) with the generalisation and predictive ability of deep learning methods. We obtain the full functional form of non-linear utility specifications by replacing each linear parameter in the utility functions of a RUM with an ensemble of gradient boosted regression trees. This enables piece-wise constant utility values to be imputed for all alternatives directly from the data for any possible combination of input variables. We introduce additional constraints on the ensembles to ensure three crucial features of the utility specifications: (i) dependency of the utilities of each alternative on only the attributes of that alternative, (ii) monotonicity of marginal utilities, and (iii) an intrinsically interpretable functional form, where the exact response of the model is known throughout the entire input space. Furthermore, we introduce an optimisation-based smoothing technique that replaces the piece-wise constant utility values of alternative attributes with monotonic piece-wise cubic splines to identify non-linear parameters with defined gradient. We demonstrate the potential of the RUMBoost model compared to various ML and Random Utility benchmark models for revealed preference mode choice data from London. The results highlight the great predictive performance and the direct interpretability of our proposed approach. Furthermore, the smoothed attribute utility functions allow for the calculation of various behavioural indicators and marginal utilities. Finally, we demonstrate the flexibility of our methodology by showing how the RUMBoost model can be extended to complex model specifications, including attribute interactions, correlation within alternative error terms and heterogeneity within the population.
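The core idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it does not train anything, and the class name `StepUtility`, the helper functions, the alternatives, the split points and the utility values are all hypothetical. It assumes that an ensemble of trees splitting on a single attribute collapses to a piece-wise constant function of that attribute, that the utility of an alternative is the sum of such functions over its own attributes only (feature i), and that choice probabilities follow the usual multinomial logit form. The monotonicity check stands in for feature (ii).

```python
import bisect
import math


class StepUtility:
    """Piece-wise constant utility contribution of a single attribute.

    Assumption: this is what a sum of regression trees that all split on
    one attribute reduces to. Values here are set by hand, not learned.
    """

    def __init__(self, split_points, values):
        # One value per interval, so one more value than split points.
        if len(values) != len(split_points) + 1:
            raise ValueError("need len(values) == len(split_points) + 1")
        self.split_points = list(split_points)
        self.values = list(values)

    def __call__(self, x):
        # Locate the interval that contains x and return its constant value.
        return self.values[bisect.bisect_right(self.split_points, x)]

    def is_non_increasing(self):
        # Monotonicity check: utility never rises as time or cost grows.
        return all(a >= b for a, b in zip(self.values, self.values[1:]))


def systematic_utilities(model, attributes):
    """Sum each alternative's step utilities over its own attributes only."""
    return {
        alt: sum(f(attributes[alt][name]) for name, f in funcs.items())
        for alt, funcs in model.items()
    }


def choice_probabilities(model, attributes):
    """Multinomial logit probabilities from the systematic utilities."""
    v = systematic_utilities(model, attributes)
    v_max = max(v.values())  # subtract the maximum for numerical stability
    exp_v = {alt: math.exp(val - v_max) for alt, val in v.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}


# Hypothetical two-alternative mode choice model (times in minutes, cost in GBP).
model = {
    "walk": {"time": StepUtility([10, 30], [0.0, -0.8, -2.5])},
    "drive": {
        "time": StepUtility([15, 45], [0.0, -0.5, -1.2]),
        "cost": StepUtility([2.0, 6.0], [0.0, -0.4, -1.0]),
    },
}
trip = {"walk": {"time": 25}, "drive": {"time": 20, "cost": 3.0}}

print(systematic_utilities(model, trip))
print(choice_probabilities(model, trip))
```

Because each alternative's utility reads only its own attributes, raising the driving cost changes the driving utility while leaving the walking utility untouched; only the resulting probabilities shift. The smoothing step described in the abstract would replace each step function with a monotonic piece-wise cubic spline, which this sketch does not attempt.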

