Torch-Choice: A PyTorch Package for Large-Scale Choice Modelling with Python

$\texttt{torch-choice}$ is an open-source library for flexible, fast choice modeling with Python and PyTorch. It provides a $\texttt{ChoiceDataset}$ data structure to manage databases flexibly and memory-efficiently. The paper demonstrates how to construct a $\texttt{ChoiceDataset}$ from databases of various formats and describes the functionalities of $\texttt{ChoiceDataset}$. The package implements two widely used models, the multinomial logit and nested logit models, and supports regularization during model estimation. It can optionally use GPUs for estimation, allowing it to scale to massive datasets while remaining computationally efficient. Models can be initialized using either R-style formula strings or Python dictionaries. We conclude with a comparison of the computational efficiency of $\texttt{torch-choice}$ and $\texttt{mlogit}$ in R as (1) the number of observations increases, (2) the number of covariates increases, and (3) the item set expands. Finally, we demonstrate the scalability of $\texttt{torch-choice}$ on large-scale datasets.
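As a rough illustration of what "multinomial logit with regularization" means, the sketch below computes choice probabilities and an L2-penalized negative log-likelihood in plain Python. It does not use or reproduce the $\texttt{torch-choice}$ API: the function names, the toy data, and the choice of an L2 penalty are all assumptions made for illustration only.

```python
import math


def choice_probabilities(utilities):
    """Softmax over item utilities for one choice occasion (numerically stable)."""
    m = max(utilities)
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def utilities_for(features, beta):
    """Linear utility u_i = x_i . beta for each item in the choice set."""
    return [sum(x * b for x, b in zip(item, beta)) for item in features]


def penalized_nll(sessions, beta, l2=0.0):
    """Negative log-likelihood of the observed choices plus an L2 penalty on beta."""
    nll = 0.0
    for features, chosen in sessions:
        probs = choice_probabilities(utilities_for(features, beta))
        nll -= math.log(probs[chosen])
    return nll + l2 * sum(b * b for b in beta)


# Hypothetical toy data: two choice occasions, three items each,
# two covariates per item; the second element is the index of the chosen item.
sessions = [
    ([[1.0, 0.5], [0.2, 1.5], [0.0, 0.0]], 0),
    ([[0.3, 0.1], [1.2, 0.4], [0.0, 0.0]], 1),
]

loss_at_zero = penalized_nll(sessions, [0.0, 0.0], l2=0.1)
loss_at_guess = penalized_nll(sessions, [1.0, 0.0], l2=0.1)
```

With all coefficients at zero, every item is equally likely, so the loss equals the number of choice occasions times the log of the item count. An estimation routine would minimize this loss over the coefficients, for example by gradient descent, which is the step that benefits from automatic differentiation and GPU acceleration.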

