Generative pre-trained transformers (GPTs, a.k.a. "foundation models") have reshaped natural language processing (NLP) through their versatility across diverse downstream tasks. However, their potential extends far beyond NLP. This paper provides a software utility to help realize that potential by extending the applicability of GPTs to continuous-time sequences of complex events with internal dependencies, such as medical record datasets. Adoption of foundation models in these domains has so far been hampered by the lack of suitable tools for model construction and evaluation. To bridge this gap, we introduce Event Stream GPT (ESGPT), an open-source library designed to streamline the end-to-end process of building GPTs for continuous-time event sequences. ESGPT allows users to (1) build flexible, foundation-model-scale input datasets by specifying only a minimal configuration file, (2) leverage a Hugging Face-compatible modeling API for GPTs over this modality that incorporates intra-event causal dependency structures and autoregressive generation capabilities, and (3) evaluate models via standardized processes that can assess the few-shot and even zero-shot performance of pre-trained models on user-specified fine-tuning tasks.