This tutorial paper provides a step-by-step workflow for building and analysing semantic networks from short creative texts. We introduce and compare two widely used text-to-network approaches, word co-occurrence networks and textual forma mentis networks (TFMNs), and demonstrate how both can be used in machine learning to predict human creativity ratings. Using a corpus of 1029 short stories, we guide readers through text preprocessing, network construction, feature extraction (structural measures, spreading-activation indices, and emotion scores), and the application of regression models. We evaluate how network-construction choices influence both network topology and predictive performance. Across all modelling settings, TFMNs consistently outperformed co-occurrence networks, yielding lower prediction errors (best MAE = 0.581 for TFMNs vs 0.592 for co-occurrence networks with window size 3). Network-structural features dominated predictive performance (MAE = 0.591 for TFMNs), whereas emotion features performed worse (MAE = 0.711 for TFMNs) and spreading-activation measures contributed little (MAE = 0.788 for TFMNs). This paper offers practical guidance for researchers interested in applying network-based methods in cognitive fields such as creativity research. We show when syntactic networks are preferable to surface co-occurrence models, and we provide an open, reproducible workflow that is accessible to newcomers to the field while also offering deeper methodological insight for experienced researchers.
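As a rough illustration of the co-occurrence approach mentioned above, the following minimal Python sketch links each word to the words that follow it within a fixed window (here 3, the window size reported above) and then computes node degree as one simple structural feature. It is a toy example under stated assumptions, not the workflow's actual implementation: the function names `cooccurrence_edges` and `degree` and the example sentence are hypothetical, and preprocessing steps such as stopword removal or lemmatisation are omitted.

```python
from collections import Counter


def cooccurrence_edges(tokens, window=3):
    """Count undirected edges between tokens that fall within the same sliding window.

    Each token is paired with the (window - 1) tokens that follow it.
    Self-loops (a word co-occurring with itself) are skipped.
    """
    edges = Counter()
    for start in range(len(tokens)):
        span = tokens[start:start + window]
        head = span[0]
        for other in span[1:]:
            if other != head:
                # Sort the pair so (a, b) and (b, a) map to the same undirected edge.
                edges[tuple(sorted((head, other)))] += 1
    return edges


def degree(edges):
    """Number of distinct neighbours per node (unweighted degree)."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical, already-tokenised example text.
tokens = "the old robot painted the quiet sea".split()
edges = cooccurrence_edges(tokens, window=3)
node_degree = degree(edges)
```

A larger window connects more distant words and therefore yields a denser network, which is one of the construction choices whose effect on topology and prediction error is evaluated here.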