Synthetic data generation has been a growing area of research in recent years. However, its potential applications in serious games have not been thoroughly explored. Advances in this field could allow data modelling and analysis to begin before real player data is available, and could speed up the development process. The COVID-19 pandemic has amplified this phenomenon. To help fill this gap in the literature, we propose a simulator architecture for generating probabilistic synthetic data for serious games based on interactive narratives. This architecture is designed to be generic and modular so that other researchers can use it on similar problems. To simulate the interaction of synthetic players with questions, we use a cognitive testing model based on the Item Response Theory framework. We also show how probabilistic graphical models (in particular Bayesian networks) can be used to introduce expert knowledge and external data into the simulation. Finally, we apply the proposed architecture and methods to a use case: a serious game focused on cyberbullying. We perform Bayesian inference experiments using a hierarchical model to demonstrate the identifiability and robustness of the generated data.
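As an illustration of how synthetic players might interact with questions under an Item Response Theory model, the following minimal sketch draws binary responses from a two-parameter logistic (2PL) model. The choice of the 2PL variant, the function names `p_correct` and `simulate_responses`, and all ability and item parameter values are assumptions made for this example; they are not taken from the proposed architecture.

```python
import math
import random


def p_correct(ability, difficulty, discrimination=1.0):
    """2PL IRT: probability that a player answers an item correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


def simulate_responses(abilities, items, seed=0):
    """Draw one Bernoulli response (0 or 1) per (player, item) pair.

    `items` is a list of (discrimination, difficulty) tuples.
    """
    rng = random.Random(seed)  # seeded so the synthetic data is reproducible
    return [
        [int(rng.random() < p_correct(theta, b, a)) for (a, b) in items]
        for theta in abilities
    ]


# Hypothetical synthetic players (latent abilities) and items.
abilities = [-1.0, 0.0, 1.5]
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 1.0)]
responses = simulate_responses(abilities, items)
```

In a fuller simulator, the latent abilities would presumably be sampled from a prior informed by expert knowledge (for example, a Bayesian network) rather than fixed by hand as they are here.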