Dreaming is a fundamental but not fully understood part of human experience that can shed light on our thought patterns. Traditional dream analysis practices, while popular and supported by over 130 unique scales and rating systems, have limitations. Because they rely mostly on retrospective surveys or lab studies, they are difficult to apply at scale and cannot show the relative importance of different dream themes or the connections between them. To overcome these issues, we developed a new, data-driven mixed-method approach for identifying topics in free-form dream reports through natural language processing. We tested this method on 44,213 dream reports from Reddit's r/Dreams subreddit, where we found 217 topics grouped into 22 larger themes: the most extensive collection of dream topics to date. We validated our topics by comparing them to the widely used Hall and van de Castle scale. Going beyond traditional scales, our method can find unique patterns in different dream types (such as nightmares or recurring dreams), reveal topic importance and connections, and track changes in collective dream experiences over time and around major events, such as the COVID-19 pandemic and the recent Russo-Ukrainian war. We envision that applications of our method will provide valuable insights into the intricate nature of dreaming.