Data-driven approaches have transformed scientific research, and machine learning and statistical analysis are both commonly used within them. Despite their widespread use, the two methodologies differ substantially in their techniques and objectives. Few studies have applied both to a single dataset to demonstrate these differences in the social sciences, particularly in the language and cognitive sciences. This study uses the Buckeye Speech Corpus to illustrate how machine learning and statistical analysis are each applied in data-driven research and how they yield distinct insights. In doing so, it clarifies the range of approaches available to data-driven research.