The widespread adoption of generative AI (GenAI) has introduced new challenges in crowdsourced data collection, particularly in survey-based research. While GenAI offers powerful capabilities, its unintended use in crowdsourcing, such as the automated generation of survey responses, threatens the integrity of empirical research and complicates efforts to understand public opinion and behavior. In this study, we investigate and evaluate two approaches for detecting AI-generated responses in online surveys: LLM-based detection and signature-based detection. We conducted experiments across seven survey studies, comparing responses collected before 2022 with those collected after the release of ChatGPT. Our findings reveal a significant increase in AI-generated responses in the post-2022 studies, highlighting how GenAI may silently distort crowdsourced data. This work raises broader concerns about the evolving landscape of data integrity, where GenAI can compromise data quality, mislead researchers, and influence downstream findings in fields such as health, politics, and social behavior. By surfacing detection strategies and empirical evidence of GenAI's impact, we aim to contribute to the ongoing conversation about safeguarding research integrity and supporting scholars navigating these methodological and ethical challenges.
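To make the signature-based approach concrete, the following is a minimal sketch, not the study's actual implementation: it flags responses containing phrases characteristic of ChatGPT-style output. The signature list here is illustrative; a real detector would rely on a curated and regularly updated phrase set.

```python
import re

# Illustrative signatures only (assumed examples, not the paper's list);
# a real detector would use a curated, regularly updated set of phrases
# characteristic of GenAI output.
SIGNATURES = [
    r"\bas an ai language model\b",
    r"\bi (?:do not|don't) have personal (?:opinions|experiences)\b",
    r"\bi cannot browse the internet\b",
    r"\bas of my (?:knowledge|training) cutoff\b",
]
SIGNATURE_RE = re.compile("|".join(SIGNATURES), re.IGNORECASE)

def flag_response(text: str) -> bool:
    """Return True if the response matches any known GenAI signature."""
    return SIGNATURE_RE.search(text) is not None

if __name__ == "__main__":
    sample = "As an AI language model, I do not have personal opinions on this."
    print(flag_response(sample))  # True
```

Signature matching is cheap and precise but misses paraphrased or edited GenAI output, which is why it is paired with LLM-based detection in the study.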