Questionnaires in the behavioral sciences tend to be lengthy. However, the literature suggests that survey length contributes to careless responding: the longer the questionnaire, the higher the probability that participants start responding carelessly. Consequently, a large number of participants in long surveys may engage in careless responding, which poses a major threat to internal validity. We propose a novel method for identifying the onset of careless responding (or the absence thereof) by searching for a changepoint in combined measurements of multiple dimensions in which carelessness may manifest, such as inconsistency and invariability. The method is highly flexible, is based on machine learning, and provides statistical guarantees for controlling the false positive rate. In simulation experiments, it achieves high accuracy in identifying carelessness onset and discriminates well between attentive responding and various types of careless responding, even when a large number of careless respondents are present. An empirical application highlights how identifying partial carelessness uncovers novel insights into careless responding behavior. Furthermore, we provide the freely available open-source software package "carelessonset" to facilitate adoption by empirical researchers.
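To make the changepoint idea concrete, the following is a minimal sketch and not the proposed method: it swaps in a plain least-squares single-changepoint search over a two-dimensional series of per-item scores, with no machine learning and no false-positive guarantee. The function name `find_changepoint`, the parameters `min_segment` and `min_gain`, and the toy scores are all hypothetical and are not part of the "carelessonset" package.

```python
from typing import List, Optional, Sequence, Tuple


def find_changepoint(
    series: Sequence[Sequence[float]],
    min_segment: int = 3,
    min_gain: float = 0.0,
) -> Tuple[Optional[int], float]:
    """Least-squares search for one mean shift in a multivariate series.

    Returns (k, gain): k is the index of the first row after the shift,
    or None if no split improves the fit by more than min_gain.
    min_gain is an arbitrary placeholder, not a calibrated test.
    """
    n = len(series)
    if n == 0:
        return None, 0.0
    dims = len(series[0])

    # Prefix sums of values and squared values, per dimension.
    prefix: List[List[float]] = [[0.0] * dims]
    prefix_sq: List[List[float]] = [[0.0] * dims]
    for row in series:
        prefix.append([p + x for p, x in zip(prefix[-1], row)])
        prefix_sq.append([p + x * x for p, x in zip(prefix_sq[-1], row)])

    def cost(start: int, end: int) -> float:
        # Sum of squared deviations from the segment mean, over all dimensions.
        m = end - start
        total = 0.0
        for d in range(dims):
            s = prefix[end][d] - prefix[start][d]
            sq = prefix_sq[end][d] - prefix_sq[start][d]
            total += sq - s * s / m
        return total

    no_split = cost(0, n)
    best_k: Optional[int] = None
    best_cost = no_split
    for k in range(min_segment, n - min_segment + 1):
        c = cost(0, k) + cost(k, n)
        if c < best_cost:
            best_k, best_cost = k, c

    gain = no_split - best_cost
    if gain <= min_gain:
        return None, 0.0
    return best_k, gain


# Toy data (assumed): columns are an inconsistency and an invariability score.
attentive = [(0.10 + 0.02 * (i % 3), 0.20 + 0.01 * (i % 2)) for i in range(12)]
careless = [(0.80 + 0.02 * (i % 3), 0.90 + 0.01 * (i % 2)) for i in range(8)]
series = attentive + careless

onset, gain = find_changepoint(series)
print("estimated onset index:", onset)
```

Because any split lowers the least-squares cost at least slightly on noisy data, a sketch like this will almost always report some changepoint; deciding whether the gain is large enough to flag carelessness is exactly the part that requires the statistical guarantees described above.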