In crowdsourced user experiments that collect performance data from graphical user interface (GUI) interactions, some participants ignore instructions or act carelessly, threatening the validity of performance models. We investigate a pre-task screening method that requires simple GUI operations analogous to the main task and uses the resulting error as a continuous quality signal. Our pre-task is a brief image-resizing task in which workers match an on-screen card to a physical card; workers whose resizing error exceeds a threshold are excluded from the main experiment. The main task is a standardized pointing experiment with well-established models of movement time and error rate. Across mouse- and smartphone-based crowdsourced experiments, we show that reducing the proportion of workers exhibiting unexpected behavior and tightening the pre-task threshold systematically improve the goodness of fit and predictive accuracy of GUI performance models, demonstrating that brief pre-task screening can enhance data quality.
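The screening-then-modeling pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the error units, and the use of a simple least-squares Fitts'-law fit (MT = a + b · ID) are all assumptions for illustration.

```python
import numpy as np


def screen_workers(pretask_errors, threshold):
    """Hypothetical screening rule: keep workers whose pre-task
    resizing error does not exceed the threshold."""
    return [i for i, err in enumerate(pretask_errors) if err <= threshold]


def fit_fitts_law(ids, movement_times):
    """Ordinary least-squares fit of Fitts' law, MT = a + b * ID,
    on data from the retained workers only."""
    design = np.column_stack([np.ones_like(ids), ids])
    (a, b), *_ = np.linalg.lstsq(design, movement_times, rcond=None)
    return a, b
```

Tightening the threshold passed to `screen_workers` retains fewer workers but, per the abstract's claim, yields cleaner data and a better-fitting movement-time model.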