We introduce the Dual Input Stream Transformer (DIST), a novel model for the challenging problem of assigning each fixation point in eye-tracking data collected during passage reading to the line of text that the reader was actually focused on. This post-processing step is crucial for the analysis of reading data because the recordings contain noise in the form of vertical drift. We evaluate DIST against eleven classical approaches on a suite of nine diverse datasets. We demonstrate that combining multiple instances of the DIST model in an ensemble achieves high accuracy across all datasets. Further combining the DIST ensemble with the best classical approach yields an average accuracy of 98.17%. Our approach is a significant step towards removing the bottleneck of manual line assignment in reading research. Through extensive analysis and ablation studies, we identify key factors that contribute to DIST's success, including the incorporation of line overlap features and the use of a second input stream. Our evaluation further shows that DIST is robust to a variety of experimental setups, making it a safe first choice for practitioners in the field.
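To make the line-assignment problem concrete, the following is a minimal sketch of why vertical drift makes it hard. It is not DIST, and it is not claimed to match any of the eleven classical approaches evaluated here. It assigns each fixation to the vertically nearest line, a naive baseline chosen only for illustration. The function names, the line positions, the fixation coordinates, and the drift of 13 pixels per fixation are all hypothetical.

```python
from typing import List, Sequence


def assign_to_nearest_line(fixation_ys: Sequence[float],
                           line_ys: Sequence[float]) -> List[int]:
    """For each fixation, return the index of the vertically closest text line."""
    return [
        min(range(len(line_ys)), key=lambda i: abs(y - line_ys[i]))
        for y in fixation_ys
    ]


def accuracy(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of fixations assigned to the correct line."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical layout: three text lines centred at these y positions (pixels).
line_ys = [100.0, 150.0, 200.0]

# Hypothetical ground truth: two fixations on each line, read top to bottom.
true_lines = [0, 0, 1, 1, 2, 2]
clean_ys = [101.0, 99.0, 151.0, 149.0, 201.0, 199.0]

# Simulated vertical drift (an assumption for illustration): the recorded
# position shifts downward by a further 13 pixels with every fixation.
drifted_ys = [y + 13.0 * k for k, y in enumerate(clean_ys)]

clean_pred = assign_to_nearest_line(clean_ys, line_ys)
drift_pred = assign_to_nearest_line(drifted_ys, line_ys)

print("clean  :", clean_pred, accuracy(clean_pred, true_lines))
print("drifted:", drift_pred, accuracy(drift_pred, true_lines))
```

In this toy setup the baseline is fully correct on the clean data, but under drift it places both second-line fixations on the third line. This kind of error is what line-assignment methods are meant to correct.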