We introduce a novel Dual Input Stream Transformer (DIST) for the challenging problem of assigning fixation points from eye-tracking data, collected during passage reading, to the line of text that the reader was actually focused on. This post-processing step is crucial for analysing reading data because vertical drift adds noise to the recorded fixation positions. We evaluate DIST against nine classical approaches on a comprehensive suite of nine diverse datasets and show that DIST outperforms them. By combining multiple instances of the DIST model in an ensemble, we achieve an average accuracy of 98.5\% across all datasets. Our approach is a significant step towards removing the bottleneck of manual line assignment in reading research. Through extensive model analysis and ablation studies, we identify key factors that contribute to DIST's success, including the incorporation of line overlap features and the use of a second input stream. The evaluation across these diverse datasets also shows that DIST is robust to a range of experimental setups, making it a safe first choice for practitioners in the field.
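To make the line-assignment task concrete, the following minimal sketch shows why vertical drift is a problem. It is not DIST and is not taken from the paper: it is a naive nearest-line rule, and the function name, line positions, and fixation coordinates are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def assign_to_nearest_line(
    fixation_ys: Sequence[float], line_ys: Sequence[float]
) -> List[int]:
    """Assign each fixation to the index of the vertically closest text line.

    Illustrative baseline only (an assumption for this sketch, not the
    paper's method).
    """
    return [
        min(range(len(line_ys)), key=lambda i: abs(line_ys[i] - y))
        for y in fixation_ys
    ]


# Hypothetical layout: three lines of text centred at y = 100, 150, 200 px.
line_ys = [100.0, 150.0, 200.0]

# Hypothetical fixations made while reading line 0, with the recorded
# y-coordinate drifting downwards over time.
true_line = [0, 0, 0, 0]
drifting_ys = [102.0, 110.0, 121.0, 131.0]

predicted = assign_to_nearest_line(drifting_ys, line_ys)
accuracy = sum(p == t for p, t in zip(predicted, true_line)) / len(true_line)

print(predicted)  # the last fixation is pulled onto line 1 by the drift
print(accuracy)
```

In this toy case the final fixation lies closer to the second line than to the line actually being read, so a purely position-based rule mislabels it. Errors of this kind are what line-assignment methods, classical or learned, try to correct.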