Automatic unknown word detection can enable new applications that assist English as a Second Language (ESL) learners and improve their reading experience. However, most existing unknown word detection methods require dedicated high-precision eye-tracking devices that are not easily accessible to end users. In this work, we propose GazeReader, an unknown word detection method that uses only a webcam. GazeReader tracks the learner's gaze and then applies a transformer-based machine learning model that encodes the text information to locate unknown words. We applied knowledge enhancement, including term frequency, part of speech, and named entity recognition, to improve performance. In a user study, our method achieved an accuracy of 98.09% and an F1-score of 75.73%. Lastly, we explored the design scope for ESL reading and discussed the findings.
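The gap between the reported accuracy and F1-score is the pattern one would expect if unknown words make up a small minority of the words a learner reads, although the abstract does not state the class balance. The sketch below shows how both metrics are computed from confusion-matrix counts. The function name `accuracy_and_f1` and all counts are hypothetical values chosen for illustration, not data from the study.

```python
def accuracy_and_f1(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Compute accuracy and F1 for the positive class ("unknown word")."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of flagged words that are truly unknown.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly unknown words that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts: 1000 words read, 40 of them unknown to the reader.
acc, f1 = accuracy_and_f1(tp=30, fp=10, fn=10, tn=950)
print(f"accuracy={acc:.4f}, f1={f1:.4f}")
```

Because the many correctly ignored known words (true negatives) count toward accuracy but not toward F1, accuracy can stay high while F1 reflects only how well the rare unknown words are found.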