Deaf and hard-of-hearing (DHH) students face significant barriers in accessing science, technology, engineering, and mathematics (STEM) education, notably due to the scarcity of STEM resources in signed languages. To help address this, we introduce ASL STEM Wiki: a parallel corpus of 254 Wikipedia articles on STEM topics in English, interpreted into over 300 hours of American Sign Language (ASL). ASL STEM Wiki is the first continuous signing dataset focused on STEM, facilitating the development of AI resources for STEM education in ASL. We identify several use cases of ASL STEM Wiki with human-centered applications. For example, because this dataset highlights the frequent use of fingerspelling for technical concepts, which inhibits DHH students' ability to learn, we develop models to identify fingerspelled words -- which can later be used to query for appropriate ASL signs to suggest to interpreters.