This survey reviews the current state of natural language processing (NLP) for four Ethiopian languages: Amharic, Afaan Oromo, Tigrinya, and Wolaytta. We identify key challenges and opportunities for NLP research in Ethiopia. We also provide a centralized GitHub repository of publicly available resources for various NLP tasks in these languages, which can be updated periodically with contributions from other researchers. Our objective is to identify research gaps, disseminate this information to NLP researchers interested in Ethiopian languages, and encourage future research in this domain.