The Swa-bhasha Resource Hub is a collection of data resources and algorithms for Romanized Sinhala to Sinhala transliteration, developed between 2020 and 2025. These resources have played a significant role in advancing research in Sinhala Natural Language Processing (NLP), particularly in training transliteration models and developing applications that involve Romanized Sinhala. The datasets and corresponding tools that are currently openly accessible are released through this hub. This paper presents a detailed overview of the resources contributed by the authors and a comparative analysis of existing transliteration applications in the domain.