In this paper, we introduce YORC: a new multiple-choice Yoruba Reading Comprehension dataset based on Yoruba high-school reading comprehension examinations. We provide baseline results by performing cross-lingual transfer from the existing English RACE dataset using a pre-trained encoder-only model. Additionally, we provide results obtained by prompting large language models (LLMs) such as GPT-4.