Automated Essay Scoring (AES) has gained increasing attention in recent years, yet research on Arabic AES remains limited due to the lack of publicly available datasets. To address this, we introduce LAILA, the largest publicly available Arabic AES dataset to date, comprising 7,859 essays annotated with holistic and trait-specific scores on seven dimensions: relevance, organization, vocabulary, style, development, mechanics, and grammar. We detail the dataset design, collection, and annotations, and provide benchmark results using state-of-the-art Arabic and English models in prompt-specific and cross-prompt settings. LAILA fills a critical need in Arabic AES research, supporting the development of robust scoring systems.