Educational resource understanding is vital to online learning platforms, which have found growing application in recent years. However, researchers and developers often struggle to apply existing general-purpose natural language toolkits or domain-specific models to this task. This raises the need for an effective, easy-to-use toolkit that benefits AI research and applications in education. To bridge this gap, we present EduNLP, a unified, modularized, and extensive library focused on educational resource understanding. In the library, we decouple the whole workflow into four key modules with consistent interfaces: data configuration, processing, model implementation, and model evaluation. We also provide a configurable pipeline that unifies data usage and model usage in standard ways and that users can customize to their own needs. The current version primarily provides 10 typical models from four categories and 5 common downstream evaluation tasks in the education domain, covering 8 subjects. The project is released at: https://github.com/bigdata-ustc/EduNLP.