Mathematical Entities: Corpora and Benchmarks

Mathematics is a highly specialized domain with its own unique set of challenges. Even so, there has been relatively little research on natural language processing (NLP) for mathematical texts, and few mathematical language resources are aimed at NLP. In this paper, we provide annotated corpora that can be used to study the language of mathematics in different contexts, ranging from fundamental concepts found in textbooks to advanced research mathematics. We preprocess the corpora with a neural parsing model, supplemented by some manual intervention, to provide part-of-speech tags, lemmas, and dependency trees. In total, we provide 182,397 sentences across three corpora. We then use these corpora to test and evaluate several noteworthy NLP models, to show how well they adapt to the domain of mathematics and to provide useful tools for exploring mathematical language. We evaluate several neural and symbolic models against benchmarks that we extract from the corpus metadata, and show that terminology extraction and definition extraction do not easily generalize to mathematics and that additional work is needed to achieve good performance on these tasks. Finally, we provide a learning assistant that grants access to the content of these corpora in a context-sensitive manner, using text search and entity linking. Though our corpora and benchmarks provide useful metrics for evaluating mathematical language processing, further work is necessary to adapt models to mathematics in order to provide more effective learning assistants and to apply NLP methods to different mathematical domains.
