Preparing exact and comprehensive explanations of word meanings is one of the key steps in writing a monolingual dictionary. In the standard methodology, the explanations require an expert lexicographer, who spends a substantial amount of time checking the consistency between the descriptive text and corpus evidence. In the following text, we present a new tool that derives explanations automatically from collective information in very large corpora, in particular from word sketches. We also propose a quantitative evaluation of the constructed explanations, concentrating on explanations of nouns. The methodology is to a certain extent language independent; however, the presented verification is limited to Czech and English. We show that the presented approach makes it possible to create explanations that contain data useful for understanding the word meaning in approximately 90% of cases. However, in many cases, the result requires post-editing to remove redundant information.