This document provides the annotation guidelines for MaiBaam, a Bavarian corpus manually annotated with part-of-speech (POS) tags, syntactic dependencies, and German lemmas. MaiBaam belongs to the Universal Dependencies (UD) project, and our annotations elaborate on the general and German UD version 2 guidelines. In this document, we detail how to preprocess and tokenize Bavarian data, provide an overview of the POS tags and dependencies we use, explain annotation decisions that would also apply to closely related languages like German, and lastly introduce and motivate decisions that are specific to Bavarian grammar.