Recent years have brought great advances in solving morphological tasks, mostly due to powerful neural models applied to tasks such as (re)inflection and analysis. Yet such tasks cannot be considered solved, especially when little training data is available or when models must generalize to previously unseen lemmas. This work explores how performance changes when morphological models are given access to subcharacter phonological features, which are the targets of morphological processes. We design two methods to provide this access: one leaves the models as is but manipulates the data to include features instead of characters, and the other modifies the models to take phonological features into account when building representations for phonemes. We derive phonemic data from standard graphemic data using language-specific grammars for languages with a shallow grapheme-to-phoneme mapping, and we experiment with two reinflection models over eight languages. Our methods yield results comparable to the grapheme-based baseline overall, with minor improvements in some of the languages. We conclude that patterns in character distributions are likely to allow models to infer the underlying phonological characteristics, even when phonemes are not explicitly represented.
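
To make the two methods concrete, the following is a minimal Python sketch under stated assumptions. The feature table, the boundary token `SEP`, and the function names `to_feature_sequence` and `multi_hot` are hypothetical illustrations, not the feature inventory or implementation used in this work. The first function shows the data-side idea of replacing each phoneme with its feature tokens; the second shows a simplified model-side input, a multi-hot feature vector that a model could use when building a phoneme representation.

```python
# Toy feature inventory. These entries are illustrative assumptions,
# not the feature table used in the paper.
FEATURES = {
    "p": ("consonant", "bilabial", "stop", "voiceless"),
    "b": ("consonant", "bilabial", "stop", "voiced"),
    "a": ("vowel", "open", "central", "unrounded"),
    "i": ("vowel", "close", "front", "unrounded"),
}

SEP = "#"  # hypothetical token marking the end of one phoneme's features


def to_feature_sequence(phonemes):
    """Data-side sketch: replace each phoneme with its feature tokens."""
    tokens = []
    for ph in phonemes:
        tokens.extend(FEATURES[ph])
        tokens.append(SEP)
    return tokens


# Fixed ordering of all features, so every vector has the same layout.
FEATURE_VOCAB = sorted({f for feats in FEATURES.values() for f in feats})


def multi_hot(phoneme):
    """Model-side sketch: a multi-hot vector over the feature vocabulary."""
    active = set(FEATURES[phoneme])
    return [1 if f in active else 0 for f in FEATURE_VOCAB]


if __name__ == "__main__":
    print(to_feature_sequence(["p", "a"]))
    print(multi_hot("p"))
    print(multi_hot("b"))
```

In this toy setup, /p/ and /b/ share every feature except voicing, so their vectors differ in only two positions. This is the kind of similarity that a character-level model would otherwise have to infer from distributional patterns alone.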