We explore the potential of pixel-based models for transfer learning from standard languages to dialects. These models convert text into images that are divided into patches, enabling a continuous vocabulary representation that proves especially useful for the out-of-vocabulary words common in dialectal data. Using German as a case study, we compare the performance of pixel-based models to token-based models across various syntactic and semantic tasks. Our results show that, in zero-shot dialect evaluation, pixel-based models outperform token-based models in part-of-speech tagging, dependency parsing, and intent detection, by up to 26 percentage points in some scenarios; on Standard German itself, however, they show no such advantage. Pixel-based models also fall short in topic classification. These findings highlight the potential of pixel-based models for handling dialectal data, though further research is needed to assess their effectiveness in other linguistic contexts.
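The core idea above can be made concrete with a minimal sketch: render text as a pixel grid, then slice that grid into fixed-size patches that play the role of tokens. This gives a continuous "vocabulary" with no out-of-vocabulary failure mode, since any string, dialectal or not, renders to pixels. The toy renderer below is purely illustrative (a real pixel-based model uses an actual glyph rasterizer); the function names and the bit-based fake glyphs are our own assumptions, not the paper's pipeline.

```python
import numpy as np

def render_text(text, height=8):
    # Toy "rasterizer": each character becomes one 8-pixel column whose
    # bits come from its Unicode codepoint. This is a hypothetical
    # stand-in for a real text renderer; it only illustrates that any
    # string (including dialectal spellings) maps to a pixel grid.
    cols = [[(ord(c) >> i) & 1 for i in range(height)] for c in text]
    return np.array(cols, dtype=np.float32).T  # shape: (height, len(text))

def to_patches(img, patch_width=4):
    # Slice the image into fixed-width patches, padding the right edge
    # so the width divides evenly. Each patch is a continuous "token".
    h, w = img.shape
    pad = (-w) % patch_width
    img = np.pad(img, ((0, 0), (0, pad)))
    n = img.shape[1] // patch_width
    return img.reshape(h, n, patch_width).transpose(1, 0, 2)

# A dialectal greeting renders like any other string: no OOV tokens.
patches = to_patches(render_text("Grüß di!"))
print(patches.shape)  # (2, 8, 4): 2 patches, each 8 pixels tall, 4 wide
```

The key design point the sketch illustrates is that the patch sequence depends only on the rendered surface form, so a model pretrained on Standard German images can be applied unchanged to dialect spellings it has never seen as tokens.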