This work presents KoBigBird-large, a large-sized Korean BigBird model that achieves state-of-the-art performance and supports long-sequence processing for Korean language understanding. Without any further pretraining, we only transform the architecture and extend the positional encoding using our proposed Tapered Absolute Positional Encoding Representations (TAPER). In experiments, KoBigBird-large achieves state-of-the-art overall performance on Korean language understanding benchmarks and, compared with competitive baseline models, the best performance on document classification and question answering tasks involving longer sequences. We publicly release our model.