Recent advances in language models (LMs) have demonstrated significant efficacy in tasks related to the arts and humanities. While LMs have exhibited exceptional performance across a wide range of natural language processing tasks, notable challenges remain in applying them to small datasets and in replicating more creative human capacities. In this study, we address these challenges by training a Persian classical poetry generation model with a transformer architecture on a specialized dataset, without pretraining. We also propose a novel decoding method that enhances coherence and meaningfulness in the generated poetry while managing the tradeoff between diversity and quality. We evaluate our training approach and the proposed decoding method through a comprehensive set of automatic and human evaluations, which show a superior capability to generate coherent and meaningful poetry compared to other decoding methods and an existing Persian large language model (LLM).