Recent advances in language models (LMs) have demonstrated significant efficacy in tasks related to the arts and humanities. While LMs have exhibited exceptional performance across a wide range of natural language processing tasks, notable challenges remain in applying them to small datasets and in replicating more creative human capacities. In this study, we address these challenges by training a Persian classical poetry generation model using a transformer architecture on a specialized dataset with no pretraining. Additionally, we propose a novel decoding method to enhance coherence and meaningfulness in the generated poetry, effectively managing the tradeoff between diversity and quality. We evaluate our training approach and the proposed decoding method through a comprehensive set of automatic and human evaluations, which show a superior capability to generate coherent and meaningful poetry compared to other decoding methods and an existing Persian large language model (LLM).