Indian poetry, known for its linguistic complexity and deep cultural resonance, has a rich and varied heritage spanning thousands of years. However, its layered meanings, cultural allusions, and sophisticated grammatical constructions often make it hard to understand, especially for non-native speakers and for readers unfamiliar with its context and language. Despite this cultural significance, existing work on poetry has largely overlooked Indian-language poems. In this paper, we propose the Translation and Image Generation (TAI) framework, which leverages Large Language Models (LLMs) and Latent Diffusion Models through appropriate prompt tuning. The framework supports the United Nations Sustainable Development Goals of Quality Education (SDG 4) and Reduced Inequalities (SDG 10) by making culturally rich Indian-language poetry more accessible to a global audience. It includes (1) a translation module that uses an Odds Ratio Preference Alignment Algorithm to accurately translate morphologically rich poetry into English, and (2) an image generation module that employs a semantic graph to capture tokens, dependencies, and semantic relationships between metaphors and their meanings in order to create visually meaningful representations of Indian poems. Our comprehensive experimental evaluation, which includes both human and quantitative assessments, shows that TAI Diffusion outperforms strong baselines on poem image generation tasks. To further address the scarcity of resources for Indian-language poetry, we introduce the Morphologically Rich Indian Language Poems (MorphoVerse) dataset, comprising 1,570 poems across 21 low-resource Indian languages. By addressing the gap in poetry translation and visual comprehension, this work aims to broaden accessibility and enrich the reader's experience.