Low-resource languages are invaluable repositories of human history, embodying cultural evolution and intellectual diversity. Despite their significance, these languages face critical challenges, including data scarcity and technological limitations, that hinder their comprehensive study and preservation. Recent advances in large language models (LLMs) offer transformative opportunities to address these challenges, enabling new methodologies in linguistic, historical, and cultural research. This study systematically evaluates the applications of LLMs in low-resource language research, covering linguistic variation, historical documentation, cultural expressions, and literary analysis. By analyzing technical frameworks, current methodologies, and ethical considerations, it identifies key challenges such as data accessibility, model adaptability, and cultural sensitivity. Given the cultural, historical, and linguistic richness of low-resource languages, this work highlights interdisciplinary collaboration and the development of customized models as promising avenues for advancing research in this domain. By underscoring the potential of integrating artificial intelligence with the humanities to preserve and study humanity's linguistic and cultural heritage, this study aims to support global efforts to safeguard intellectual diversity.