Low-resource languages are invaluable repositories of human history, embodying cultural evolution and intellectual diversity. Despite their significance, these languages face critical challenges, including data scarcity and technological limitations, that hinder their study and preservation. Recent advances in large language models (LLMs) offer new opportunities to address these challenges and enable new methodologies in linguistic, historical, and cultural research. This study systematically evaluates the applications of LLMs in low-resource language research across four areas: linguistic variation, historical documentation, cultural expressions, and literary analysis. By analyzing technical frameworks, current methodologies, and ethical considerations, it identifies key challenges in data accessibility, model adaptability, and cultural sensitivity. Given the cultural, historical, and linguistic richness of low-resource languages, it highlights interdisciplinary collaboration and the development of customized models as promising avenues for advancing research in this domain. By underscoring the potential of integrating artificial intelligence with the humanities to preserve and study humanity's linguistic and cultural heritage, this study aims to support global efforts to safeguard intellectual diversity.