Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have taken the world by storm with impressive abilities in complex reasoning and linguistic comprehension. While there is a plethora of work on Vietnamese Large Language Models, the lack of high-quality multimodal resources limits the progress of Vietnamese MLLMs. In this paper, we pioneer an effort to address this gap by introducing LaVy, a state-of-the-art Vietnamese MLLM, and LaVy-Bench, a benchmark designed for evaluating MLLMs' understanding of Vietnamese visual language tasks. Our project is public at https://github.com/baochi0212/LaVy