The intersection of AI and legal systems creates a growing need for tools that support legal education, particularly in under-resourced languages such as Romanian. In this work, we evaluate the capabilities of Large Language Models (LLMs) and Vision-Language Models (VLMs) in understanding and reasoning about Romanian driving law through textual and visual question-answering tasks. To this end, we introduce RoD-TAL, a novel multimodal dataset comprising text-based and image-based Romanian driving test questions, along with annotated legal references and explanations written by human experts. We implement and assess retrieval-augmented generation (RAG) pipelines, dense retrievers, and reasoning-optimized models across tasks including Information Retrieval (IR), Question Answering (QA), Visual IR, and Visual QA. Our experiments demonstrate that domain-specific fine-tuning significantly enhances retrieval performance, while chain-of-thought prompting and specialized reasoning models improve QA accuracy, surpassing the minimum passing grades required for driving exams. We highlight both the potential and the limitations of applying LLMs and VLMs to legal education. We release the code and resources through a GitHub repository.