Large Language Models (LLMs), such as LLaMA, have exhibited remarkable performance across various tasks. Nevertheless, when deployed in specific domains such as law or medicine, these models still lack domain-specific knowledge and struggle to apply that knowledge to domain-related problems. In this paper, we focus on the legal domain and explore how to inject domain knowledge during the continual training stage and how to design proper supervised fine-tuning tasks that help the model tackle practical issues. Moreover, to alleviate hallucination during generation, we add a retrieval module that extracts relevant legal articles before the model answers a query. Augmented with the retrieved evidence, our model can generate more reliable responses. We release our data and model at https://github.com/AndrewZhe/lawyer-llama.