In this paper, we present PARAMANU-AYN, a language model trained exclusively on case documents of the Supreme Court of India, the Constitution of India, and the Indian Penal Code. The novel autoregressive (AR) decoder-based model is pretrained from scratch at a context size of 8192. We evaluated the pretrained legal model on perplexity metrics. We also instruction-tuned the pretrained model on a set of 10,763 instructions covering legal tasks such as legal reasoning, judgement explanation, legal clause generation, legal drafting, legal contract drafting, case summarization, and constitutional question-answering. We then used GPT-3.5-Turbo to score the instruction-tuned models' responses to prompts on clarity, relevance, completeness, and legal reasoning, each on a scale of 10. Our model can run on a CPU, where it achieved an inference speed of 42.46 tokens/sec. We found that our models, despite not being pretrained on legal books, legal contracts, or other legal documents, learned the domain knowledge required to draft various legal contracts and legal clauses, and generalized to these drafting tasks with limited instruction tuning. Hence, we conclude that developing a strong domain-specialized generative language model from scratch (for example, in the legal domain) does not require very large amounts of data. We believe this work is the first attempt to build a dedicated generative legal language model from scratch, both for the Indian Supreme Court jurisdiction and in legal NLP overall. We plan to release our Paramanu-Ayn model at https://www.bharatgpts.com.
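The abstract reports perplexity and CPU inference speed without defining either quantity. As a minimal sketch of the standard definitions, assuming per-token natural-log probabilities are already available from some model, the two metrics could be computed as below. The function names `perplexity` and `tokens_per_second` are hypothetical and are not taken from the authors' evaluation code.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the negative mean natural-log probability per token.

    `token_log_probs` is assumed to hold log p(token_i | preceding tokens)
    for every token in the evaluation text.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def tokens_per_second(num_tokens, elapsed_seconds):
    """Return generation throughput: tokens produced divided by wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


if __name__ == "__main__":
    # A model that assigns probability 0.25 to every token is as uncertain
    # as a uniform choice among four options.
    uniform_log_probs = [math.log(0.25)] * 8
    print(perplexity(uniform_log_probs))

    # Illustrative numbers only, not measurements from the paper.
    print(tokens_per_second(num_tokens=512, elapsed_seconds=16.0))
```

Lower perplexity means the model assigns higher probability to the held-out text. Throughput figures depend on hardware, model size, and decoding settings, so they are comparable only when measured under the same conditions.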