Analyzing legal judgments and extracting useful information from them using computational linguistics was one of the earliest problems posed in the domain of information retrieval. Several commercial vendors now automate such tasks. However, exorbitant pricing and a lack of resources for analyzing judgments meted out by Hong Kong's legal system create a crucial bottleneck. This paper attempts to bridge this gap by providing several statistical, machine learning, deep learning and zero-shot learning based methods to effectively analyze legal judgments from Hong Kong's court system. The proposed methods consist of: (1) Citation Network Graph Generation, (2) PageRank Algorithm, (3) Keyword Analysis and Summarization, (4) Sentiment Polarity, and (5) Paragraph Classification, which together extract key insights from individual judgments as well as from groups of judgments. This would make the overall analysis of judgments in Hong Kong less tedious and more automated, so that insights can be extracted quickly through fast inference. We also analyze our results by benchmarking them using Large Language Models, making robust use of the HuggingFace ecosystem.
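To make the first two methods listed in the abstract concrete, the following is a minimal sketch of how a citation network might be built and ranked with PageRank. It is not the authors' implementation: the function names `build_citation_graph` and `pagerank`, the damping factor of 0.85, and the case labels are all assumptions chosen for illustration, and the ranking uses a plain power iteration rather than any particular library.

```python
from collections import defaultdict


def build_citation_graph(citations):
    """Build adjacency sets from (citing, cited) pairs.

    Hypothetical helper: the abstract does not say how citations are
    extracted, so the pairs are assumed to be given.
    """
    graph = defaultdict(set)
    for citing, cited in citations:
        if citing != cited:  # ignore self-citations
            graph[citing].add(cited)
        graph.setdefault(citing, set())
        graph.setdefault(cited, set())  # make sure cited cases are nodes too
    return dict(graph)


def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Rank nodes by power iteration (damping=0.85 is an assumed default)."""
    nodes = sorted(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Judgments that cite nothing spread their rank evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            if graph[v]:
                share = damping * rank[v] / len(graph[v])
                for target in graph[v]:
                    new_rank[target] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Made-up example: each pair reads "first case cites second case".
example_citations = [
    ("Case B", "Case A"),
    ("Case C", "Case A"),
    ("Case C", "Case B"),
    ("Case D", "Case A"),
]
example_graph = build_citation_graph(example_citations)
example_ranks = pagerank(example_graph)
most_cited = max(example_ranks, key=example_ranks.get)
```

In this toy graph, the case cited by every other case receives the highest score, which matches the intuition that frequently cited judgments are the more influential ones. A real pipeline would first need a citation extraction step to produce the pairs.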