Understanding and predicting judicial outcomes demand nuanced analysis of legal documents. Traditional approaches treat judgments and proceedings as unstructured text, which limits the effectiveness of large language models (LLMs) in tasks such as summarization, argument generation, and judgment prediction. We propose LexChronos, an agentic framework that iteratively extracts structured event timelines from Supreme Court of India judgments. LexChronos employs a dual-agent architecture: an extraction agent, instruction-tuned with LoRA, identifies candidate events, while a pre-trained feedback agent scores and refines them through a confidence-driven loop. To address the scarcity of Indian legal event datasets, we construct a synthetic corpus of 2,000 samples using reverse-engineering techniques with DeepSeek-R1 and GPT-4, generating gold-standard event annotations. Our pipeline achieves a BERT-based F1 score of 0.8751 against this synthetic ground truth. In downstream evaluations on legal text summarization, GPT-4 preferred summaries based on structured timelines over unstructured baselines in 75% of cases, indicating improved comprehension and reasoning in Indian jurisprudence. By harnessing structured representations of legal events, this work lays a foundation for future legal AI applications in the Indian context, such as precedent mapping, argument synthesis, and predictive judgment modelling.
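The confidence-driven loop between the extraction agent and the feedback agent can be illustrated with a minimal Python sketch. It is not the LexChronos implementation: the `Event` fields, the `extract_timeline` function, the 0.8 threshold, the round limit, and the toy agents are all assumptions made for illustration, with plain functions standing in for the LoRA-tuned extractor and the pre-trained scorer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Event:
    """Hypothetical event record: an ISO date and a short description."""
    date: str
    description: str


# Hypothetical agent interfaces (assumptions, not taken from the paper):
# the extractor maps (judgment text, feedback) to candidate events,
# the scorer maps (judgment text, events) to (confidence, feedback).
Extractor = Callable[[str, str], List[Event]]
Scorer = Callable[[str, List[Event]], Tuple[float, str]]


def extract_timeline(text: str, extract: Extractor, score: Scorer,
                     threshold: float = 0.8, max_rounds: int = 3):
    """Re-run extraction with feedback until confidence reaches the threshold."""
    best_events: List[Event] = []
    best_conf = -1.0
    feedback = ""
    for _ in range(max_rounds):
        events = extract(text, feedback)
        conf, feedback = score(text, events)
        if conf > best_conf:  # keep the best-scoring candidate set so far
            best_events, best_conf = events, conf
        if conf >= threshold:  # confident enough, stop refining
            break
    # ISO dates sort chronologically as strings
    return sorted(best_events, key=lambda e: e.date), best_conf


# Toy stand-ins for the two agents, for demonstration only.
def toy_extract(text: str, feedback: str) -> List[Event]:
    events = [Event("2019-03-01", "Appeal filed")]
    if feedback:  # second pass: act on the scorer's feedback
        events.append(Event("2018-07-15", "High Court judgment delivered"))
    return events


def toy_score(text: str, events: List[Event]) -> Tuple[float, str]:
    if len(events) < 2:
        return 0.5, "The High Court judgment date is missing."
    return 0.9, ""


timeline, confidence = extract_timeline("(judgment text)", toy_extract, toy_score)
```

In this toy run the first pass scores below the threshold, the feedback triggers a second pass, and the loop stops once the scorer's confidence exceeds 0.8. Keeping the best-scoring candidate set guards against a later round producing a worse extraction.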