Judicial reasoning in copyright damage awards poses a core challenge for computational legal analysis. Although federal courts all apply the Copyright Act of 1976, their interpretations of it and the weights they assign to individual factors vary widely across jurisdictions. This inconsistency makes outcomes unpredictable for litigants and obscures the empirical basis of legal decisions. This research introduces a novel discourse-based Large Language Model (LLM) methodology that integrates Rhetorical Structure Theory (RST) with an agentic workflow to extract and quantify previously opaque reasoning patterns from judicial opinions. Our framework addresses a major gap in empirical legal scholarship by parsing opinions into hierarchical discourse structures and processing them through a three-stage pipeline: Dataset Construction, Discourse Analysis, and Agentic Feature Extraction. The pipeline identifies reasoning components and extracts feature labels together with their corresponding discourse subtrees. In an analysis of copyright damage rulings, we show that discourse-augmented LLM analysis outperforms traditional methods and uncovers previously unquantified variations in factor weighting across circuits. These findings offer both methodological advances in computational legal analysis and practical insights into judicial reasoning, with implications for legal practitioners seeking predictive tools, scholars studying how legal principles are applied, and policymakers confronting inconsistencies in copyright law.
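To make the idea of pairing a feature label with its discourse subtree concrete, the following is a minimal Python sketch rather than the pipeline described above. The `DiscourseNode` class, the `smallest_matching_subtree` helper, the relation names, and the example sentences are all hypothetical and invented for illustration, and a plain keyword match stands in for the LLM agent, which this abstract does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DiscourseNode:
    """Hypothetical node of an RST-style discourse tree."""
    relation: str            # rhetorical relation, e.g. "Evidence" or "Cause"
    nuclearity: str          # "nucleus" or "satellite"
    text: str = ""           # set only on leaves (elementary discourse units)
    children: list[DiscourseNode] = field(default_factory=list)

    def leaves_text(self) -> str:
        """Concatenate the text of all leaves under this node, left to right."""
        if not self.children:
            return self.text
        return " ".join(child.leaves_text() for child in self.children)


def smallest_matching_subtree(node: DiscourseNode, keyword: str) -> Optional[DiscourseNode]:
    """Return the deepest (leftmost) subtree whose text contains the keyword.

    Assumption: a keyword match stands in for the agent's judgment.
    """
    if keyword.lower() not in node.leaves_text().lower():
        return None
    for child in node.children:
        hit = smallest_matching_subtree(child, keyword)
        if hit is not None:
            return hit
    return node  # no single child contains the keyword, so this node is the smallest


# Invented example opinion fragment, already parsed into a tiny tree.
root = DiscourseNode("Evidence", "nucleus", children=[
    DiscourseNode("span", "nucleus", text="The court awards statutory damages."),
    DiscourseNode("Cause", "satellite", children=[
        DiscourseNode("span", "nucleus", text="The infringement was willful,"),
        DiscourseNode("span", "satellite", text="because the defendant ignored two notices."),
    ]),
])

# Hypothetical feature labels mapped to the keywords that trigger them.
keywords = {"willfulness": "willful", "award_type": "statutory damages"}
features = {label: smallest_matching_subtree(root, kw) for label, kw in keywords.items()}

for label, subtree in features.items():
    print(label, "->", subtree.leaves_text() if subtree else None)
```

Returning the smallest matching subtree, rather than the whole opinion, keeps each label tied to the specific span of reasoning that supports it, which is the property the discourse-based approach relies on.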