Large Language Models in Cryptocurrency Securities Cases: Can a GPT Model Meaningfully Assist Lawyers?

Large Language Models (LLMs) could be a useful tool for lawyers. However, empirical research on their effectiveness in conducting legal tasks is scant. We study securities cases involving cryptocurrencies as one of numerous contexts where AI could support the legal process, focusing on GPT-3.5's legal reasoning and ChatGPT's legal drafting capabilities. We examine a) whether GPT-3.5 can accurately determine which laws are potentially being violated from a fact pattern, and b) whether juror decision-making differs between complaints written by a lawyer and complaints written by ChatGPT. First, we fed fact patterns from real-life cases to GPT-3.5 and evaluated its ability to identify the correct potential violations in each scenario and to exclude spurious ones. Second, we had mock jurors assess complaints written by ChatGPT and by lawyers. GPT-3.5's legal reasoning skills proved weak, though we expect improvement in future models, particularly given that the violations it suggested tended to be correct (it merely missed additional, correct violations). ChatGPT performed better at legal drafting, and jurors' decisions were not statistically significantly associated with the author of the document upon which they based their decisions. Because GPT-3.5 cannot satisfactorily conduct legal reasoning tasks, it is unlikely to help lawyers in a meaningful way at this stage. However, ChatGPT's drafting skills (though perhaps still inferior to lawyers') could assist lawyers in providing legal services. Our research is the first to systematically study an LLM's legal drafting and reasoning capabilities in litigation, as well as in securities law and cryptocurrency-related misconduct.
