In this paper, we propose a pipeline that uses Large Language Models (LLMs) for data augmentation in Information Extraction tasks in the legal domain. The method is simple and effective: it significantly reduces the manual effort required for data annotation while improving the robustness of Information Extraction systems. It is also general, and can be applied to Natural Language Processing (NLP) tasks beyond the legal domain.