Document processing automation remains a critical challenge in enterprise environments, where traditional manual approaches are labor-intensive and error-prone. We present MADP, a multi-agent architecture that combines deep learning-based classification and parsing with large language model (LLM) extraction, while maintaining accuracy through selective human validation. The system integrates five specialized agents (Classificator, Splitter, Parser, Extraction, and Validator) with a Human-in-the-Loop (HITL) mechanism and a novel Prompt Fine Tuning with Feedback Inheritance (PFTFI) approach. An operational analysis of a production use case of 100,000 invoices per year indicates a potential reduction in Full-Time Equivalent (FTE) requirements of approximately 70%. In production deployment, 955 real-world documents processed through January 2026 yield a 97.0% full-pipeline automation rate, with the remaining 3% requiring non-AI fallback. An ablation evaluation on a stratified 100-document subset (5 documents from each of 20 supplier/document-type categories) shows that the full MADP configuration with HITL supervision attains 98.5% document-level accuracy. We also present a sustainability analysis showing that the hybrid AI+HITL approach reduces CO2 emissions by 69%, energy consumption by 69%, and water usage by 63% compared with traditional manual processing. Benchmark comparisons of multiple LLM backends (Granite-Docling, Mistral-Small, DeepSeek-OCR) provide practical insights for production deployment.