Deep learning methods have shown strong performance on historical document image analysis tasks. However, even with current libraries and frameworks, programming and running an experiment or a set of experiments can be time-consuming. We therefore propose DIVA-DAF, an open-source deep learning framework based on PyTorch Lightning and designed specifically for historical document analysis. Pre-implemented tasks such as segmentation and classification can be used as they are or customized. Users can also create their own tasks and benefit from powerful modules for loading data, including large data sets and different forms of ground truth. In our applications, the framework saved time both when programming a document analysis task and in scenarios such as pre-training or changing the architecture. Thanks to its data module, the framework also significantly reduces model training time.