An end-to-end, interactive Deep Learning based Annotation system for cursive and print English handwritten text

As more work moves to computational devices and digital media, methods that digitize tasks previously carried out by hand are increasingly valuable. Although many documentation tasks can now be done online, handwritten text remains unavoidable in many applications and domains, which makes the digitization of handwritten documents an essential task. Offline handwritten text recognition has been researched extensively over the past decades, and most recent work has shifted to machine learning and deep learning based approaches. Designing deeper, more complex networks that perform well requires larger quantities of annotated data. Most existing databases for offline handwritten text recognition have been annotated either manually or semi-automatically with substantial manual involvement. These processes are time-consuming and prone to human error. To tackle this problem, we present a complete end-to-end pipeline that annotates offline handwritten manuscripts written in both print and cursive English, using deep learning and user interaction techniques. This novel method combines a detection system built upon a state-of-the-art text detection model with a custom-made deep learning model for recognition, together with an easy-to-use interactive interface. It aims to improve the accuracy of the detection, segmentation, serialization and recognition phases, in order to ensure high-quality annotated data with minimal human interaction.
