Stroke Extraction of Chinese Character Based on Deep Structure Deformable Image Registration

Stroke extraction of Chinese characters plays an important role in character recognition and generation. Most existing stroke extraction methods focus on image morphological features. Because they rarely use stroke semantics and prior information, these methods often make errors when extracting crossing strokes and when matching strokes. In this paper, we propose a deep learning-based character stroke extraction method that takes semantic features and prior information of strokes into consideration. The method consists of three parts: image registration-based stroke registration, which establishes a rough registration between the reference strokes and the target as prior information; image semantic segmentation-based stroke segmentation, which preliminarily separates the target strokes into seven categories; and high-precision extraction of single strokes. For stroke registration, we propose a structure deformable image registration network that achieves structure-deformable transformation while keeping the morphology of single strokes stable for character images with complex structures. To verify the effectiveness of the method, we construct two datasets, one for calligraphy characters and one for regular handwriting characters. The experimental results show that our method strongly outperforms the baselines. Code is available at https://github.com/MengLi-l1/StrokeExtraction.

Related content

Image registration

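Image registration is a classic problem and a technical challenge in image processing research. Its goal is to compare or fuse images of the same object acquired under different conditions, for example from different acquisition devices, at different times, or from different viewpoints; sometimes registration between images of different objects is also needed. Specifically, for two images in a dataset, registration seeks a spatial transformation that maps one image onto the other so that points corresponding to the same spatial location in the two images are put into one-to-one correspondence, which makes information fusion possible. The technique is widely used in computer vision, medical image processing, and mechanics of materials. Depending on the application, some work focuses on fusing the two images through the transformation result, while other work studies the transformation itself to obtain mechanical properties of the object.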
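
The idea of searching for a spatial transformation that maps one image onto another can be illustrated with a very small sketch. The example below is an assumption-laden toy, not the structure deformable registration network from the paper: it only considers integer translations, scores each candidate with the sum of squared differences, and uses hypothetical helper names (`shift_image`, `ssd`, `register_translation`) chosen for illustration.

```python
from itertools import product


def shift_image(image, dy, dx, fill=0):
    """Translate a 2D image (list of rows) by (dy, dx), padding with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def ssd(a, b):
    """Sum of squared differences between two equally sized images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def register_translation(moving, fixed, max_shift=3):
    """Try every integer shift in range and return the (dy, dx) with the lowest SSD."""
    candidates = product(range(-max_shift, max_shift + 1), repeat=2)
    return min(candidates, key=lambda s: ssd(shift_image(moving, s[0], s[1]), fixed))


# Toy data (assumed): a short horizontal stroke, and the same stroke
# moved 2 rows down and 1 column to the right.
reference = [[0] * 8 for _ in range(8)]
for x in range(2, 5):
    reference[2][x] = 1
target = shift_image(reference, 2, 1)

best_shift = register_translation(reference, target)
print(best_shift)  # (2, 1)
```

Real registration methods replace the exhaustive translation search with richer transformation models (affine, or dense deformation fields) and with optimization or learned networks, but the structure is the same: a transformation family, a similarity measure, and a search for the best match.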
