On-line handwritten character segmentation is often associated with handwriting recognition. Although recognition models include mechanisms to locate relevant positions during the recognition process, these mechanisms are typically insufficient to produce a precise segmentation. Decoupling segmentation from recognition makes it possible to further exploit the recognition result. We focus specifically on the scenario where the transcription is known beforehand, in which case character segmentation becomes an assignment problem between the sampling points of the stylus trajectory and the characters in the text. Inspired by the $k$-means clustering algorithm, we view the problem from the perspective of cluster assignment and present a Transformer-based architecture in which each cluster is formed from a learned character query in the Transformer decoder block. To assess the quality of our approach, we create character segmentation ground truths for two popular on-line handwriting datasets, IAM-OnDB and HANDS-VNOnDB, and evaluate multiple methods on them, demonstrating that our approach achieves the best overall results.
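The cluster-assignment view can be illustrated with a minimal sketch. It assumes plain $k$-means on raw 2D pen coordinates, with one cluster per character of the known transcription. It is not the Transformer-based architecture described above: the learned character queries are replaced here by ordinary centroids, and the function names `assign_points`, `update_centroids`, and `segment` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def assign_points(points: List[Point], centroids: List[Point]) -> List[int]:
    """Assignment step: label each sampling point with its nearest centroid."""
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update_centroids(points: List[Point], labels: List[int],
                     centroids: List[Point]) -> List[Point]:
    """Update step: move each centroid to the mean of its assigned points."""
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            updated.append(old)
            continue
        n = len(members)
        updated.append((sum(p[0] for p in members) / n,
                        sum(p[1] for p in members) / n))
    return updated


def segment(points: List[Point], init_centroids: List[Point],
            n_iter: int = 10) -> List[int]:
    """Alternate assignment and update; return one character index per point."""
    centroids = list(init_centroids)
    labels = assign_points(points, centroids)
    for _ in range(n_iter):
        centroids = update_centroids(points, labels, centroids)
        new_labels = assign_points(points, centroids)
        if new_labels == labels:  # converged
            break
        labels = new_labels
    return labels


# Toy trajectory for a two-character transcription such as "ab":
# three sampling points near x = 0 and three near x = 1.
trajectory = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (1.0, 0.0), (1.1, 0.2), (1.2, 0.1)]
labels = segment(trajectory, init_centroids=[(0.0, 0.0), (1.0, 0.0)])
print(labels)
```

In this toy setting each sampling point receives the index of the character it belongs to, which is the form of output the assignment problem asks for. A learned model would replace the Euclidean distance to centroids with a learned similarity between point features and character queries.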