Computational replication of Chinese calligraphy remains challenging. Existing methods fall short in one of two ways: they either produce high-quality isolated characters while ignoring page-level aesthetics such as ligatures and spacing, or attempt page synthesis at the expense of calligraphic correctness. We introduce \textbf{UniCalli}, a unified diffusion framework for column-level recognition and generation. Training both tasks jointly is deliberate: recognition constrains the generator to preserve character structure, while generation provides style and layout priors. This synergy fosters concept-level abstractions that improve both tasks, especially in limited-data regimes. We curated a dataset of over 8,000 digitized pieces, roughly 4,000 of which are densely annotated. UniCalli employs asymmetric noising and a rasterized box map that supplies spatial priors, and it is trained on a mix of synthetic, labeled, and unlabeled data. The model achieves state-of-the-art generative quality with superior ligature continuity and layout fidelity, alongside stronger recognition. The framework also extends to other ancient scripts, including oracle bone inscriptions and Egyptian hieroglyphs. Code and data are available at \href{https://github.com/EnVision-Research/UniCalli}{this URL}.
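The abstract names a rasterized box map as the carrier of spatial priors but does not say how it is built. As a rough illustration only, the sketch below assumes the simplest possible reading: each character's axis-aligned bounding box, given in integer pixel coordinates, is painted into a single-channel binary image. The function name `rasterize_boxes`, the `(x0, y0, x1, y1)` box format, and the toy canvas size are all hypothetical and are not taken from the UniCalli code.

```python
import numpy as np

def rasterize_boxes(boxes, height, width):
    """Paint axis-aligned boxes (x0, y0, x1, y1), in integer pixels, into a binary map.

    Assumed format: x1 and y1 are exclusive. This is an illustrative
    convention, not the one used by UniCalli.
    """
    box_map = np.zeros((height, width), dtype=np.float32)
    for x0, y0, x1, y1 in boxes:
        # Clip to the canvas so negative coordinates cannot wrap around.
        x0, x1 = max(0, x0), min(width, x1)
        y0, y1 = max(0, y0), min(height, y1)
        if x0 < x1 and y0 < y1:
            box_map[y0:y1, x0:x1] = 1.0
    return box_map

# Hypothetical column of three stacked character boxes on a 12x4 canvas.
column_boxes = [(0, 0, 4, 3), (0, 4, 4, 7), (1, 8, 3, 11)]
box_map = rasterize_boxes(column_boxes, height=12, width=4)
```

A map like this could be concatenated with the image input as an extra channel, so that the model sees where each character is expected to sit. Whether UniCalli conditions on it in that way is not stated in the abstract.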