Despite the inherently fuzzy nature of reconstructions in historical linguistics, most scholars do not represent their uncertainty when proposing proto-forms. With the increasing success of recently proposed approaches to automating certain aspects of the traditional comparative method, the formal representation of proto-forms has also improved. This formalization makes it possible to address both the representation and the computation of uncertainty. Building on recent advances in supervised phonological reconstruction, in which an algorithm learns from previously annotated data how to reconstruct words in a given proto-language, and inspired by improved methods for automated word prediction from cognate sets, we present a new framework that allows for the representation of uncertainty in linguistic reconstruction and includes a workflow for computing fuzzy reconstructions from linguistic data.