In the recently proposed Lace framework for collective entity resolution, logical rules and constraints are used to identify pairs of entity references (e.g. author or paper ids) that denote the same entity. This identification is global: all occurrences of those entity references (possibly across multiple database tuples) are deemed equal and can be merged. By contrast, a local form of merge is often more natural when identifying pairs of data values, e.g. some occurrences of 'J. Smith' may be equated with 'Joe Smith', while others should merge with 'Jane Smith'. This motivates us to extend Lace with local merges of values and explore the computational properties of the resulting formalism.