Regular-pattern-sensitive CRFs for Distant Label Interactions

Linear-chain conditional random fields (CRFs) are a common model component for sequence labeling tasks in which modeling the interactions between labels is important. However, the Markov assumption limits linear-chain CRFs to directly modeling interactions between adjacent labels only. Weighted finite-state transducers (FSTs) are a related approach that can be made to model distant label-label interactions, but exact label inference is intractable for these models in the general case, and selecting an appropriate automaton structure for the desired interaction types poses a practical challenge. In this work, we present regular-pattern-sensitive CRFs (RPCRFs), a method of enriching standard linear-chain CRFs with the ability to learn long-distance label interactions that occur in user-specified patterns. Users write regular-expression label patterns that concisely specify which types of interactions the model should take into account, and the model learns from data whether and in which contexts these patterns occur. The result can be interpreted either as a CRF augmented with additional non-local potentials or as a finite-state transducer whose structure is defined by a set of easily interpretable patterns. Critically, unlike the general case for FSTs (and for non-chain CRFs), exact training and inference are tractable for many pattern sets. We detail how an RPCRF can be automatically constructed from a set of user-specified patterns and demonstrate the model's effectiveness on synthetic data, showing how different types of patterns can capture different non-local dependency structures in label sequences.
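
To make the idea of a pattern-sensitive potential concrete, the following is a minimal sketch and not the construction described in the paper. It assumes a three-label alphabet, a single label pattern `A O* B`, and a hand-written deterministic automaton for that pattern (the paper builds the automaton automatically from user-specified patterns). The names `LABELS`, `dfa_step`, `ACCEPT`, `viterbi`, and `pattern_weight` are hypothetical. Decoding runs exact Viterbi over pairs of (label, automaton state), so a bonus can be added each time the pattern completes, however far apart the `A` and the `B` are. Training is not shown.

```python
LABELS = ["A", "B", "O"]

# Hypothetical hand-built DFA recognising label sequences that END in "A O* B".
# State 0: no open match. State 1: saw A, then only O's. State 2: match just completed.
def dfa_step(state, label):
    if label == "A":
        return 1
    if label == "O" and state == 1:
        return 1
    if label == "B" and state == 1:
        return 2
    return 0

ACCEPT = {2}

def viterbi(emissions, transitions, pattern_weight):
    """Exact decoding over the product of labels and DFA states.

    emissions: list of dicts mapping label -> score, one dict per position.
    transitions: dict mapping (prev_label, label) -> score (missing pairs score 0).
    pattern_weight: bonus added whenever the DFA enters an accepting state.
    Returns (best_score, best_label_sequence).
    """
    # chart maps (label, dfa_state) -> (best score so far, label path)
    chart = {}
    for y in LABELS:
        q = dfa_step(0, y)
        score = emissions[0][y] + (pattern_weight if q in ACCEPT else 0.0)
        chart[(y, q)] = (score, [y])
    for emit in emissions[1:]:
        new_chart = {}
        for (prev, q_prev), (score, path) in chart.items():
            for y in LABELS:
                q = dfa_step(q_prev, y)
                s = score + transitions.get((prev, y), 0.0) + emit[y]
                if q in ACCEPT:
                    s += pattern_weight  # non-local potential fires here
                if (y, q) not in new_chart or s > new_chart[(y, q)][0]:
                    new_chart[(y, q)] = (s, path + [y])
        chart = new_chart
    return max(chart.values(), key=lambda v: v[0])

# Toy scores: the last position slightly prefers O over B on local evidence alone.
emissions = [
    {"A": 2.0, "B": 0.0, "O": 0.0},
    {"A": 0.0, "B": 0.0, "O": 2.0},
    {"A": 0.0, "B": 0.5, "O": 1.0},
]
print(viterbi(emissions, {}, 0.0))  # (5.0, ['A', 'O', 'O'])
print(viterbi(emissions, {}, 2.0))  # (6.5, ['A', 'O', 'B'])
```

With the pattern weight set to zero the model reduces to an ordinary linear-chain decoder. A positive weight changes the final label to `B` because completing the pattern is rewarded. The search space grows only by a factor equal to the number of automaton states, which is why decoding stays exact in this toy setting.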
