Sequence labeling tasks require computing a sentence representation for each word in a given sentence. A prevalent method adds a Bi-directional Long Short-Term Memory (BiLSTM) layer to enhance the sequence structure information. However, empirical evidence (Li, 2020) suggests that the capacity of BiLSTM to produce sentence representations for sequence labeling tasks is inherently limited. This limitation arises mainly because a complete sentence representation is formed by integrating fragments of the past and future sentence representations. In this study, we observed that the entire sentence representation, found in both the first and last cells of BiLSTM, can supplement the individual sentence representation of each cell. Accordingly, we devised a global context mechanism that integrates the entire future and past sentence representations into each cell's sentence representation within the BiLSTM framework. As a demonstration, we incorporated the BERT model within BiLSTM and conducted extensive experiments on nine datasets for sequence labeling tasks, including named entity recognition (NER), part-of-speech (POS) tagging, and End-to-End Aspect-Based Sentiment Analysis (E2E-ABSA). We observed significant improvements in F1 scores and accuracy across all examined datasets.
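The idea of supplementing each cell with the whole-sentence representation can be illustrated with a minimal sketch. It assumes the per-token forward and backward hidden states are already available as plain lists, and it fuses the global vectors by simple concatenation. Both the function name `add_global_context` and the concatenation step are assumptions made for illustration; the abstract does not specify how the global context is combined with each cell's representation.

```python
from typing import List

Vector = List[float]


def add_global_context(forward_states: List[Vector],
                       backward_states: List[Vector]) -> List[Vector]:
    """Append whole-sentence summaries to every token's BiLSTM output.

    forward_states[t]  : forward hidden state after reading tokens 0..t
    backward_states[t] : backward hidden state after reading tokens n-1..t
    """
    if not forward_states or len(forward_states) != len(backward_states):
        raise ValueError("need two non-empty state sequences of equal length")

    # The forward pass has seen the whole sentence only at the last position,
    # and the backward pass only at the first position.
    global_forward = forward_states[-1]
    global_backward = backward_states[0]

    enriched = []
    for h_fwd, h_bwd in zip(forward_states, backward_states):
        # Local BiLSTM output for this token, followed by the two global
        # vectors (concatenation is an illustrative assumption).
        enriched.append(h_fwd + h_bwd + global_forward + global_backward)
    return enriched


# Toy 3-token sentence with hidden size 2 (values are made up).
forward = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
backward = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.4]]
enriched = add_global_context(forward, backward)
print(len(enriched), len(enriched[0]))  # 3 8
```

In this sketch every token ends up with the same trailing global part, so a downstream tagger would see both the local context at that position and a summary of the full sentence.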