Vision Outlooker improves the performance of vision transformers, which implement a self-attention mechanism, by adding outlook attention, a form of local attention. In natural language processing, as in computer vision and other domains, transformer-based models constitute the state of the art for most processing tasks. In this domain, too, many authors have argued for and demonstrated the importance of local context. We present COOL, an outlook attention mechanism for natural language processing. Added on top of the self-attention layers of a transformer-based model, COOL encodes local syntactic context by considering word proximity and more pair-wise constraints than the dynamic convolution used by existing approaches. A comparative empirical evaluation of an implementation of COOL with different transformer-based models confirms the opportunity for improvement over a baseline that uses the original models alone on various natural language processing tasks, including question answering. The proposed approach achieves performance competitive with existing state-of-the-art methods on some tasks.
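To make the idea of outlook attention over a token sequence concrete, the following is a minimal single-head sketch in NumPy. It is not the COOL implementation: it adapts the general outlook attention pattern (attention weights generated directly from each token by a linear projection, applied to a local window of values, then folded back by summing overlapping windows) to a one-dimensional sequence. The function name `outlook_attention_1d`, the weight matrices `w_v` and `w_a`, the window size `k`, and the zero padding at the sequence boundaries are all assumptions made for illustration, and scaling of the attention logits and multiple heads are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def outlook_attention_1d(x, w_v, w_a, k=3):
    """Hypothetical 1D outlook attention (illustrative sketch only).

    x   : (n, d) token representations, e.g. the output of a self-attention layer
    w_v : (d, d) value projection (assumed name)
    w_a : (d, k * k) projection that generates the local attention weights
    k   : odd window size; each token attends over k neighbouring positions
    """
    n, d = x.shape
    pad = k // 2

    # Project tokens to values and zero-pad the sequence boundaries.
    v = x @ w_v                                   # (n, d)
    v_pad = np.pad(v, ((pad, pad), (0, 0)))       # (n + 2 * pad, d)

    # Unfold: windows[i] holds the values at positions i - pad .. i + pad.
    windows = np.stack([v_pad[i:i + k] for i in range(n)])   # (n, k, d)

    # Attention weights come from the centre token alone, with no
    # query-key dot product; each row of the k x k matrix is normalised.
    attn = softmax((x @ w_a).reshape(n, k, k), axis=-1)      # (n, k, k)

    # Mix the values inside each window.
    mixed = attn @ windows                                   # (n, k, d)

    # Fold: sum the overlapping windows back onto sequence positions.
    out_pad = np.zeros((n + 2 * pad, d))
    for i in range(n):
        out_pad[i:i + k] += mixed[i]
    return out_pad[pad:pad + n]                              # (n, d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, k = 8, 4, 3
    x = rng.normal(size=(n, d))
    w_v = rng.normal(size=(d, d))
    w_a = rng.normal(size=(d, k * k))
    print(outlook_attention_1d(x, w_v, w_a, k).shape)
```

In this sketch, each output position depends only on tokens within `k - 1` positions of it, which is the local, proximity-based behaviour the abstract contrasts with global self-attention. The `k x k` weight matrix per token is one way to read "more pair-wise constraints" than a convolution kernel with `k` weights, though that reading is an interpretation and not a statement from the source.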