This thesis investigates how grounding transformer models in knowledge representations can benefit natural language understanding and generation. It addresses three key research questions: (i) Can knowledge of entities deliver benefits beyond entity-centric tasks such as entity linking? (ii) How can such structured knowledge be extracted faithfully and effectively from raw text, especially noisy web text? (iii) How do other types of knowledge, beyond structured knowledge, contribute to improving NLP tasks? The studies in this thesis find that incorporating relevant and up-to-date knowledge of entities benefits fake news detection, and that entity-focused code-switching significantly enhances zero-shot cross-lingual transfer on entity-centric tasks. For extracting structured knowledge effectively and faithfully, integrating negative examples and training with entity planning are observed to significantly improve performance. Finally, other general forms of knowledge, such as parametric and distilled knowledge, are shown to enhance multimodal and multilingual knowledge-intensive tasks. Together, these results demonstrate the tangible benefits of integrating diverse knowledge and motivate further exploration in this direction.