Foundations and Evaluations in NLP

This memoir explores two fundamental aspects of Natural Language Processing (NLP): the creation of linguistic resources and the evaluation of NLP system performance. Over the past decade, my work has focused on developing a morpheme-based annotation scheme for Korean that captures linguistic properties from morphology to semantics. This approach has achieved state-of-the-art results in several NLP tasks, including part-of-speech tagging, dependency parsing, and named entity recognition. The work also provides a comprehensive analysis of segmentation granularity and its critical impact on NLP system performance. In parallel with linguistic resource development, I have proposed a novel evaluation framework, the jp-algorithm, which introduces an alignment-based method to address challenges in preprocessing tasks such as tokenization and sentence boundary detection (SBD). Traditional evaluation methods assume that gold standards and system outputs have identical tokenization and sentence lengths, which limits their applicability to real-world data. The jp-algorithm overcomes this limitation, enabling robust end-to-end evaluation across a variety of NLP tasks. It improves accuracy and flexibility by incorporating linear-time alignment while preserving the computational complexity of traditional evaluation metrics. This memoir provides key insights into the processing of morphologically rich languages such as Korean, and it offers a generalizable framework for evaluating diverse end-to-end NLP systems. My contributions lay a foundation for future work, with broader implications for multilingual resource development and system evaluation.
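To make the evaluation problem concrete, the following is a minimal sketch of scoring labeled tokens when the gold standard and the system output are tokenized differently. It is not the jp-algorithm itself, whose details are not given here. It assumes that both sequences cover the same underlying text, and it aligns tokens by character offsets in a single linear pass. The names `to_spans` and `aligned_f1` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]       # (surface form, label), e.g. ("go", "VERB")
Span = Tuple[int, int, str]   # (start offset, end offset, label)


def to_spans(tokens: List[Token]) -> Set[Span]:
    """Map each token to a character span over the whitespace-free text.

    One pass over the tokens, so the cost is linear in the input size.
    Assumes every token has a non-empty surface form.
    """
    spans: Set[Span] = set()
    offset = 0
    for form, label in tokens:
        length = len(form.replace(" ", ""))
        spans.add((offset, offset + length, label))
        offset += length
    return spans


def aligned_f1(gold: List[Token], system: List[Token]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over spans that match in offsets and label."""
    gold_spans = to_spans(gold)
    sys_spans = to_spans(system)
    matched = len(gold_spans & sys_spans)
    precision = matched / len(sys_spans) if sys_spans else 0.0
    recall = matched / len(gold_spans) if gold_spans else 0.0
    if matched == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: the gold standard splits "don't" into two tokens,
# while the system keeps it as one.
gold = [("do", "AUX"), ("n't", "PART"), ("go", "VERB")]
system = [("don't", "AUX"), ("go", "VERB")]
precision, recall, f1 = aligned_f1(gold, system)
```

In this sketch a token-by-token comparison would be undefined because the two sequences differ in length, whereas the offset-based comparison still credits the one token ("go") that agrees in both boundaries and label.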
