Previous work has focused predominantly on monolingual English semantic parsing. We instead explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations. We describe a pipeline for automatically collecting linearized Chinese meaning representation data for sequence-to-sequence neural networks. We further propose a test suite designed explicitly for Chinese semantic parsing, which provides a fine-grained evaluation of parsing performance and lets us study the difficulties of Chinese parsing. Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs. Performing Chinese parsing through machine translation followed by an English parser yields slightly lower performance than training a model directly on Chinese data.