In this article we present a novel system for natural language generation (NLG) of Spanish sentences from a minimal set of meaningful words (such as nouns, verbs and adjectives). Unlike other state-of-the-art solutions, it performs the NLG task fully automatically, combining knowledge-based and statistical approaches. Relying on its linguistic knowledge of vocabulary and grammar, the system generates complete, coherent and correctly spelled sentences from the main word sets supplied by the user. The system was designed to be integrable, portable and efficient, so it can be easily adapted to other languages and feasibly integrated into a wide range of digital devices. During its development we also created aLexiS, a supplementary lexicon for Spanish with wide coverage and high precision, as well as syntactic trees derived from a freely available definite-clause grammar. The resulting NLG library has been evaluated both automatically and manually (through human annotation). The system can potentially be used in application domains such as augmentative communication and the automatic generation of administrative reports or news.