With advances in deep learning, text generation is attracting increasing interest in the artificial intelligence (AI) community, both for its wide range of applications and because it is an essential component of AI. Traditional text generation systems are trained in a supervised way and require massive labeled parallel corpora. In this paper, I introduce our recent work on search and learning approaches to unsupervised text generation, where a heuristic objective function estimates the quality of a candidate sentence, and a discrete search algorithm generates a sentence by maximizing that objective. A machine learning model then learns from the search results to smooth out noise and improve inference efficiency. Our approach matters to industry because it allows a minimum viable product to be built for a new task without labeled data; it also has social impact, since it saves human annotation labor and supports the processing of low-resource languages.
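To make the search component concrete, the following is a minimal sketch of objective-driven discrete search over word-level edits. It is an illustration under stated assumptions, not the method from our work: the objective here (keyword coverage minus a length penalty) is a toy stand-in for a real heuristic scorer, greedy hill climbing stands in for whichever discrete search algorithm is used, and the names `score`, `neighbors`, and `hill_climb` are hypothetical. The learning stage, in which a model is trained on the search outputs, is omitted.

```python
def score(words, keywords, target_len):
    """Toy heuristic objective (an assumption for illustration):
    reward keyword coverage, penalize distance from a target length."""
    coverage = sum(1 for k in keywords if k in words)
    length_penalty = abs(len(words) - target_len)
    return coverage - 0.5 * length_penalty


def neighbors(words, vocab):
    """Yield every candidate one word-level edit away from `words`."""
    for i in range(len(words)):
        # Deletion of the word at position i.
        yield words[:i] + words[i + 1:]
        # Replacement of the word at position i.
        for w in vocab:
            if w != words[i]:
                yield words[:i] + [w] + words[i + 1:]
    # Insertion of a vocabulary word at any position.
    for i in range(len(words) + 1):
        for w in vocab:
            yield words[:i] + [w] + words[i:]


def hill_climb(init, keywords, target_len, vocab, max_steps=50):
    """Greedy local search: move to the best-scoring neighbor
    until no single edit improves the objective."""
    current = list(init)
    current_score = score(current, keywords, target_len)
    for _ in range(max_steps):
        best, best_score = current, current_score
        for cand in neighbors(current, vocab):
            s = score(cand, keywords, target_len)
            if s > best_score:
                best, best_score = cand, s
        if best is current:  # local optimum reached
            break
        current, current_score = best, best_score
    return current, current_score


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    result, final_score = hill_climb(
        sentence, keywords={"fox", "dog"}, target_len=4,
        vocab=["fox", "dog", "jumps"],
    )
    print(" ".join(result), final_score)
```

In this toy setting the search behaves like unsupervised sentence compression: it removes words while keeping the keywords, and stops once no single edit raises the score. A realistic objective would also need a fluency term, for example a language model score, which is left out here to keep the sketch self-contained.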