Hedging is a strategy for softening the impact of a statement in conversation. In reducing the strength of an expression, it may help to avoid embarrassment (more technically, "face threat") to one's listener. For this reason, it is often found in contexts of instruction, such as tutoring. In this work, we develop a model of hedge generation based on i) fine-tuning state-of-the-art language models trained on human-human tutoring data, followed by ii) reranking to select the candidate that best matches the expected hedging strategy within a candidate pool using a hedge classifier. We apply this method to a natural peer-tutoring corpus containing a significant number of disfluencies, repetitions, and repairs. The results show that generation in this noisy environment is feasible with reranking. By conducting an error analysis for both approaches, we reveal the challenges faced by systems attempting to accomplish both social and task-oriented goals in conversation.
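The reranking step described in the abstract can be illustrated with a minimal sketch. It assumes a candidate pool has already been produced by a fine-tuned generator and that a hedge classifier returns a score per strategy label. The names `rerank` and `toy_hedge_classifier`, the labels `"hedge"` and `"no_hedge"`, and the keyword list are hypothetical stand-ins for illustration, not the authors' implementation or label set.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a classifier maps an utterance to per-label scores.
Classifier = Callable[[str], Dict[str, float]]


def rerank(candidates: Sequence[str], expected_label: str,
           classify: Classifier) -> str:
    """Return the candidate with the highest score for expected_label.

    Ties are resolved in favor of the earliest candidate in the pool.
    """
    if not candidates:
        raise ValueError("candidate pool is empty")
    return max(candidates, key=lambda c: classify(c).get(expected_label, 0.0))


def toy_hedge_classifier(utterance: str) -> Dict[str, float]:
    """Toy stand-in for a trained hedge classifier (assumption).

    It only checks for a few surface markers; a real classifier would be
    a model trained on annotated tutoring dialogue.
    """
    markers = ("maybe", "i think", "kind of", "i guess")
    text = utterance.lower()
    score = 1.0 if any(m in text for m in markers) else 0.0
    return {"hedge": score, "no_hedge": 1.0 - score}


# Candidate pool, standing in for samples from a fine-tuned generator.
pool = [
    "You need to divide both sides by two.",
    "I think you maybe divide both sides by two?",
    "Divide by two.",
]

best = rerank(pool, "hedge", toy_hedge_classifier)
```

Here `best` is the second candidate, the only one the toy classifier scores as a hedge. Keeping the classifier behind a plain callable means the generator and the classifier can be swapped independently, which seems to match the two-stage design the abstract describes.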