Large-scale conversational systems typically rely on a skill-routing component to route each user request to an appropriate skill and interpretation. In such systems, the agent serves thousands of skills and interpretations, and the natural frequency of requests produces a long-tail distribution over them. For example, samples related to playing music might be a thousand times more frequent than those asking for theatre show times. Moreover, the inputs to ML-based skill routing are often a heterogeneous mix of strings, embedding vectors, and categorical and scalar features, which makes augmentation-based long-tail learning approaches challenging to apply. To improve skill-routing robustness, we propose a method for augmenting heterogeneous skill-routing data, together with a training procedure targeted at robust operation in long-tail data regimes. We explore a variety of conditional encoder-decoder generative frameworks to perturb original data fields and create synthetic training data. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments using real-world data from a commercial conversational system. In the skill-routing replication task, the proposed approach improves more than 80% (51 out of 63) of the intents that have fewer than 10K traffic instances.