Conversational AI systems (e.g., Alexa, Siri, Google Assistant) need to understand defective queries to ensure robust conversational understanding and reduce user friction. Defective queries are often caused by user ambiguity and mistakes, or by errors in automatic speech recognition (ASR) and natural language understanding (NLU). Personalized query rewriting (personalized QR) aims to reduce defects in torso and tail user query traffic, and it typically relies on an index of past successful user interactions with the conversational AI. This paper presents our "Collaborative Query Rewriting" approach, which focuses on rewriting novel user interactions unseen in the user history. The approach builds a "user Feedback Interaction Graph" (FIG) of historical user-entity interactions and leverages multi-hop customer affinity to enrich each user's index (i.e., the Collaborative User Index) so that it can cover future unseen defective queries. To counteract the precision degradation caused by the enlarged index, we introduce additional transformer layers in the L1 retrieval model and add multi-hop affinity and guardrail features to the L2 re-ranking model. Given the production constraints of storage cost and runtime retrieval latency, managing the size of the Collaborative User Index is important. Because the user index can be pre-computed, we explore using a Large Language Model (LLM) for multi-hop customer affinity retrieval in the Video/Music domains, focusing on the Dolly-V2 7B model. Under a limited user index size, we found that the user index derived from fine-tuned Dolly-V2 generation significantly enhanced coverage of unseen user interactions. Consequently, it boosted QR performance on unseen user interactions compared to the graph-traversal-based user index.
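The index enrichment described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`build_fig`, `collaborative_index`), the count-product path score, and the `max_size` cap are all assumptions made for illustration. It builds a bipartite user-entity graph from interaction counts, walks user → entity → neighboring user → entity, and keeps only the top-scoring unseen entities so that the enriched index stays within a fixed size budget.

```python
from collections import defaultdict


def build_fig(interactions):
    """Build a bipartite user-entity graph from (user, entity, count) triples.

    Hypothetical helper: the edge weight is simply the summed interaction count.
    """
    user_to_entities = defaultdict(dict)
    entity_to_users = defaultdict(dict)
    for user, entity, count in interactions:
        user_to_entities[user][entity] = user_to_entities[user].get(entity, 0) + count
        entity_to_users[entity][user] = entity_to_users[entity].get(user, 0) + count
    return user_to_entities, entity_to_users


def collaborative_index(user, user_to_entities, entity_to_users, max_size):
    """Return up to max_size entities the user has not interacted with.

    Candidates are reached by a user -> entity -> other user -> entity walk.
    Assumed scoring: sum over paths of the product of the three edge weights.
    """
    own = user_to_entities.get(user, {})
    scores = defaultdict(float)
    for entity, w1 in own.items():
        for neighbor, w2 in entity_to_users[entity].items():
            if neighbor == user:
                continue
            for candidate, w3 in user_to_entities[neighbor].items():
                if candidate not in own:  # keep only entities unseen by this user
                    scores[candidate] += w1 * w2 * w3
    # Sort by descending score, breaking ties by name for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [entity for entity, _ in ranked[:max_size]]


# Toy data, invented purely for illustration.
interactions = [
    ("u1", "song_a", 3), ("u1", "song_b", 1),
    ("u2", "song_a", 2), ("u2", "song_c", 4),
    ("u3", "song_b", 1), ("u3", "song_c", 1), ("u3", "song_d", 1),
]
u2e, e2u = build_fig(interactions)
print(collaborative_index("u1", u2e, e2u, max_size=2))
```

In this toy graph, `song_c` is reachable from `u1` through both `u2` and `u3`, so it outranks `song_d`. Lowering `max_size` trades coverage of unseen entities for a smaller index, which is the storage and latency tension the abstract describes.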