Predicting User Engagement Status for Online Evaluation of Intelligent Assistants

Evaluation of intelligent assistants in large-scale and online settings remains an open challenge. Online evaluation metrics based on user behavior have proven highly effective for monitoring large-scale web search and recommender systems. Therefore, we consider predicting user engagement status to be the first and critical step toward online evaluation of intelligent assistants. In this work, we first propose a novel framework for classifying user engagement status into four categories: fulfillment, continuation, reformulation, and abandonment. We then demonstrate how to design simple but indicative metrics based on the framework to quantify user engagement levels. We also aim to automate user engagement prediction with machine learning methods. We compare various models and features for predicting engagement status using four real-world datasets. Finally, we conduct detailed analyses of features and failure cases to discuss the performance of current models as well as the remaining challenges.
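To make the idea of a "simple but indicative" metric concrete, the following is a minimal sketch, not the metric defined in the paper. It assumes each assistant turn has already been labeled with one of the four engagement statuses and simply reports the share of turns in each category. The names `EngagementStatus` and `status_rates`, as well as the sample labels, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from enum import Enum


class EngagementStatus(Enum):
    # The four categories named in the abstract.
    FULFILLMENT = "fulfillment"
    CONTINUATION = "continuation"
    REFORMULATION = "reformulation"
    ABANDONMENT = "abandonment"


def status_rates(labels):
    """Return the share of turns that fall into each engagement status.

    This is an assumed, illustrative aggregation, not the paper's metric.
    """
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {status: 0.0 for status in EngagementStatus}
    return {status: counts[status] / total for status in EngagementStatus}


# Made-up labels for six assistant turns (illustration only, not real data).
turns = [
    EngagementStatus.FULFILLMENT,
    EngagementStatus.FULFILLMENT,
    EngagementStatus.CONTINUATION,
    EngagementStatus.REFORMULATION,
    EngagementStatus.ABANDONMENT,
    EngagementStatus.FULFILLMENT,
]
rates = status_rates(turns)
print(rates[EngagementStatus.FULFILLMENT])  # 0.5
```

Under this assumed aggregation, a rising abandonment or reformulation share would suggest declining engagement, while a rising fulfillment share would suggest the opposite. In practice the labels would come from a trained classifier rather than from hand annotation.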

