We present InterviewBot, which dynamically integrates conversation history and customized topics into a coherent embedding space to conduct 10-minute hybrid-domain (open and closed) conversations with foreign students applying to U.S. colleges, in order to assess their academic and cultural readiness. To build a neural-based end-to-end dialogue model, 7,361 audio recordings of human-to-human interviews are automatically transcribed, of which 440 are manually corrected for fine-tuning and evaluation. To overcome the input/output size limit of a transformer-based encoder-decoder model, two new methods are proposed, context attention and topic storing, which allow the model to make relevant and consistent interactions. Our final model is tested both statistically, by comparing its responses to the interview data, and dynamically, by inviting professional interviewers and various students to interact with it in real time; it is found highly satisfactory in fluency and context awareness.