We present InterviewBot, which dynamically integrates conversation history and customized topics into a coherent embedding space to conduct 10-minute hybrid-domain (open and closed) conversations with foreign students applying to U.S. colleges, in order to assess their academic and cultural readiness. To build a neural end-to-end dialogue model, 7,361 audio recordings of human-to-human interviews are automatically transcribed, of which 440 are manually corrected for finetuning and evaluation. To overcome the input/output size limit of a transformer-based encoder-decoder model, two new methods are proposed, context attention and topic storing, which allow the model to make relevant and consistent interactions. Our final model is tested both statically, by comparing its responses to the interview data, and dynamically, by inviting professional interviewers and various students to interact with it in real time, and is found to be highly satisfactory in fluency and context awareness.