Recent developments in computer graphics, machine learning, and sensor technologies open up numerous opportunities for extended reality (XR) setups in everyday life, from skills training to entertainment. With large corporations offering affordable consumer-grade head-mounted displays (HMDs), XR is likely to become pervasive, and HMDs are likely to become personal devices like smartphones and tablets. However, intelligent spaces and naturalistic interactions in XR are as important as technological advances for keeping users engaged in virtual and augmented spaces. To this end, large language model (LLM)-powered non-player characters (NPCs), combined with speech-to-text (STT) and text-to-speech (TTS) models, offer significant advantages over conventional or pre-scripted NPCs for enabling more natural conversational user interfaces (CUIs) in XR. In this paper, we provide the community with CUIfy, an open-source, customizable, extensible, and privacy-aware Unity package that facilitates speech-based NPC-user interaction with various LLM, STT, and TTS models. Our package also supports multiple LLM-powered NPCs per environment and minimizes the latency between the different computational models through streaming, achieving usable interactions between users and NPCs. We publish our source code in the following repository: https://gitlab.lrz.de/hctl/cuify
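The abstract mentions streaming between the computational models to minimize latency. The following is a minimal, hypothetical Python sketch (not CUIfy's actual API, which is a Unity/C# package) of the underlying idea: rather than waiting for the LLM's full response before invoking TTS, streamed tokens are grouped into sentences and each sentence is forwarded to TTS as soon as it is complete, reducing the time until the NPC starts speaking. All function names here are stand-ins.

```python
import re

def fake_llm_stream(prompt):
    # Stand-in for a token-streaming LLM endpoint (hypothetical).
    for token in "Hello there traveler. How can I help you today?".split():
        yield token + " "

def sentences_from_stream(token_stream):
    """Group streamed tokens into sentences so TTS can start early."""
    buffer = ""
    for token in token_stream:
        buffer += token
        # Emit a sentence as soon as terminal punctuation is followed by space.
        match = re.search(r"(.+?[.!?])\s+", buffer)
        if match:
            yield match.group(1)
            buffer = buffer[match.end():]
    if buffer.strip():
        yield buffer.strip()

def speak(sentence):
    # Stand-in for a TTS call; in practice this would synthesize and play audio.
    return f"[TTS] {sentence}"

if __name__ == "__main__":
    for sentence in sentences_from_stream(fake_llm_stream("greet the user")):
        print(speak(sentence))
```

With this pattern, the first sentence reaches TTS while the LLM is still generating the rest, which is the latency benefit the streaming design targets.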