Many industrial sectors rely on well-trained employees who are able to operate complex machinery. In this work, we demonstrate an AI-powered immersive assistance system that supports users in performing complex tasks in industrial environments. Specifically, our system leverages a VR environment that resembles a juice mixer setup. This digital twin of a physical setup simulates complex industrial machinery used to mix preparations or liquids (e.g., as in the pharmaceutical industry) and includes various containers, sensors, pumps, and flow controllers. The setup demonstrates our system's capabilities in a controlled environment while acting as a proof of concept for broader industrial applications. The core components of our multimodal AI assistant are a large language model and a speech-to-text model that process a video and audio recording of an expert performing the task in the VR environment. The video and speech input extracted from the expert's recording enables the assistant to provide step-by-step guidance that supports users in executing complex tasks. This demonstration showcases the potential of our AI-powered assistant to reduce cognitive load, increase productivity, and enhance safety in industrial environments.
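As a rough illustration of how an expert's narration could be turned into step-by-step guidance, the following minimal sketch formats timestamped transcript segments into a prompt and parses numbered steps from a language model's reply. It covers only the speech side of the pipeline and is not the system's actual implementation: the names `TranscriptSegment`, `build_prompt`, `parse_steps`, `extract_steps`, and `fake_llm` are hypothetical, and the language model is represented by a stand-in callable rather than a real API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TranscriptSegment:
    """One utterance from the expert recording (hypothetical structure)."""
    start_s: float  # start time of the utterance in seconds
    text: str       # text assumed to come from a speech-to-text model


def build_prompt(segments: List[TranscriptSegment]) -> str:
    """Format the expert's narration into a prompt asking for numbered steps."""
    lines = [f"[{seg.start_s:.1f}s] {seg.text}" for seg in segments]
    return (
        "Turn this expert narration into numbered, imperative steps, "
        "one per line:\n" + "\n".join(lines)
    )


def parse_steps(llm_output: str) -> List[str]:
    """Extract steps from lines of the form '1. Do something'."""
    steps = []
    for line in llm_output.splitlines():
        head, sep, rest = line.strip().partition(". ")
        if sep and head.isdigit() and rest:
            steps.append(rest.strip())
    return steps


def extract_steps(
    segments: List[TranscriptSegment], llm: Callable[[str], str]
) -> List[str]:
    """Run the prompt through any text-in, text-out model and parse the reply."""
    return parse_steps(llm(build_prompt(segments)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real language model; returns a canned reply (assumption)."""
    return "1. Open the water valve\n2. Start pump A\n"


if __name__ == "__main__":
    narration = [
        TranscriptSegment(0.0, "First I open the water valve."),
        TranscriptSegment(4.2, "Then I start pump A."),
    ]
    for number, step in enumerate(extract_steps(narration, fake_llm), start=1):
        print(f"Step {number}: {step}")
```

Passing the model in as a plain callable keeps the sketch independent of any particular speech-to-text or language model provider; a real deployment would replace `fake_llm` with an actual model call and would also need to incorporate the video input.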