Developing socially competent robots requires tight integration of robotics, computer vision, speech processing, and web technologies. We present the Socially-interactive Robot Software platform (SROS), an open-source framework that addresses this need through a modular, layered architecture. SROS connects the Robot Operating System (ROS) layer, which handles mobility, to web and Android interface layers using standard messaging and APIs. Specialized perceptual and interactive skills are implemented as ROS services so that they can be reused on any robot. This facilitates rapid prototyping of collaborative behaviors that synchronize perception with physical actuation. We experimentally validated core SROS technologies, including computer vision, speech processing, and GPT-2-based speech autocompletion, each implemented as a plug-and-play ROS service. We demonstrate modularity by integrating an additional ROS package without changing the hardware or software platform. The resulting capabilities confirm the effectiveness of SROS for developing socially interactive robots through synchronized cross-domain interaction. Through demonstrations of synchronized multimodal behaviors on an example platform, we illustrate how the SROS architecture addresses shortcomings of previous work: it lowers the barrier for researchers to build novel applications that integrate perceptual and social abilities, and thereby to advance the state of the art in adaptive, collaborative, customizable human-robot systems.