This paper presents a novel hybrid Automatic Speech Recognition (ASR) system designed for resource-constrained robots. The approach combines Hidden Markov Models (HMMs) with deep learning models and uses socket programming to distribute processing across two devices: HMM-based processing runs on the robot, while a separate PC hosts the deep learning model. Combining HMMs with deep learning in this way improves speech recognition accuracy. We conducted experiments on several robotic platforms, which demonstrated accurate, real-time speech recognition. The system also adapts to changing acoustic conditions and is compatible with low-power hardware, which makes it well suited to environments with limited computational resources. By distributing processing tasks over sockets and pairing HMMs with deep learning models, the hybrid ASR system enables robots to understand and respond to spoken language under these constraints, and it sets an innovative course for improving human-robot interaction across a wide range of real-world scenarios.
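The division of labor described above, with a lightweight stage on the robot and the deep learning model on a PC reached over a socket, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify what the robot transmits, so the sketch assumes the HMM stage yields a short list of candidate transcripts with log-likelihood scores. The message format (length-prefixed JSON) and the names `rescore_with_deep_model`, `robot_request`, and `serve_once` are hypothetical, and the "deep model" is a stub that simply picks the highest-scoring candidate.

```python
import json
import socket
import struct
import threading


def send_msg(sock, payload):
    """Send one JSON message preceded by a 4-byte big-endian length."""
    data = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(data)) + data)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


def rescore_with_deep_model(candidates):
    """Stub for the PC-side deep model (assumption): returns the text of
    the candidate with the highest score."""
    return max(candidates, key=lambda c: c["score"])["text"]


def serve_once(server_sock):
    """PC side: accept one robot connection, answer one request."""
    conn, _ = server_sock.accept()
    with conn:
        request = recv_msg(conn)
        transcript = rescore_with_deep_model(request["candidates"])
        send_msg(conn, {"transcript": transcript})


def robot_request(host, port, candidates):
    """Robot side: send HMM-stage candidates, wait for the transcript."""
    with socket.create_connection((host, port), timeout=5.0) as sock:
        send_msg(sock, {"candidates": candidates})
        return recv_msg(sock)["transcript"]


def demo():
    """Run both roles on localhost; in a deployment they would be on
    separate machines."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
    worker.start()
    # Hypothetical output of the on-robot HMM stage.
    candidates = [
        {"text": "turn left", "score": -12.5},
        {"text": "turn lift", "score": -14.0},
    ]
    result = robot_request("127.0.0.1", port, candidates)
    worker.join(timeout=5.0)
    server.close()
    return result


if __name__ == "__main__":
    print(demo())
```

The length prefix is there because TCP is a byte stream with no message boundaries, so the receiver needs to know how many bytes make up each request. Sending compact intermediate results rather than raw audio is one plausible way to keep the robot's network and compute load low, though the abstract does not say which data the actual system exchanges.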