Communication barriers pose significant challenges for individuals with hearing and speech impairments, often limiting their ability to interact effectively in everyday environments. This project introduces a real-time assistive technology solution that leverages deep learning to translate sign language gestures into text and audible speech. A convolutional neural network (CNN) trained on the Sign Language MNIST dataset classifies hand gestures captured live via webcam. Detected gestures are immediately translated into their corresponding meanings and rendered as spoken output through text-to-speech synthesis, facilitating seamless communication. Comprehensive experiments demonstrate high classification accuracy and robust real-time performance with only modest latency, highlighting the system's practical applicability as an accessible, reliable, and user-friendly tool for enhancing the autonomy and social integration of sign language users in diverse settings.
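To make the described pipeline concrete, the sketch below shows one plausible form of the inference loop: OpenCV webcam capture, a Keras CNN trained on Sign Language MNIST, and pyttsx3 for offline speech output. The model path `sign_cnn.h5`, the region-of-interest coordinates, the confidence threshold, and the 24-letter label mapping are illustrative assumptions, not the system's confirmed implementation details.

```python
# Minimal sketch of the sign-to-speech inference loop, assuming a Keras CNN
# already trained on Sign Language MNIST (28x28 grayscale images of static
# letters; J and Z are excluded because they require motion). Assumes the
# training labels were remapped to 24 consecutive classes in this order.
import cv2
import numpy as np
import pyttsx3
from tensorflow.keras.models import load_model

LABELS = list("ABCDEFGHIKLMNOPQRSTUVWXY")  # 24 static letters, no J or Z

model = load_model("sign_cnn.h5")  # hypothetical path to the trained CNN
engine = pyttsx3.init()            # offline text-to-speech engine

cap = cv2.VideoCapture(0)
last_spoken = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Crop a fixed region of interest where the hand is expected, then match
    # the dataset's input format: 28x28 grayscale, scaled to [0, 1].
    roi = frame[100:400, 100:400]
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(gray, (28, 28)).astype("float32") / 255.0
    probs = model.predict(img.reshape(1, 28, 28, 1), verbose=0)[0]
    letter = LABELS[int(np.argmax(probs))]
    # Speak only confident, newly changed predictions so the same letter
    # is not repeated aloud on every frame.
    if probs.max() > 0.9 and letter != last_spoken:
        engine.say(letter)
        engine.runAndWait()
        last_spoken = letter
    cv2.putText(frame, letter, (100, 90),
                cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 3)
    cv2.imshow("sign-to-speech", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```

The fixed crop and per-frame `predict` call keep the sketch simple; a production loop would more likely use a hand detector to locate the region of interest and batch or throttle predictions to reduce the latency noted in the experiments.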