Speech recognition is a critical task in artificial intelligence and has advanced remarkably thanks to large, complex neural networks, whose training typically requires massive amounts of labeled data and computationally intensive operations. An alternative paradigm, reservoir computing, is energy efficient and well suited to implementation in physical substrates, but its performance falls short of more resource-intensive machine learning algorithms. In this work we address this challenge by investigating different architectures of interconnected reservoirs, all falling under the umbrella of deep reservoir computing. We propose a photonic deep reservoir computer and evaluate its effectiveness on several speech recognition tasks. We present specific design choices that aim to simplify the practical implementation of a reservoir computer while achieving high-speed processing of high-dimensional audio signals. Overall, we hope this work helps advance low-power, high-performance neuromorphic hardware.