Since the introduction of GPT-3 in natural language processing (NLP), in-context learning (ICL) has played an important role in how large language models (LLMs) are used. Given utterance-label demonstrations in its input, an LM can perform few-shot learning without gradient descent or any explicit modification of its parameters, which allows it to learn and adapt in a black-box manner. Despite the success of ICL in NLP, little work has explored ICL in speech processing. This study presents the first exploration of ICL with a speech LM without text supervision. We first show that the current speech LM does not have the ICL capability. With the proposed warmup training, the speech LM can then perform ICL on unseen tasks. In this work, we verify the feasibility of ICL for speech LMs on speech classification tasks.