Large pre-trained sequence models, such as transformer-based architectures, have recently been shown to be capable of in-context learning (ICL). In ICL, a decision on a new input is made by directly mapping the input, together with a few examples from the given task that serve as the task's context, to the output variable. No explicit updates of the model parameters are needed to tailor the decision to a new task. Pre-training, which amounts to a form of meta-learning, is based on the observation of examples from several related tasks. Prior work has shown ICL capabilities for linear regression. In this study, we leverage ICL to address the inverse problem of multiple-input and multiple-output (MIMO) equalization based on a context given by pilot symbols. A task is defined by the unknown fading channel and by the signal-to-noise ratio (SNR) level, which may be known. To highlight the practical potential of the approach, we allow for quantization of the received signals. We demonstrate via numerical results that transformer-based ICL exhibits a threshold behavior: as the number of pre-training tasks grows, its performance switches from that of a minimum mean squared error (MMSE) equalizer with a prior determined by the pre-training tasks to that of an MMSE equalizer with the true data-generating prior.
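To make the first of the two reference equalizers concrete, the following is a minimal sketch of an MMSE equalizer whose prior is a uniform distribution over a finite set of channels, standing in for the pre-training tasks. It rests on several assumptions that are not taken from the paper: the received signal is unquantized and follows y = Hx + n with circularly symmetric complex Gaussian noise of known variance `sigma2`, the transmitted symbols are uniform over a finite constellation, and all names (`log_lik`, `mmse_equalize`, `channels`, `ctx_x`, `ctx_y`) are hypothetical.

```python
import itertools
import numpy as np


def log_lik(y, H, x, sigma2):
    """Log-likelihood of y for y = H x + n, n ~ CN(0, sigma2 I), up to a constant."""
    r = y - H @ x
    return -np.real(np.vdot(r, r)) / sigma2


def mmse_equalize(y, ctx_x, ctx_y, channels, constellation, sigma2):
    """Posterior-mean estimate of x given the query y and the pilot context.

    Assumed prior: the channel is uniform over `channels` and each entry of x
    is uniform over `constellation`.
    """
    nt = channels[0].shape[1]
    # Enumerate every candidate transmit vector (feasible only for small nt).
    candidates = np.array(list(itertools.product(constellation, repeat=nt)))

    # Log-likelihood of the pilot context under each candidate channel.
    log_w = np.array([
        sum(log_lik(yi, H, xi, sigma2) for xi, yi in zip(ctx_x, ctx_y))
        for H in channels
    ])

    # Joint log-posterior over (channel, symbol vector), up to a constant.
    log_post = np.array([
        [lw + log_lik(y, H, x, sigma2) for x in candidates]
        for lw, H in zip(log_w, channels)
    ])
    log_post -= log_post.max()  # stabilize the exponentials
    post = np.exp(log_post)
    post /= post.sum()

    # Marginalize over channels, then average the candidates.
    return post.sum(axis=0) @ candidates
```

Replacing the finite list `channels` with samples from, or an integral over, the true fading distribution would give the second reference equalizer under the same assumptions. Handling quantized observations would require swapping the Gaussian likelihood in `log_lik` for the likelihood of the quantizer output, which this sketch does not attempt.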