Large pre-trained sequence models, such as transformer-based architectures, have recently been shown to be capable of in-context learning (ICL). In ICL, a decision on a new input is made by directly mapping the input, together with a few examples from the given task that serve as the task's context, to the output variable. No explicit updates of the model parameters are needed to tailor the decision to a new task. Pre-training, which amounts to a form of meta-learning, is based on the observation of examples from several related tasks. Prior work has shown ICL capabilities for linear regression. In this study, we leverage ICL to address the inverse problem of multiple-input and multiple-output (MIMO) equalization based on a context given by pilot symbols. A task is defined by the unknown fading channel and by the signal-to-noise ratio (SNR) level, which may be known. To highlight the practical potential of the approach, we allow for quantization of the received signals. We demonstrate via numerical results that transformer-based ICL exhibits a threshold behavior: as the number of pre-training tasks grows, the performance switches from that of a minimum mean squared error (MMSE) equalizer with a prior determined by the pre-training tasks to that of an MMSE equalizer with the true data-generating prior.
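To make the first of the two reference equalizers concrete, the following is a minimal sketch of an MMSE equalizer whose prior is the empirical distribution over a finite pool of pre-training channels. It is an illustration under stated assumptions, not the implementation used in this work: the channel entries are taken to be i.i.d. Rayleigh fading, the symbols are QPSK, the noise variance is known, and quantization of the received signals is ignored. All function and variable names are hypothetical.

```python
import itertools
import numpy as np

# Assumed constellation: unit-energy QPSK.
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)


def sample_channel(rng, n_rx, n_tx):
    """Draw an i.i.d. CN(0, 1) fading channel (an assumed channel model)."""
    real = rng.standard_normal((n_rx, n_tx))
    imag = rng.standard_normal((n_rx, n_tx))
    return (real + 1j * imag) / np.sqrt(2)


def transmit(rng, H, X, noise_var):
    """Return Y = H X + Z with Z ~ CN(0, noise_var); no quantization here."""
    shape = (H.shape[0], X.shape[1])
    Z = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return H @ X + np.sqrt(noise_var / 2) * Z


def finite_prior_mmse(pool, X_pilot, Y_pilot, y, noise_var):
    """Posterior mean of x given the pilot context and the query y,
    under a uniform prior over the channels in `pool` and over QPSK vectors."""
    n_tx = pool[0].shape[1]
    # Enumerate every candidate transmit vector, one per column.
    cands = np.array(list(itertools.product(QPSK, repeat=n_tx))).T
    log_w = np.empty((len(pool), cands.shape[1]))
    for m, H in enumerate(pool):
        # Log-likelihood of the pilot context under channel H.
        ctx = -np.sum(np.abs(Y_pilot - H @ X_pilot) ** 2) / noise_var
        # Log-likelihood of the query under each candidate symbol vector.
        qry = -np.sum(np.abs(y[:, None] - H @ cands) ** 2, axis=0) / noise_var
        log_w[m] = ctx + qry
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    # Marginalize over channels, then average the candidates.
    return cands @ w.sum(axis=0)


# Small demonstration: the true channel is one of the pool members.
rng = np.random.default_rng(0)
n_rx, n_tx, n_pilots, noise_var = 4, 2, 4, 1e-3
pool = [sample_channel(rng, n_rx, n_tx) for _ in range(8)]
H_true = pool[3]
X_pilot = rng.choice(QPSK, size=(n_tx, n_pilots))
Y_pilot = transmit(rng, H_true, X_pilot, noise_var)
x_true = rng.choice(QPSK, size=n_tx)
y_query = transmit(rng, H_true, x_true[:, None], noise_var)[:, 0]
x_hat = finite_prior_mmse(pool, X_pilot, Y_pilot, y_query, noise_var)
print(np.round(x_hat, 3), np.round(x_true, 3))
```

When the test channel belongs to the pool, as in this demonstration, the posterior concentrates on it after a few pilots. For a channel drawn from the true fading distribution, a small pool yields a mismatched prior, which is the regime the threshold behavior above contrasts with the MMSE equalizer under the true data-generating prior.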