Semi-Supervised Learning via Cross-Prediction-Powered Inference for Wireless Systems

In many wireless application scenarios, acquiring labeled data can be prohibitively costly, as it requires complex optimization processes or measurement campaigns. Semi-supervised learning leverages unlabeled samples to augment the available dataset by assigning synthetic labels obtained via machine learning (ML)-based predictions. However, treating the synthetic labels as true labels can yield models that perform worse than models trained using only labeled data. Inspired by the recently developed prediction-powered inference (PPI) framework, this work investigates how to leverage the synthetic labels produced by an ML model while accounting for their inherent bias relative to the true labels. To this end, we first review PPI and its recent extensions, namely tuned PPI and cross-prediction-powered inference (CPPI). Then, we introduce two novel variants of PPI. The first, referred to as tuned CPPI, provides CPPI with an additional degree of freedom for adapting to the quality of the ML-based labels. The second, meta-CPPI (MCPPI), extends tuned CPPI via the joint optimization of the ML labeling models and of the parameters of interest. Finally, we showcase two applications of PPI-based techniques in wireless systems, namely beam alignment based on channel knowledge maps in millimeter-wave systems and indoor localization based on received signal strength information. Simulation results show the advantages of PPI-based techniques over conventional approaches that rely solely on labeled data or that apply standard pseudo-labeling strategies from semi-supervised learning. Furthermore, the proposed tuned CPPI method is observed to achieve the best performance among all benchmark schemes, especially in the regime of limited labeled data.
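
To make the bias-correction idea concrete, the following is a minimal sketch of the basic PPI and tuned PPI estimators for the simplest possible target, the mean of the labels. It is not the tuned CPPI or MCPPI method proposed in this work, and it does not implement the cross-fitting used by CPPI. The function names, the synthetic data, and the biased predictor `2x + 0.5` are all illustrative assumptions. The tuning rule is the variance-minimizing choice for this mean-estimation case only.

```python
import random
from statistics import fmean


def ppi_mean(y_lab, f_lab, f_unlab, lam=1.0):
    """Prediction-powered estimate of E[Y].

    y_lab:   true labels of the labeled samples
    f_lab:   ML predictions on the same labeled samples
    f_unlab: ML predictions (synthetic labels) on the unlabeled samples
    lam:     0 -> labeled-only mean, 1 -> standard PPI, other -> tuned PPI
    """
    return fmean(y_lab) + lam * (fmean(f_unlab) - fmean(f_lab))


def tuned_lambda(y_lab, f_lab, n_unlab):
    """Variance-minimizing lam for the mean estimator above (a plug-in estimate)."""
    n = len(y_lab)
    mean_y, mean_f = fmean(y_lab), fmean(f_lab)
    cov = sum((y - mean_y) * (f - mean_f) for y, f in zip(y_lab, f_lab)) / (n - 1)
    var = sum((f - mean_f) ** 2 for f in f_lab) / (n - 1)
    if var == 0.0:
        return 0.0  # constant predictions carry no information
    return cov / ((1.0 + n / n_unlab) * var)


if __name__ == "__main__":
    rng = random.Random(0)

    def predictor(x):
        # Hypothetical ML labeler with a systematic bias of +0.5.
        return 2.0 * x + 0.5

    # Small labeled set, large unlabeled set (assumed sizes).
    x_lab = [rng.gauss(1.0, 1.0) for _ in range(50)]
    y_lab = [2.0 * x + rng.gauss(0.0, 0.3) for x in x_lab]
    x_unlab = [rng.gauss(1.0, 1.0) for _ in range(5000)]

    f_lab = [predictor(x) for x in x_lab]
    f_unlab = [predictor(x) for x in x_unlab]

    lam = tuned_lambda(y_lab, f_lab, len(f_unlab))
    print("labeled only :", fmean(y_lab))
    print("pseudo-labels:", fmean(f_unlab))  # inherits the predictor bias
    print("PPI          :", ppi_mean(y_lab, f_lab, f_unlab))
    print("tuned PPI    :", ppi_mean(y_lab, f_lab, f_unlab, lam))
```

In this sketch, averaging the synthetic labels alone inherits the predictor's bias, whereas the PPI estimate uses the labeled samples to measure and subtract that bias. Setting `lam` between 0 and 1 interpolates between ignoring the predictions and trusting them fully, which is the extra degree of freedom that tuning provides.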
