Vision Transformers (ViTs) have achieved state-of-the-art performance across a wide range of image-based tasks and beyond. In this study, we employ a ViT-based neural network to address indoor pathloss radio map prediction. We evaluate the network's generalization ability across diverse settings, including unseen buildings, frequencies, and antennas with varying radiation patterns. By leveraging extensive data augmentation and pretrained DINOv2 weights, we achieve promising results even under the most challenging scenarios.
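The abstract does not specify which augmentations are used, so the following is only a minimal sketch of one common choice for image-like radio maps: applying the same 90-degree rotation and horizontal flip to the input channels and the pathloss target so that the two stay spatially aligned. The function names `augment_pair` and `random_augment`, the `(C, H, W)` input layout, and the choice of rotations and flips are all assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np


def augment_pair(inputs, target, k=0, flip=False):
    """Apply one geometric transform to inputs and target alike.

    inputs: array of shape (C, H, W), hypothetical stack of input channels.
    target: array of shape (H, W), the pathloss map to predict.
    k:      number of 90-degree rotations.
    flip:   whether to mirror along the width axis after rotating.
    """
    # Rotate in the spatial plane only; the channel axis is left untouched.
    inputs = np.rot90(inputs, k=k, axes=(1, 2))
    target = np.rot90(target, k=k, axes=(0, 1))
    if flip:
        inputs = np.flip(inputs, axis=2)
        target = np.flip(target, axis=1)
    # rot90/flip return views; copy so downstream code gets contiguous arrays.
    return np.ascontiguousarray(inputs), np.ascontiguousarray(target)


def random_augment(inputs, target, rng):
    """Draw a random rotation count and flip flag, then transform the pair."""
    k = int(rng.integers(0, 4))
    flip = bool(rng.integers(0, 2))
    return augment_pair(inputs, target, k=k, flip=flip)


# Usage with synthetic data (shapes chosen arbitrarily).
rng = np.random.default_rng(0)
example_inputs = rng.random((3, 4, 6))
example_target = rng.random((4, 6))
aug_inputs, aug_target = random_augment(example_inputs, example_target, rng)
```

Note that if an input channel encodes a directional antenna pattern or orientation as a scalar angle rather than as a spatial map, that value would also need to be transformed consistently; the sketch above assumes every channel is a spatial map.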