In this paper, we incorporate physical knowledge into learning-based high-precision target sensing using the multi-view channel state information (CSI) between multiple base stations (BSs) and user equipment (UEs). Such a multi-view sensing problem can be naturally cast into a conditional generation framework. To this end, we design a bipartite neural network architecture: the first part uses an elaborately designed encoder to fuse the latent target features embedded in the multi-view CSI, and the second uses the fused features as conditioning inputs to a powerful generative model that guides the target's reconstruction. Specifically, the encoder is designed to capture the physical correlation between the CSI and the target and to adapt to the number and positions of BS-UE pairs. The view-specific nature of the CSI is accommodated by a spatial positional embedding scheme that exploits the structure of electromagnetic (EM) wave propagation channels. Finally, a conditional diffusion model with a weighted loss is employed to generate the target's point cloud from the fused features. Extensive numerical results demonstrate that the proposed generative multi-view (Gen-MV) sensing framework offers excellent flexibility and significantly improves the reconstruction quality of the target's shape and EM properties.
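The pipeline described above can be sketched roughly as follows. This is a minimal illustrative skeleton, not the paper's actual design: the sinusoidal embedding, mean-pooling fusion, feature dimensions, and placeholder denoiser are all assumptions standing in for the elaborately designed encoder and trained diffusion network.

```python
import numpy as np

rng = np.random.default_rng(0)

def pos_embed(xyz, n_freq=2):
    """Sinusoidal embedding of a 3-D BS/UE position (hypothetical stand-in
    for the paper's EM-propagation-aware spatial positional embedding)."""
    freqs = 2.0 ** np.arange(n_freq)                         # (n_freq,)
    angles = np.outer(freqs, xyz).ravel()                    # (n_freq*3,)
    return np.concatenate([np.sin(angles), np.cos(angles)])  # (n_freq*6,)

def fuse_views(csi_feats, bs_pos, ue_pos):
    """Permutation-invariant fusion over an arbitrary number of BS-UE pairs:
    each view's latent CSI feature is tagged with its pair geometry and then
    mean-pooled, so the fusion adapts to the number/positions of views."""
    tagged = [np.concatenate([f, pos_embed(b), pos_embed(u)])
              for f, b, u in zip(csi_feats, bs_pos, ue_pos)]
    return np.mean(tagged, axis=0)

def ddpm_reverse_step(x_t, t, cond, betas, eps_net):
    """One conditional DDPM reverse step: eps_net predicts the noise given
    the noisy point cloud, the timestep, and the fused CSI condition."""
    beta = betas[t]
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    eps = eps_net(x_t, t, cond)
    mean = (x_t - beta / np.sqrt(1.0 - alpha_bar) * eps) / np.sqrt(1.0 - beta)
    if t == 0:
        return mean
    return mean + np.sqrt(beta) * rng.standard_normal(x_t.shape)

# Toy usage: 3 views, 64-dim per-view CSI features, a 128-point cloud in R^3.
feats = [rng.standard_normal(64) for _ in range(3)]
bs = [rng.standard_normal(3) for _ in range(3)]
ue = [rng.standard_normal(3) for _ in range(3)]
cond = fuse_views(feats, bs, ue)                # (64 + 12 + 12,) = (88,)

betas = np.linspace(1e-4, 0.02, 10)
eps_net = lambda x, t, c: 0.1 * x + 0.01 * c.mean()  # placeholder denoiser
x = rng.standard_normal((128, 3))               # start from pure noise
for t in reversed(range(len(betas))):
    x = ddpm_reverse_step(x, t, cond, betas, eps_net)
```

In a real implementation, `eps_net` would be the trained conditional denoiser (optimized with the weighted diffusion loss the abstract mentions), and the mean-pooled fusion would be replaced by the paper's learned encoder; the mean-pooling is shown only because it makes adaptivity to the number of BS-UE pairs explicit.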