Wireframe parsing aims to recover line segments and their junctions, forming a structured geometric representation useful for downstream tasks such as Simultaneous Localization and Mapping (SLAM). Existing methods predict lines and junctions separately and reconcile them post hoc, which causes mismatches and reduces robustness. We present Co-PLNet, a point-line collaborative framework that exchanges spatial cues between the two tasks. A Point-Line Prompt Encoder (PLP-Encoder) converts early detections into spatial prompts by encoding their geometric attributes into compact, spatially aligned maps. A Cross-Guidance Line Decoder (CGL-Decoder) then refines the predictions with sparse attention conditioned on the complementary prompts, enforcing point-line consistency while remaining efficient. Experiments on the Wireframe and YorkUrban datasets show consistent improvements in accuracy and robustness, together with favorable real-time efficiency, demonstrating the effectiveness of Co-PLNet for structured geometry perception.
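To make the idea of encoding early detections into spatially aligned maps more concrete, the following is a minimal sketch, not the PLP-Encoder itself. The function name `rasterize_prompts`, the four-channel layout, and the choice of attributes (occupancy and orientation) are all assumptions made for illustration; the abstract does not specify which geometric attributes are encoded or how.

```python
import numpy as np


def rasterize_prompts(junctions, lines, height, width):
    """Hypothetical helper: turn early detections into a (4, H, W) prompt map.

    Channel layout (an assumption for illustration, not taken from the paper):
      0: junction occupancy
      1: line occupancy
      2: cosine of the line orientation at occupied line pixels
      3: sine of the line orientation at occupied line pixels
    """
    prompts = np.zeros((4, height, width), dtype=np.float32)

    # Mark each junction (x, y) at its nearest pixel, skipping out-of-bounds points.
    for x, y in junctions:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width:
            prompts[0, row, col] = 1.0

    # Sample points along each segment (x1, y1, x2, y2) and write its attributes.
    for x1, y1, x2, y2 in lines:
        dx, dy = x2 - x1, y2 - y1
        length = float(np.hypot(dx, dy))
        if length == 0.0:
            continue  # degenerate segment, nothing to draw
        theta = np.arctan2(dy, dx)
        ts = np.linspace(0.0, 1.0, int(np.ceil(length)) + 1)
        cols = np.rint(x1 + ts * dx).astype(int)
        rows = np.rint(y1 + ts * dy).astype(int)
        keep = (rows >= 0) & (rows < height) & (cols >= 0) & (cols < width)
        rows, cols = rows[keep], cols[keep]
        prompts[1, rows, cols] = 1.0
        prompts[2, rows, cols] = np.cos(theta)
        prompts[3, rows, cols] = np.sin(theta)

    return prompts


# Toy input: one horizontal segment whose endpoints are also junctions.
prompts = rasterize_prompts(
    junctions=[(2, 2), (6, 2)],
    lines=[(2, 2, 6, 2)],
    height=8,
    width=8,
)
print(prompts.shape)
```

Because the point channel and the line channels share one pixel grid, a downstream module could read point cues when refining lines and line cues when refining points. How Co-PLNet actually conditions its sparse attention on such prompts is not described in the abstract, so that step is left out of this sketch.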