Detecting fa\c{c}ade elements on buildings, such as doors, windows, balconies, air conditioning units, billboards, and glass curtain walls, is a critical step in automating Building Information Modeling (BIM). However, detection accuracy is hampered by three challenges: the uneven distribution of fa\c{c}ade elements, the presence of small objects, and substantial background noise. To address these issues, we develop the BFA-YOLO model and the BFA-3D dataset. BFA-YOLO is an architecture designed specifically for analyzing multi-view images of fa\c{c}ade attachments. It integrates three novel components: the Feature Balanced Spindle Module (FBSM), which tackles uneven object distribution; the Target Dynamic Alignment Task Detection Head (TDATH), which enhances the detection of small objects; and the Position Memory Enhanced Self-Attention Mechanism (PMESA), which reduces the impact of background noise. Together, these components address the three challenges, improving model robustness and detection precision. The BFA-3D dataset offers multi-view images with precise annotations across a wide range of fa\c{c}ade attachment categories. It is designed to address the limitations of existing fa\c{c}ade detection datasets, which often feature a single perspective and insufficient category coverage. In comparative experiments, BFA-YOLO improves mAP$_{50}$ by 1.8\% and 2.9\% on the BFA-3D dataset and the public Fa\c{c}ade-WHU dataset, respectively, relative to the baseline YOLOv8 model. These results highlight the superior performance of BFA-YOLO in fa\c{c}ade element detection and its potential to advance intelligent BIM technologies.