Urban Physical Disorder (UPD), such as old or abandoned buildings, broken sidewalks, litter, and graffiti, has a negative impact on residents' quality of life. It can also increase crime rates, cause social disorder, and pose a public health risk. Efficient and reliable methods for detecting and understanding UPD are currently lacking. To bridge this gap, we propose UPDExplainer, an interpretable transformer-based framework for UPD detection. We first develop a UPD detection model based on the Swin Transformer architecture, which learns discriminative representations from readily accessible street view images. To provide clear and comprehensible evidence for each prediction, we then introduce a UPD factor identification and ranking module that combines visual explanation maps with semantic segmentation maps. This integrated approach lets us identify the specific objects within a street view image that are responsible for physical disorder and gain insight into the underlying causes. Experiments on the re-annotated Place Pulse 2.0 dataset show promising detection performance, with an accuracy of 79.9%. To evaluate ranking performance, we report mean Average Precision (mAP), R-Precision (RPrec), and Normalized Discounted Cumulative Gain (NDCG), with scores of 75.51%, 80.61%, and 82.58%, respectively. We also present a case study of detecting and ranking physical disorders in the southern region of downtown Los Angeles, California, to demonstrate the practicality and effectiveness of the framework.
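The abstract does not spell out how the visual explanation map and the semantic segmentation map are combined, so the following is only a minimal sketch of one plausible way to do it: each segmentation class is scored by the share of explanation-map mass that falls on its pixels, and classes are ranked by that share. The function name `rank_factors`, the sum-based aggregation, the class names, and the toy arrays are all assumptions made for illustration; they are not taken from the UPDExplainer implementation.

```python
import numpy as np


def rank_factors(heatmap, seg_map, class_names):
    """Rank segmentation classes by their share of explanation-map mass.

    heatmap: 2-D array of non-negative relevance values (hypothetical
        explanation map, e.g. from an attribution method).
    seg_map: 2-D integer array of the same shape; each entry is a class id.
    class_names: list where index i is the name of class id i.

    The aggregation (sum of relevance per class, divided by the total)
    is an assumption for illustration, not the paper's stated method.
    """
    heatmap = np.asarray(heatmap, dtype=float)
    seg_map = np.asarray(seg_map)
    if heatmap.shape != seg_map.shape:
        raise ValueError("heatmap and seg_map must have the same shape")

    total = heatmap.sum()
    scores = {}
    for class_id, name in enumerate(class_names):
        mask = seg_map == class_id
        if mask.any():  # skip classes absent from this image
            share = heatmap[mask].sum() / total if total > 0 else 0.0
            scores[name] = float(share)

    # Highest share first: the class most responsible under this heuristic.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy 3x3 example with made-up values.
heatmap = np.array([[0.9, 0.8, 0.1],
                    [0.7, 0.1, 0.0],
                    [0.0, 0.1, 0.1]])
seg_map = np.array([[0, 0, 1],
                    [0, 1, 2],
                    [2, 1, 1]])
class_names = ["building", "road", "sky"]

ranking = rank_factors(heatmap, seg_map, class_names)
for name, share in ranking:
    print(f"{name}: {share:.3f}")
```

In this toy case most of the relevance lies on the pixels labeled `building`, so that class is ranked first. A sum-based score favors large regions; averaging relevance per pixel instead would favor small, intensely highlighted objects, and which of the two better matches human judgments of disorder is an empirical question the sketch does not settle.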