Historic urban quarters are increasingly shaped by tourism and lifestyle consumption, yet planners often lack scalable evidence on what visitors notice, prefer, and criticize in these environments. This study proposes an AI-based, multimodal framework to decode tourist perception by combining visual attention, color-based aesthetic representation, and multidimensional satisfaction. For 12 historic urban quarters in Shanghai, we collect geotagged photos and review texts from a major Chinese platform and assemble a street view image set as a baseline for comparison. We train a semantic segmentation model to quantify foregrounded visual elements in tourist-shared imagery, extract and compare color palettes between social media photos and street views, and apply a multi-task sentiment classifier to assess satisfaction across four experience dimensions: activity, physical setting, supporting services, and commercial offerings. Results show that tourist photos systematically foreground key streetscape elements and that the color composition represented on social media can differ from on-site street views, indicating a perception-reality gap that varies by quarter. The framework offers an interpretable and transferable approach to diagnose such gaps and to inform heritage management and visitor-oriented urban design.