StreetLens: Enabling Human-Centered AI Agents for Neighborhood Assessment from Street View Imagery

Traditionally, neighborhood studies have used interviews, surveys, and manual image annotation guided by detailed protocols to identify environmental characteristics, including physical disorder, decay, street safety, and sociocultural symbols, and to examine their impact on developmental and health outcomes. Although these methods yield rich insights, they are time-consuming and require intensive expert intervention. Recent technological advances, including vision language models (VLMs), have begun to automate parts of this process; however, existing efforts are often ad hoc and lack adaptability across research designs and geographic contexts. In this paper, we present StreetLens, a user-configurable human-centered workflow that integrates relevant social science expertise into a VLM for scalable neighborhood environmental assessments. StreetLens mimics the process of trained human coders by focusing the analysis on questions derived from established interview protocols, retrieving relevant street view imagery (SVI), and generating a wide spectrum of semantic annotations from objective features (e.g., the number of cars) to subjective perceptions (e.g., the sense of disorder in an image). By enabling researchers to define the VLM's role through domain-informed prompting, StreetLens places domain knowledge at the core of the analysis process. It also supports the integration of prior survey data to enhance robustness and expand the range of characteristics assessed in diverse settings. StreetLens represents a shift toward flexible and agentic AI systems that work closely with researchers to accelerate and scale neighborhood studies. StreetLens is publicly available at https://knowledge-computing.github.io/projects/streetlens.
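To make the workflow described above concrete, here is a minimal sketch of how protocol-derived questions, a researcher-defined role, and optional prior survey notes might be combined into prompts and applied to a set of street view images. It is not the StreetLens implementation: `ProtocolQuestion`, `build_prompt`, `annotate`, and the `vlm` callable are hypothetical names introduced for illustration, and the model call is left as a plug-in function so that no particular VLM API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ProtocolQuestion:
    """One item from a coding protocol (hypothetical structure)."""
    qid: str
    text: str
    kind: str  # "objective" (e.g. car count) or "subjective" (e.g. disorder)


def build_prompt(role: str, question: ProtocolQuestion,
                 survey_note: Optional[str] = None) -> str:
    """Compose a domain-informed prompt for one question about one image."""
    parts = [
        f"Role: {role}",
        f"Question ({question.kind}): {question.text}",
    ]
    if survey_note:
        # Prior survey data is optional context, per the abstract.
        parts.append(f"Prior survey context: {survey_note}")
    parts.append("Answer using only what is visible in the image.")
    return "\n".join(parts)


def annotate(image_ids: Iterable[str],
             questions: Iterable[ProtocolQuestion],
             role: str,
             vlm: Callable[[str, str], str],
             survey_notes: Optional[Dict[str, str]] = None
             ) -> Dict[Tuple[str, str], str]:
    """Ask every question about every image.

    `vlm` is an assumed callable taking (image_id, prompt) and returning text.
    """
    notes = survey_notes or {}
    question_list = list(questions)
    results: Dict[Tuple[str, str], str] = {}
    for image_id in image_ids:
        for q in question_list:
            prompt = build_prompt(role, q, notes.get(q.qid))
            results[(image_id, q.qid)] = vlm(image_id, prompt)
    return results
```

Keeping the role, the questions, and the survey notes as plain inputs mirrors the user-configurable design the abstract describes: a researcher could swap in a different protocol or model without touching the loop.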

