This paper explores the use of generative AI as a mapping assistant to make collaborative mapping more efficient. We present the results of an experiment that combines multiple sources of volunteered geographic information (VGI) with large language models (LLMs). Three analysts described the content of crowdsourced Mapillary street-level photographs taken along roads in a small test area in Miami, Florida. GPT-3.5-turbo was then instructed to suggest the most appropriate OpenStreetMap (OSM) tagging for each road. The study also explores using BLIP-2, a state-of-the-art multimodal pre-training method, as an artificial analyst of street-level photographs alongside the human analysts. The results demonstrate two ways to increase the accuracy of mapping suggestions without modifying the underlying AI models: (1) providing a more detailed description of the source photographs, and (2) combining prompt engineering with additional context (e.g. location and objects detected along a road). The first approach increases suggestion accuracy by up to 29%, and the second by up to 20%.
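The second approach, prompt engineering combined with additional context, can be illustrated with a minimal sketch. It is not the prompt used in the study: the function name `build_prompt`, the wording of the instructions, and the candidate tag list are all assumptions made for illustration, and the sketch only assembles the prompt text without calling any model.

```python
from typing import Optional, Sequence

# Illustrative subset of OSM `highway` values (an assumption, not the
# list used in the study).
CANDIDATE_TAGS = ("residential", "service", "tertiary", "secondary", "primary")


def build_prompt(
    descriptions: Sequence[str],
    location: Optional[str] = None,
    detected_objects: Optional[Sequence[str]] = None,
) -> str:
    """Assemble a text prompt asking for one OSM highway tag for a road.

    Hypothetical helper: the optional arguments add the kind of context
    the abstract mentions (location, objects detected along the road).
    """
    lines = [
        "Suggest the most appropriate OpenStreetMap highway tag for this road.",
        "Answer with exactly one of: " + ", ".join(CANDIDATE_TAGS) + ".",
        "Photo descriptions:",
    ]
    # One bullet per analyst description of a street-level photograph.
    lines += [f"- {d}" for d in descriptions]
    if location:
        lines.append(f"Location: {location}")
    if detected_objects:
        lines.append(
            "Objects detected along the road: " + ", ".join(detected_objects)
        )
    return "\n".join(lines)


# Baseline prompt: photo description only.
baseline = build_prompt(["A two-lane street lined with houses."])

# Enriched prompt: a more detailed description plus extra context.
enriched = build_prompt(
    ["A two-lane street lined with houses and parked cars."],
    location="Miami, Florida",
    detected_objects=["stop sign", "crosswalk"],
)
```

Here `baseline` and `enriched` differ only in the inputs supplied, which mirrors the idea of improving suggestions without modifying the underlying model. Either string could then be sent to a chat-completion model of the reader's choice.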