This paper explores the use of generative AI as a mapping assistant to make collaborative mapping more efficient. We present the results of an experiment that combines multiple sources of volunteered geographic information (VGI) with large language models (LLMs). Three analysts described the content of crowdsourced Mapillary street-level photographs taken along roads in a small test area in Miami, Florida. GPT-3.5-turbo was then instructed to suggest the most appropriate tagging for each road in OpenStreetMap (OSM). The study also explores the use of BLIP-2, a state-of-the-art multimodal pre-training method, as an artificial analyst of street-level photographs alongside the human analysts. The results demonstrate two ways to increase the accuracy of mapping suggestions without modifying the underlying AI models: (1) providing a more detailed description of the source photographs, and (2) combining prompt engineering with additional context (e.g., location and objects detected along a road). The first approach increases suggestion accuracy by up to 29%, and the second by up to 20%.
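The second approach, combining prompt engineering with additional context, can be illustrated with a minimal sketch. The function name `build_tagging_prompt`, its parameters, and the prompt wording are all assumptions made for illustration; they are not the prompts used in the study, and the sketch only assembles the text that would be sent to a model such as GPT-3.5-turbo.

```python
from typing import Optional, Sequence


def build_tagging_prompt(
    descriptions: Sequence[str],
    location: Optional[str] = None,
    detected_objects: Optional[Sequence[str]] = None,
) -> str:
    """Assemble a hypothetical prompt asking an LLM for an OSM road tag.

    descriptions: analyst (or image-captioning model) descriptions of
        street-level photographs taken along one road.
    location, detected_objects: optional extra context; the wording used
        to present them is an assumption, not taken from the paper.
    """
    lines = [
        "Suggest the most appropriate OpenStreetMap tagging for the road "
        "shown in the photographs described below.",
        "Photo descriptions:",
    ]
    # One bullet per photograph description.
    lines += [f"- {d.strip()}" for d in descriptions]

    # Additional context is appended only when it is available.
    if location:
        lines.append(f"Location: {location}")
    if detected_objects:
        lines.append(
            "Objects detected along the road: " + ", ".join(detected_objects)
        )

    lines.append("Answer with a single tag in the form key=value.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_tagging_prompt(
            ["A two-lane street with parked cars on both sides."],
            location="Miami, Florida",
            detected_objects=["crosswalk", "traffic light"],
        )
    )
```

Keeping the context fields optional makes it straightforward to compare suggestions produced with and without the additional context, which is the kind of comparison the abstract reports.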