Recently released open-source, pre-trained foundation models for image segmentation and object detection (SAM2+GroundingDINO) enable geometrically consistent segmentation of objects of interest across multi-view 2D images. Users can segment these objects with text-based or click-based prompts, without labeled training datasets. Gaussian Splatting learns a 3D representation of a scene's geometry and radiance from 2D images. By combining Google Earth Studio, SAM2+GroundingDINO, 2D Gaussian Splatting, and our improvements to mask refinement based on morphological operations and contour simplification, we built a pipeline that extracts the 3D mesh of any building from its name, address, or geographic coordinates.
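To make the mask-refinement idea concrete, the following is a minimal, dependency-free sketch of the two named ingredients: morphological cleanup of a binary mask and contour simplification. It is an illustration under stated assumptions, not the pipeline's actual implementation. The 3x3 structuring element, the closing-then-opening order, the use of the Douglas-Peucker algorithm for simplification, and all function names (`dilate`, `erode`, `refine_mask`, `simplify`) are assumptions chosen for the example.

```python
from math import hypot

# Offsets of a 3x3 square structuring element (an assumed choice).
NEIGHBORS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def _apply(mask, reduce_fn):
    """Apply any/all over the in-bounds 3x3 neighborhood of every pixel."""
    h, w = len(mask), len(mask[0])
    return [[int(reduce_fn(mask[y + dy][x + dx] for dy, dx in NEIGHBORS
                           if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]

def dilate(mask):
    """Binary dilation: a pixel is set if any neighbor is set."""
    return _apply(mask, any)

def erode(mask):
    """Binary erosion: a pixel is kept only if all neighbors are set."""
    return _apply(mask, all)

def refine_mask(mask):
    """Closing (fills small holes), then opening (removes small speckles)."""
    closed = erode(dilate(mask))
    return dilate(erode(closed))

def simplify(points, eps):
    """Douglas-Peucker simplification of an open polyline of (x, y) points."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    seg = hypot(x2 - x1, y2 - y1)

    def dist(p):
        # Perpendicular distance from p to the line through the endpoints.
        if seg == 0:
            return hypot(p[0] - x1, p[1] - y1)
        return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / seg

    idx = max(range(1, len(points) - 1), key=lambda i: dist(points[i]))
    if dist(points[idx]) <= eps:
        # Every interior point is within eps: keep only the endpoints.
        return [points[0], points[-1]]
    # Otherwise split at the farthest point and recurse on both halves.
    left = simplify(points[: idx + 1], eps)
    return left[:-1] + simplify(points[idx:], eps)
```

Running `refine_mask` on a mask with a one-pixel hole and a one-pixel speckle fills the former and removes the latter, and `simplify` collapses near-collinear contour vertices so that a building outline keeps only its corners. In practice an image-processing library would likely replace these loops; OpenCV, for example, provides `cv2.morphologyEx` and `cv2.approxPolyDP` for the same operations.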