FoldingNet autoencoder model to create a geospatial grouping of a CityGML building dataset

Explainable numerical representations, or latent information, of otherwise complex datasets are more convenient to analyze and study. These representations assist in identifying clusters and outliers, assessing similar data points, and exploring and interpolating data. A dataset of three-dimensional (3D) building models is inherently complex: footprint shapes, roof types, walls, height, and volume all vary. Traditionally, grouping similar buildings or 3D shapes requires matching their known properties and shape metrics against each other, which in turn requires obtaining a large number of such properties to calculate similarity. This study, in contrast, uses an autoencoder to encode shape information as a fixed-size vector that can be compared and grouped with the help of distance metrics. The study uses 'FoldingNet,' a 3D autoencoder, to generate the latent representation of each building from the obtained LoD 2 CityGML dataset. The efficacy of the embeddings obtained from the autoencoder is further analyzed by dataset reconstruction, latent spread visualization, and hierarchical clustering. While the clusters give an overall perspective of the types of built form, they do not include geospatial information. A geospatial model is therefore created to iteratively find geographical groupings of buildings using the cosine similarity between embedding vectors. The German federal states of Brandenburg and Berlin are taken as an example to test the methodology. The output provides a detailed overview of the built forms as semantic topological clusters and geographical groupings. This approach is beneficial and scalable for complex analytics, e.g., in large urban simulations, urban morphological studies, energy analysis, or evaluations of building stock.
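
The abstract does not spell out how the iterative geographical grouping works, so the following is only a minimal sketch of one possible reading: greedy region growing, in which a building joins a group when it lies within a search radius of a group member and its embedding is sufficiently similar by cosine similarity. The function name `group_buildings`, the `radius` and `threshold` parameters, and the toy two-dimensional embeddings are all illustrative assumptions, not the study's actual model or settings.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def group_buildings(embeddings, coords, radius=50.0, threshold=0.9):
    """Hypothetical region growing: assign a group label to every building.

    embeddings: (n, d) array of latent vectors, e.g. from an autoencoder.
    coords:     (n, 2) array of building positions in a metric CRS.
    radius, threshold: assumed tuning parameters, not values from the study.
    """
    n = len(embeddings)
    labels = np.full(n, -1, dtype=int)  # -1 marks "not yet grouped"
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            # Ungrouped buildings within the search radius of building i.
            dists = np.linalg.norm(coords - coords[i], axis=1)
            candidates = np.flatnonzero((dists <= radius) & (labels == -1))
            for j in candidates:
                if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data: two shape types along a street, plus one distant building.
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.0]])
coords = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0], [30.0, 0.0], [1000.0, 0.0]])
labels = group_buildings(embeddings, coords)
print(labels.tolist())
```

In this toy run the last building has the same embedding as the first but lies far outside the radius, so it ends up in its own group; this is the difference between a purely latent-space cluster and a geographical grouping.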
