Buildings generate heterogeneous data across their lifecycle, yet integrating these data remains a critical unsolved challenge. Despite three decades of standardization efforts, over 40 metadata schemas now span the building lifecycle, and fragmentation is accelerating rather than resolving. Current approaches rely either on point-to-point mappings, whose number grows quadratically with the number of schemas, or on universal ontologies that become unwieldy monoliths. The fundamental gap is the absence of mathematical foundations for structure-preserving transformations across heterogeneous building data. Here we show that category theory provides these foundations, enabling systematic data integration with $O(n)$ specification complexity for $n$ ontologies. We formalize building ontologies as first-order theories and demonstrate two proof-of-concept implementations in the Categorical Query Language (CQL): 1) generating BRICK models from IFC design data at commissioning, and 2) a three-way integration of IFC, BRICK, and RealEstateCore in which only two explicit mappings are specified and the third follows automatically through categorical composition. Our correct-by-construction approach treats property sets as first-class schema entities, provides automated bidirectional migrations, and enables cross-ontology queries. These results establish the feasibility of categorical methods for building data integration and suggest a path toward an app ecosystem for buildings, in which mathematical foundations enable reliable component integration analogous to that of smartphone platforms.
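The composition and scaling claims can be illustrated with a small sketch. It is not the CQL implementation: CQL mappings are functors between schemas, whereas this sketch reduces a mapping to a plain dictionary between class names. The class correspondences, the helper names (`compose`, `pairwise_mappings`, `chain_mappings`), and the assumption that the $O(n)$ figure comes from specifying mappings along a chain of ontologies are all illustrative assumptions, not taken from the paper.

```python
def compose(f, g):
    """Apply mapping f, then mapping g; keep only names that g can translate."""
    return {src: g[mid] for src, mid in f.items() if mid in g}


# Hypothetical, heavily simplified class correspondences (assumptions for this sketch).
ifc_to_brick = {
    "IfcBuilding": "Building",
    "IfcBuildingStorey": "Floor",
    "IfcSpace": "Room",
}
brick_to_rec = {
    "Building": "Building",
    "Floor": "Level",
    "Room": "Room",
}

# The third mapping is derived by composition, not written by hand.
ifc_to_rec = compose(ifc_to_brick, brick_to_rec)


def pairwise_mappings(n):
    """Number of unordered schema pairs, i.e. point-to-point mappings to specify."""
    return n * (n - 1) // 2


def chain_mappings(n):
    """Mappings to specify when the ontologies are linked in a chain and the rest are composed."""
    return n - 1


if __name__ == "__main__":
    print(ifc_to_rec)
    print(pairwise_mappings(40), chain_mappings(40))
```

Under these assumptions, 40 schemas would need 780 hand-written point-to-point mappings but only 39 chained ones, with the remaining translations obtained by composition.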