The building sector plays a crucial role in the worldwide decarbonization effort, accounting for a significant share of energy consumption and environmental impact. However, the scarcity of open data sources is a persistent challenge for built environment researchers and practitioners. Although several efforts have been made to consolidate existing open datasets, no database currently offers a comprehensive collection of building data types covering all subcategories and time granularities (e.g., year, month, and sub-hour). This paper presents the Building Data Genome Directory, an open data-sharing platform that serves as a one-stop shop for the data needed in key categories of building energy research. The directory is an online portal (http://buildingdatadirectory.org/) that allows users to filter and discover valuable datasets. On the spatial scale, it covers meter-level, building-level, and aggregated community-level data; on the temporal scale, it ranges from yearly to minute-level resolution. The datasets were consolidated through a comprehensive exploration of sources, including governments, research institutes, and online energy dashboards. The result is an aggregation of 60 datasets pertaining to building energy ontologies, building energy models, building energy and water data, electric vehicle data, weather data, building information data, text-mining-based research data, image data of buildings, fault detection and diagnosis data, and occupant data. A crowdsourcing mechanism in the platform allows users to suggest datasets for inclusion by filling out an online form. This directory can fuel research and applications in building energy efficiency, an essential step toward addressing the world's energy and environmental challenges.