In this paper, we present a multimodal dataset obtained from a honey bee colony in Montréal, Quebec, Canada, spanning 2021 to 2022. The apiary comprised 10 beehives, with microphones recording more than 2000 hours of high-quality raw audio and sensors capturing temperature and humidity. Periodic hive inspections monitored changes in colony population, assessed queen-related conditions, and documented overall hive health. Additionally, health metrics such as Varroa mite infestation rates and winter mortality assessments were recorded, offering insight into factors affecting hive health and resilience. We first outline the data collection process, the sensor data, and the dataset structure. We then demonstrate a practical application of the dataset by extracting features from the raw audio to predict colony population, using the number of frames of bees as a proxy.