This study aimed to understand how user domain knowledge and artificial intelligence (AI) literacy affect the effective use of a human-AI interactive building energy management system (BEMS). While prior studies have investigated the potential of integrating large language models (LLMs) into BEMS or building energy modeling, very few have examined how users interact with such systems. We conducted a systematic role-playing experiment in which 85 human subjects interacted with an advanced generative pre-trained transformer (OpenAI GPT-4o) that functioned as an LLM-integrated BEMS. Participants were tasked with using the GPT model to identify the top five behavioral changes that could reduce home energy use. The collected prompt-response data and participant conclusions were then analyzed using an analytical framework that hierarchically assessed and scored the human-AI interactions and the participants' home energy analysis approaches. Participants were also classified into four groups based on their self-evaluated domain knowledge of building energy use and AI literacy, and Kruskal-Wallis H tests with post-hoc pairwise comparisons were conducted across 20 quantifiable metrics. Key takeaways include the following. Most participants employed concise prompts (median: 16.2 words) and relied heavily on GPT's analytical capabilities. Notably, only 1 of the 20 metrics, appliance identification rate, showed statistically significant group differences (p=0.037), and that difference was driven by AI literacy rather than domain knowledge, suggesting an equalizing effect of LLMs across expertise levels. This study provides foundational insights into human-AI collaboration dynamics, identifies promising development directions for LLM-integrated BEMS, and contributes to realizing human-centric LLM-integrated energy systems.
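The group comparison described in the abstract (a Kruskal-Wallis H test across four participant groups) can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the group labels, and the numbers are hypothetical and synthetic, chosen only to show the mechanics of the omnibus test for one metric. The p-value uses the closed-form chi-square survival function for 3 degrees of freedom, which is the large-sample approximation for four groups. The post-hoc pairwise step is left out because the abstract does not say which procedure was used.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum)^2 / group size over all groups.
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over tied-value counts t.
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


def chi2_sf_df3(x):
    """Survival function of the chi-square distribution with 3 df."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Synthetic, made-up scores for one metric in four hypothetical groups
# (domain knowledge x AI literacy). These are NOT data from the study.
synthetic_groups = {
    "low_domain_low_ai": [0.40, 0.55, 0.35, 0.50, 0.45],
    "low_domain_high_ai": [0.70, 0.65, 0.80, 0.60, 0.75],
    "high_domain_low_ai": [0.50, 0.45, 0.60, 0.40, 0.55],
    "high_domain_high_ai": [0.75, 0.85, 0.70, 0.65, 0.80],
}

h_stat = kruskal_wallis_h(list(synthetic_groups.values()))
p_value = chi2_sf_df3(h_stat)
print(f"H = {h_stat:.3f}, approximate p = {p_value:.4f}")
```

In practice a statistics library would normally handle this step; the hand-written version is shown only to keep the ranking and tie correction visible. With small groups the chi-square approximation is rough, so the printed p-value should be read as indicative.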