Recent digitisation efforts in natural history museums have produced large volumes of collection data, yet the scale and scientific complexity of these data often hinder public access and understanding. Conventional data management tools, such as databases, limit exploration to keyword-based search or require specialised schema knowledge. This paper presents a system design that uses conversational AI to query nearly 1.7 million digitised specimen records from the life-science collections of the Australian Museum. Designed and developed through a human-centred design process, the system combines an interactive map for visual-spatial exploration with a natural-language conversational agent that retrieves detailed specimen data and answers collection-specific questions. The system leverages the function-calling capabilities of contemporary large language models to dynamically retrieve structured data from external APIs, enabling fast, real-time interaction with extensive and frequently updated datasets. Our work provides a new approach to connecting large museum collections with natural-language queries and informs future designs of scientific AI agents for natural history museums.
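The function-calling pattern mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tool name `search_specimens`, its parameters, the `dispatch` helper, and the in-memory `SPECIMENS` records are all hypothetical stand-ins for whatever external API and schema the real system uses. The sketch only shows the general flow, in which the model emits a tool name with JSON arguments, the application executes the matching function, and the structured result is returned to the model as JSON.

```python
import json
from typing import Any, Callable, Dict, List, Optional

# Hypothetical stand-in for an external specimen API (records invented for illustration).
SPECIMENS: List[Dict[str, str]] = [
    {"id": "S1", "scientific_name": "Phascolarctos cinereus", "state": "NSW"},
    {"id": "S2", "scientific_name": "Phascolarctos cinereus", "state": "QLD"},
    {"id": "S3", "scientific_name": "Ornithorhynchus anatinus", "state": "NSW"},
]


def search_specimens(scientific_name: str, state: Optional[str] = None) -> List[Dict[str, str]]:
    """Return records matching a scientific name, optionally filtered by state."""
    return [
        record for record in SPECIMENS
        if record["scientific_name"].lower() == scientific_name.lower()
        and (state is None or record["state"] == state)
    ]


# Tool description in the JSON Schema style that function-calling models commonly accept.
# The exact wrapper format differs between model providers; this shape is an assumption.
TOOL_SPECS = [{
    "name": "search_specimens",
    "description": "Search digitised specimen records by scientific name.",
    "parameters": {
        "type": "object",
        "properties": {
            "scientific_name": {"type": "string"},
            "state": {"type": "string"},
        },
        "required": ["scientific_name"],
    },
}]

# Registry mapping tool names (as the model would emit them) to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {"search_specimens": search_specimens}


def dispatch(tool_call: Dict[str, str]) -> str:
    """Execute a model-emitted tool call and return the result as a JSON string."""
    func = TOOLS.get(tool_call["name"])
    if func is None:
        return json.dumps({"error": "unknown tool: " + tool_call["name"]})
    arguments = json.loads(tool_call["arguments"])
    return json.dumps(func(**arguments))


# Simulated tool call, as a model might emit for "Which koala specimens are from NSW?"
call = {
    "name": "search_specimens",
    "arguments": '{"scientific_name": "Phascolarctos cinereus", "state": "NSW"}',
}
result = dispatch(call)
print(result)
```

Because the data is fetched at query time rather than stored in the model, a design along these lines would keep answers in step with a frequently updated collection, which is the benefit the abstract attributes to function calling.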