Large language models can already query databases, yet most existing systems remain reactive: they rely on explicit user prompts and do not actively explore data. We introduce DAR (Data Agnostic Researcher), a multi-agent system that performs end-to-end database research without human-initiated queries. DAR orchestrates specialized AI agents across three layers: initialization (intent inference and metadata extraction), execution (SQL- and AI-based query synthesis with iterative validation), and synthesis (report generation with built-in quality control). All reasoning is executed directly inside BigQuery using native generative AI functions, eliminating data movement and preserving data governance. On a realistic asset-incident dataset, DAR completes the full analytical task in 16 minutes, compared to 8.5 hours for a professional analyst (approximately 32x faster), while producing useful pattern-based insights and evidence-grounded recommendations. Although human experts continue to offer deeper contextual interpretation, DAR excels at rapid exploratory analysis. Overall, this work shifts database interaction from query-driven assistance toward autonomous, research-driven exploration within cloud data warehouses.
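The reported speedup can be checked with a short calculation. The sketch below uses only the two timings stated in the abstract (8.5 hours for the analyst, 16 minutes for DAR); the function name `speedup` and the variable names are illustrative choices, not part of DAR.

```python
def speedup(baseline_hours: float, system_minutes: float) -> float:
    """Return how many times faster the system is than the baseline."""
    baseline_minutes = baseline_hours * 60.0  # convert hours to minutes
    return baseline_minutes / system_minutes


analyst_hours = 8.5  # analyst time reported in the abstract
dar_minutes = 16.0   # DAR time reported in the abstract

ratio = speedup(analyst_hours, dar_minutes)
print(f"{ratio:.3f}x")  # 510 / 16 = 31.875, which rounds to 32x
```

The exact ratio is 31.875, consistent with the "approximately 32x" figure.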