Exploratory analysis of high-dimensional data relies on embedding the data into a low-dimensional space (typically 2D or 3D), from which a visualization plot is produced to uncover meaningful structures and to communicate the geometric and distributional characteristics of the data. However, finding a suitable algorithm configuration, particularly the hyperparameter settings, that yields a plot which faithfully represents the underlying structure of the data and encourages pattern discovery remains challenging. To address this challenge, we propose an agentic AI pipeline that leverages a large language model (LLM) to bridge the gap between rigorous quantitative assessment and qualitative human insight. By treating visualization evaluation and hyperparameter optimization as a semantic task, our system generates a multi-faceted report that contextualizes hard metrics with descriptive summaries and offers actionable recommendations on algorithm configuration for refining the visualization. By running this process in an iterative optimization loop, the system can rapidly produce a high-quality visualization plot in a fully automated manner.
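As a rough illustration of the iterative loop described in the abstract, the sketch below alternates between producing an embedding, scoring it, packaging the scores into a report, and asking an advisor for a new configuration. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `refine`, `Report`, `toy_embed_and_score`, and `toy_advise` are hypothetical, the `perplexity` hyperparameter and `trustworthiness` metric are chosen only as familiar examples, and the advisor is a rule-based placeholder standing in for the LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


@dataclass
class Report:
    """Pairs hard metrics with a short descriptive summary."""
    params: Params
    metrics: Dict[str, float]
    summary: str


def refine(
    embed_and_score: Callable[[Params], Dict[str, float]],
    advise: Callable[[Report], Params],
    initial: Params,
    key_metric: str,
    max_rounds: int = 10,
) -> Tuple[Report, List[Report]]:
    """Iterate: embed and measure, build a report, ask the advisor for new parameters."""
    history: List[Report] = []
    params = dict(initial)
    for _ in range(max_rounds):
        metrics = embed_and_score(params)
        summary = ", ".join(f"{k}={v:.3f}" for k, v in metrics.items())
        report = Report(dict(params), metrics, summary)
        history.append(report)
        proposal = advise(report)
        if proposal == params:  # advisor suggests no further change
            break
        params = proposal
    # Return the best configuration seen, judged by the chosen metric.
    best = max(history, key=lambda r: r.metrics[key_metric])
    return best, history


# --- Toy stand-ins so the sketch runs without external libraries ---

def toy_embed_and_score(params: Params) -> Dict[str, float]:
    # Hypothetical quality curve that peaks at perplexity = 30 (assumption).
    p = params["perplexity"]
    return {"trustworthiness": 1.0 - abs(p - 30.0) / 100.0}


def toy_advise(report: Report) -> Params:
    # Placeholder for the LLM call: move perplexity toward 30 in steps of 5.
    p = report.params["perplexity"]
    if p < 30.0:
        return {"perplexity": min(p + 5.0, 30.0)}
    if p > 30.0:
        return {"perplexity": max(p - 5.0, 30.0)}
    return dict(report.params)


best, history = refine(
    toy_embed_and_score, toy_advise, {"perplexity": 5.0}, "trustworthiness"
)
```

In a real pipeline, `embed_and_score` would run the embedding algorithm and compute quality metrics on its output, and `advise` would send the report to an LLM and parse the suggested configuration from its reply. Keeping both as plain callables lets either be swapped without touching the loop.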