Genomics data is essential in biological and medical domains, and bioinformatics analysts often manually create circos plots to analyze the data and extract valuable insights. However, creating circos plots is complex, as it requires careful design of multiple track attributes and the positional relationships between them. Analysts typically seek inspiration from existing circos plots and must iteratively adjust and refine a plot to reach a satisfactory final design, making the process both tedious and time-intensive. To address these challenges, we propose IntelliCircos, an AI-powered interactive authoring tool that streamlines the process from initial visual design to the final implementation of circos plots. Specifically, we build a new dataset containing 4396 circos plots with corresponding annotations and configurations, which are extracted and labeled from published papers. With the dataset, we further identify track combination patterns and use a Large Language Model (LLM) to provide domain-specific design recommendations and configuration references that guide the design of circos plots. We conduct a user study with 8 bioinformatics analysts to evaluate IntelliCircos, and the results demonstrate its usability and effectiveness in authoring circos plots.