Automatic chart-to-text summarization is an effective tool for visually impaired people and also gives users precise natural-language insights into tabular data. A large, well-structured dataset is a key requirement for data-driven models. In this paper, we propose ChartSumm, a large-scale benchmark dataset of 84,363 charts with their metadata and descriptions, covering a wide range of topics and chart types, for generating short and long summaries. Extensive experiments with strong baseline models show that, although these models generate fluent and informative summaries and achieve decent scores on various automatic evaluation metrics, they often hallucinate, miss important data points, and explain complex trends in the charts incorrectly. We also investigate the potential of expanding ChartSumm to other languages using automated translation tools. These findings make our dataset a challenging benchmark for future research.