Toward a Machine Bertin: Why Visualization Needs Design Principles for Machine Cognition

Visualization's design knowledge (effectiveness rankings, encoding guidelines, color models, preattentive processing rules) derives from six decades of psychophysical studies of human vision. Yet vision-language models (VLMs) increasingly consume chart images in automated analysis pipelines, and a growing body of benchmark evidence indicates that this human-centered knowledge base does not straightforwardly transfer to machine audiences. Machines exhibit different encoding performance patterns, process images through patch-based tokenization rather than holistic perception, and fail on design patterns that pose no difficulty for humans, while occasionally succeeding where humans struggle. Current approaches address this gap primarily by bypassing vision entirely, converting charts to data tables or structured text. We argue that this response forecloses a more fundamental question: what visual representations would actually serve machine cognition well? This paper makes the case that the visualization field needs to investigate machine-oriented visual design as a distinct research problem. We synthesize evidence from VLM benchmarks, visual reasoning research, and visualization literacy studies to show that the human-machine perceptual divergence is qualitative, not merely quantitative, and we critically examine the prevailing bypassing approach. We propose a conceptual distinction between human-oriented and machine-oriented visualization, not as an engineering architecture but as a recognition that different audiences may require fundamentally different design foundations, and we outline a research agenda for developing the empirical foundations the field currently lacks: the beginnings of a "machine Bertin" to complement the human-centered knowledge the field already possesses.