Tags: adversarial · skeleton · perturbation · mathematics · recognition

Skeletonization-Based Adversarial Perturbations on Large Vision Language Model's Mathematical Text Recognition


Masatomo Yoshida, Haruto Namura, Nicola Adami, Masahiro Okuda

From arXiv; accepted to ITC-CSCC 2025

This work explores the visual capabilities and limitations of foundation models by introducing a novel adversarial attack method that uses skeletonization to reduce the search space effectively. Our approach specifically targets images containing text, particularly mathematical formula images, which are more challenging because the output must be converted to LaTeX and the formulas have an intricate structure. We conduct a detailed evaluation of both character-level and semantic changes between the original and adversarially perturbed outputs to provide insight into the models' visual interpretation and reasoning abilities. The method's effectiveness is further demonstrated by applying it to ChatGPT, which shows its practical implications in real-world scenarios.
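To make the search-space idea concrete, here is a minimal sketch of how skeletonization can shrink the set of pixels an attack has to consider. The abstract does not say which thinning algorithm the authors use or how the skeleton is fed into the attack, so everything below is an assumption for illustration: the sketch uses the classic Zhang-Suen thinning algorithm on a toy binary stroke, and the helper names (`zhang_suen_thinning`, `foreground_pixels`) are hypothetical.

```python
def zhang_suen_thinning(image):
    """Thin a binary image (rows of 0/1) toward a one-pixel-wide skeleton.

    Assumption: Zhang-Suen thinning stands in for whatever skeletonization
    the paper actually uses.
    """
    height, width = len(image), len(image[0])
    # Pad with a background border so every pixel has eight neighbours.
    grid = [[0] * (width + 2)]
    grid += [[0] + list(row) + [0] for row in image]
    grid += [[0] * (width + 2)]

    def neighbours(y, x):
        # P2..P9: clockwise, starting from the pixel directly above.
        return [grid[y - 1][x], grid[y - 1][x + 1], grid[y][x + 1],
                grid[y + 1][x + 1], grid[y + 1][x], grid[y + 1][x - 1],
                grid[y][x - 1], grid[y - 1][x - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, height + 1):
                for x in range(1, width + 1):
                    if grid[y][x] == 0:
                        continue
                    n = neighbours(y, x)
                    count = sum(n)
                    # Number of 0 -> 1 transitions around the pixel.
                    transitions = sum(
                        n[i] == 0 and n[(i + 1) % 8] == 1 for i in range(8)
                    )
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= count <= 6 and transitions == 1 and ok:
                        to_clear.append((y, x))
            # Clear pixels only after the whole pass, as the algorithm requires.
            for y, x in to_clear:
                grid[y][x] = 0
            changed = changed or bool(to_clear)
    # Drop the padding before returning.
    return [row[1:-1] for row in grid[1:-1]]


def foreground_pixels(image):
    """Return (row, col) coordinates of all non-zero pixels."""
    return [(y, x) for y, row in enumerate(image)
            for x, value in enumerate(row) if value]


# Toy "stroke": a 3-pixel-thick horizontal bar (hypothetical input).
stroke = [[1] * 8 for _ in range(3)]
skeleton = zhang_suen_thinning(stroke)

all_candidates = foreground_pixels(stroke)
skeleton_candidates = foreground_pixels(skeleton)
print(len(all_candidates), "->", len(skeleton_candidates))
```

Under this assumption, an attack would only try perturbations at `skeleton_candidates` instead of at every stroke pixel, so the number of positions to search drops with stroke thickness. How perturbations are then chosen and scored is not specified in the abstract and is left out here.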
