Prompt Injection Attack Papers - 专知

Prompt Injection Attacks

Defending against Adaptive Prompt Injection Attacks via Reasoning-enabled Task Alignment

Arxiv

0+ reads · June 13

From ASR to ASP: Evaluating Prompt Attack Vulnerabilities Against Open-Source LLMs

Arxiv

0+ reads · June 14

MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks

Arxiv

0+ reads · June 14

It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents

Arxiv

0+ reads · June 4

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

Arxiv

0+ reads · May 28

Assessing Automated Prompt Injection Attacks in Agentic Environments

Arxiv

0+ reads · June 9

Investigating Detection and Obfuscation of Prompt Injection Attacks Against Software Reverse Engineering AI Agents

Arxiv

0+ reads · May 29

GitInject: Real-World Prompt Injection Attacks in AI-Powered CI/CD Pipelines

Arxiv

0+ reads · June 7

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

Arxiv

0+ reads · June 10

Misleading Large Language Models used (or misused) in Scientific Peer-Reviewing via Hidden Prompt-Injection Attacks

Arxiv

0+ reads · March 30

Prompt Injection as Role Confusion

Arxiv

0+ reads · March 20

Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks

Arxiv

0+ reads · March 31

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

Arxiv

0+ reads · March 11

AlignSentinel: Alignment-Aware Detection of Prompt Injection Attacks

Arxiv

0+ reads · February 21

Bypassing AI Control Protocols via Agent-as-a-Proxy Attacks

Arxiv

0+ reads · February 25
