Prompt Injection Papers - 专知

Prompt Injection

KVEraser: Learning to Steer KV Cache for Efficient Localized Context Erasing

Arxiv

0+ reads · June 15

Defending against Adaptive Prompt Injection Attacks via Reasoning-enabled Task Alignment

Arxiv

0+ reads · June 13

Send a SCOUT First: Pre-hoc Reasoning for Adaptive Detector Allocation in Prompt-Injection Defense

Arxiv

0+ reads · June 14

From ASR to ASP: Evaluating Prompt Attack Vulnerabilities Against Open-Source LLMs

Arxiv

0+ reads · June 14

Cordyceps: Covert Control Attacks on LLMs via Data Poisoning

Arxiv

0+ reads · June 15

MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks

Arxiv

0+ reads · June 14

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

Arxiv

0+ reads · June 10

It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents

Arxiv

0+ reads · June 4

Device Context Protocol: A Compact, Safety-First Architecture for LLM-Driven Control of Constrained Devices

Arxiv

0+ reads · May 24

No Hidden Prompts Needed! You Can Game AI Peer Review with Presentation-Only Revisions

Arxiv

0+ reads · June 11

MalSkillBench: A Runtime-Verified Benchmark of Malicious Agent Skills

Arxiv

0+ reads · June 9

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

Arxiv

0+ reads · May 28

Assessing Automated Prompt Injection Attacks in Agentic Environments

Arxiv

0+ reads · June 9

Investigating Detection and Obfuscation of Prompt Injection Attacks Against Software Reverse Engineering AI Agents

Arxiv

0+ reads · May 29

GitInject: Real-World Prompt Injection Attacks in AI-Powered CI/CD Pipelines

Arxiv

0+ reads · June 7
