Sandbox Papers - Zhuanzhi

Greed Is Learned: Visible Incentives as Reward-Hacking Triggers

arXiv

0+ reads · June 15

TABX: A High-Throughput Sandbox Battle Simulator for Multi-Agent Reinforcement Learning

arXiv

0+ reads · May 27

Context: Proactive Goal-Directed Intelligence via Composable Sandboxed Programs, Declarative Wiring, and Structured Interaction

arXiv

0+ reads · April 21

YeasierAgent: Agentic Social Sandbox as a Canvas for Intent-Driven Creation of Platform-Agnostic Symbiotic Agent-Native Applications

arXiv

0+ reads · June 11

Bathtubs, Boundaries, and Sandboxes: AI Regulatory Learning under Legal Uncertainty

arXiv

0+ reads · June 8

Governed Capability Evolution for Embodied Agents: Safe Upgrade, Compatibility Checking, and Runtime Rollback for Embodied Capability Modules

arXiv

0+ reads · April 9

ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents

arXiv

0+ reads · March 19

Splits! Flexible Sociocultural Linguistic Investigation at Scale

arXiv

0+ reads · April 9

Agent-Diff: Benchmarking LLM Agents on Enterprise API Tasks via Code Execution with State-Diff-Based Evaluation

arXiv

0+ reads · February 11

Among Us: A Sandbox for Measuring and Detecting Agentic Deception

arXiv

0+ reads · February 10

AgentCgroup: Understanding and Controlling OS Resources of AI Agents

arXiv

0+ reads · February 10

LLM-in-Sandbox Elicits General Agentic Intelligence

arXiv

0+ reads · February 12

TABX: A High-Throughput Sandbox Battle Simulator for Multi-Agent Reinforcement Learning

arXiv

0+ reads · February 2

SandCell: Sandboxing Rust Beyond Unsafe Code

arXiv

0+ reads · January 18

Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities

arXiv

0+ reads · January 29
