GPT-5 Papers - 专知

GPT-5

The African Language Tax: Quantifying the Cost, Latency, and Context Penalty of Tokenizing African Languages in Frontier LLMs

arXiv

0+ reads · June 23

Who Checks the Citations? Benchmarking Legal Hallucination Detection

arXiv

0+ reads · June 19

Confident and Wrong: Silent Semantic Failures in Coding Agents

arXiv

0+ reads · June 21

Do We Still Need Humans in the Loop? Comparing Human and LLM Annotation in Active Learning for Hostility Detection

arXiv

0+ reads · June 16

ACCORD: Action-Conditioned Contextual Grounding for Language Agents

arXiv

0+ reads · June 15

HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools

arXiv

0+ reads · June 12

PerspectiveGap: A Benchmark for Multi-Agent Orchestration Prompting

arXiv

0+ reads · June 7

SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios

arXiv

0+ reads · May 22

Towards Retrieving Interaction Spaces for Agentic Search

arXiv

0+ reads · June 5

Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis

arXiv

0+ reads · May 12

Can LLMs Perform Synthesis?

arXiv

0+ reads · March 13

Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

arXiv

0+ reads · April 3

Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

arXiv

0+ reads · April 7

Can "AI" Be a Doctor? A Study of Empathy, Readability, and Alignment in Clinical LLMs

arXiv

0+ reads · April 22

Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind

arXiv

0+ reads · April 13
