While large language models (LLMs) have accelerated 2D software development through intent-driven "vibe coding", prototyping intelligent Extended Reality (XR) experiences remains a major challenge. The fundamental barrier is not just the steep learning curve for human creators, but that low-level sensor APIs and complex game engine hierarchies are ill-suited for LLM reasoning, routinely exceeding context windows and inducing syntax hallucinations. To bridge this gap, we contribute XR Blocks, an open-source, LLM-native WebXR framework. Unlike traditional engines, XR Blocks introduces a semantic "Reality Model" that aligns spatial computing primitives (users, physical environments, and agents) with natural language, providing a robust, concise vocabulary optimized for generative AI. Building upon this foundation, we present Vibe Coding XR, an end-to-end prototyping workflow that leverages LLMs to translate high-level prompts (e.g., "create a dandelion that reacts to my hand") directly into functional, physics-aware mixed-reality applications. To minimize the friction of on-device testing, the workflow introduces a seamless loop from a desktop "simulated reality" to headset deployment. Finally, we introduce VCXR60, a pilot dataset of 60 XR prompts paired with an automated evaluation pipeline. Our technical evaluation demonstrates a high one-shot execution success rate, enabling practitioners to bypass low-level hurdles and rapidly move from "idea to reality". Code and live demos are available at https://github.com/google/xrblocks and http://xrblocks.github.io/gem.
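To make the "Reality Model" idea concrete, the sketch below illustrates in TypeScript one way a natural-language-aligned vocabulary of spatial primitives could be expressed and consumed. All names here (RealityModel, User, Surface, onFrame, etc.) are illustrative assumptions for this sketch and are not the actual xrblocks API.

```typescript
// Hypothetical illustration only: these interfaces are NOT the xrblocks API.
// They sketch how a semantic "Reality Model" might expose spatial primitives
// (user, physical environment, agents) in a vocabulary concise enough for an
// LLM to reason over, instead of raw sensor streams or deep engine hierarchies.

interface Vec3 { x: number; y: number; z: number; }

interface Hand {
  side: "left" | "right";
  position: Vec3;
  pinching: boolean;
}

interface User {
  head: { position: Vec3; forward: Vec3 };
  hands: Hand[];
}

interface Surface {
  kind: "floor" | "wall" | "table";
  center: Vec3;
}

interface Agent {
  name: string;
  say(text: string): void;
}

// The scene state an LLM needs, expressed as one compact object aligned with
// natural language rather than low-level sensor data.
interface RealityModel {
  user: User;
  environment: { surfaces: Surface[] };
  agents: Agent[];
}

// Example of the kind of behavior a prompt such as
// "create a dandelion that reacts to my hand" could compile down to:
function shouldScatterSeeds(reality: RealityModel, dandelion: Vec3): boolean {
  // Scatter the seeds when either hand comes within 20 cm of the flower.
  return reality.user.hands.some((hand) => {
    const dx = hand.position.x - dandelion.x;
    const dy = hand.position.y - dandelion.y;
    const dz = hand.position.z - dandelion.z;
    return Math.sqrt(dx * dx + dy * dy + dz * dz) < 0.2;
  });
}
```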