3D Gaussian and Diffusion-Based Gaze Redirection - Zhuanzhi Papers


3D · Synthesis · High fidelity · State of the art · Gaze angle

14 November 2025

3D Gaussian and Diffusion-Based Gaze Redirection


Abiram Panchalingam, Indu Bodala, Stuart Middleton

High-fidelity gaze redirection is critical for generating augmented data to improve the generalization of gaze estimators. 3D Gaussian Splatting (3DGS) models such as GazeGaussian represent the state of the art but can struggle to render subtle, continuous gaze shifts. In this paper, we propose DiT-Gaze, a framework that enhances 3D gaze redirection models using a novel combination of a Diffusion Transformer (DiT), weak supervision across gaze angles, and an orthogonality constraint loss. The DiT allows higher-fidelity image synthesis, while our weak supervision strategy, which uses synthetically generated intermediate gaze angles, provides a smooth manifold of gaze directions during training. The orthogonality constraint loss mathematically enforces the disentanglement of the internal representations for gaze, head pose, and expression. Comprehensive experiments show that DiT-Gaze sets a new state of the art in both perceptual quality and redirection accuracy, reducing the previous state-of-the-art gaze error by 4.1%, to 6.353 degrees, and providing a superior method for creating synthetic training data. Our code and models will be made available for the research community to benchmark against.
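
The abstract does not give the exact form of the orthogonality constraint loss or say how the intermediate gaze angles are generated, so the following is only a minimal sketch under stated assumptions. It assumes the loss penalizes the squared cosine similarity between each pair of the gaze, head-pose, and expression embeddings (a common way to encourage orthogonality), and that intermediate gaze angles are obtained by linear interpolation between a source and a target (pitch, yaw) pair. The function names `orthogonality_loss` and `intermediate_gaze_angles` are hypothetical and do not come from the paper.

```python
import math


def _dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def orthogonality_loss(gaze, head, expr):
    """Assumed form: sum of squared cosine similarities over all pairs.

    The loss is 0 when the three embeddings are mutually orthogonal and
    grows as any pair becomes more aligned.
    """
    feats = [_normalize(gaze), _normalize(head), _normalize(expr)]
    loss = 0.0
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            loss += _dot(feats[i], feats[j]) ** 2
    return loss


def intermediate_gaze_angles(src, dst, steps):
    """Assumed scheme: linearly interpolate (pitch, yaw) between src and dst.

    Returns `steps` evenly spaced angle pairs strictly between the two
    endpoints, which could serve as weak-supervision targets.
    """
    angles = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)
        angles.append(tuple((1 - t) * s + t * d for s, d in zip(src, dst)))
    return angles


if __name__ == "__main__":
    # Mutually orthogonal embeddings incur no penalty.
    print(orthogonality_loss([1, 0, 0], [0, 1, 0], [0, 0, 1]))
    # Three intermediate (pitch, yaw) pairs, in radians.
    print(intermediate_gaze_angles((0.0, 0.0), (0.4, -0.2), 3))
```

In a real training setup the embeddings would be batched tensors and the loss would be averaged over the batch and weighted against the reconstruction terms; those details are not stated in the abstract and are left out here.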

