Beyond Real Weights: Hypercomplex Representations for Stable Quantization - Zhuanzhi Papers

Hypercomplex numbers · Representation · Multi-modal · Modality · Multimodal

December 9, 2025

Beyond Real Weights: Hypercomplex Representations for Stable Quantization

Jawad Ibn Ahad,Maisha Rahman,Amrijit Biswas,Muhammad Rafsan Kabir,Robin Krambroeckers,Sifat Momen,Nabeel Mohammed,Shafin Rahman

From arXiv. Accepted at the Winter Conference on Applications of Computer Vision (WACV) 2026.

Multimodal language models (MLLMs) require large parameter capacity to align high-dimensional visual features with linguistic representations, making them computationally heavy and difficult to deploy efficiently. We introduce a progressive reparameterization strategy that compresses these models by gradually replacing dense feed-forward network blocks with compact Parameterized Hypercomplex Multiplication (PHM) layers. A residual interpolation schedule, together with lightweight reconstruction and knowledge distillation losses, ensures that the PHM modules inherit the functional behavior of their dense counterparts during training. This transition yields substantial parameter and FLOP reductions while preserving strong multimodal alignment, enabling faster inference without degrading output quality. We evaluate the approach on multiple vision-language models (VLMs). Our method maintains performance comparable to the base models while delivering significant reductions in model size and inference latency. Progressive PHM substitution thus offers an architecture-compatible path toward more efficient multimodal reasoning and complements existing low-bit quantization techniques.
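
The abstract does not spell out the layer equations, so the following is only a minimal sketch under stated assumptions. It uses the standard PHM formulation from the literature, in which the weight matrix is a sum of Kronecker products, W = Σ_i A_i ⊗ S_i, and it assumes the residual interpolation is a simple linear blend between the dense and PHM outputs controlled by a coefficient `alpha`. The names `PHMLinear`, `blended_output`, and `alpha` are hypothetical and do not come from the paper; the reconstruction and distillation losses are omitted.

```python
import random


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def mat_add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matvec(w, x):
    """Matrix-vector product."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class PHMLinear:
    """Hypothetical PHM layer: W = sum_i kron(A_i, S_i), no bias."""

    def __init__(self, d_in, d_out, n, rng):
        # Both dimensions must be divisible by the hypercomplex order n.
        assert d_in % n == 0 and d_out % n == 0
        self.A = [rand_matrix(n, n, rng) for _ in range(n)]
        self.S = [rand_matrix(d_out // n, d_in // n, rng) for _ in range(n)]

    def weight(self):
        # Materialize the full (d_out x d_in) weight for clarity, not speed.
        w = kron(self.A[0], self.S[0])
        for a, s in zip(self.A[1:], self.S[1:]):
            w = mat_add(w, kron(a, s))
        return w

    def num_params(self):
        # n^3 entries in the A_i plus d_in * d_out / n entries in the S_i.
        n = len(self.A)
        return n * n * n + n * len(self.S[0]) * len(self.S[0][0])

    def __call__(self, x):
        return matvec(self.weight(), x)


def blended_output(dense_w, phm, x, alpha):
    """Assumed interpolation: (1 - alpha) * dense(x) + alpha * phm(x)."""
    dense_y = matvec(dense_w, x)
    phm_y = phm(x)
    return [(1.0 - alpha) * d + alpha * p for d, p in zip(dense_y, phm_y)]
```

With `d_in=8`, `d_out=16`, and `n=4`, the dense weight has 128 entries while this PHM layer stores 4³ + 8·16/4 = 96. The saving approaches a factor of `n` as the layer grows, because the `n³` term becomes negligible. Ramping `alpha` from 0 to 1 during training would hand the computation over from the dense block to the PHM block gradually, which is one plausible reading of the "residual interpolation schedule" in the abstract.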
