We derive a class of divergences measuring the difference between probability density functions on a one-dimensional sample space. This divergence is a one-parameter variation of the Itakura--Saito divergence between quantile density functions. We prove that the proposed divergence is a one-parameter variation of the transport Kullback--Leibler divergence and of the Hessian distance of the negative Boltzmann entropy with respect to the Wasserstein-$2$ metric. From Taylor expansions, we also formulate the $3$-symmetric tensor in Wasserstein-$2$ space, which is given by an iterative Gamma-three operator. The alpha-geodesic on Wasserstein space is also derived. From these properties, we name the proposed divergences transport alpha divergences. We provide several examples of transport alpha divergences on one-dimensional distributions, such as generative models and Cauchy distributions.
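For reference, the base object here, the Itakura--Saito divergence between quantile density functions, can be written out from the standard definitions (a sketch only; the paper's one-parameter family builds on this and is not reproduced here):

```latex
% Quantile density: derivative of the quantile function F^{-1}.
% For densities p, r on the real line with quantile densities
% q_p = (F_p^{-1})' and q_r = (F_r^{-1})' on (0,1):
\mathrm{D}_{\mathrm{IS}}(p \,\|\, r)
  = \int_0^1 \left( \frac{q_p(u)}{q_r(u)}
      - \log \frac{q_p(u)}{q_r(u)} - 1 \right) \mathrm{d}u .
```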



The goal of $L$-step speculative decoding is to accelerate autoregressive decoding of a target model by using a cheaper draft model to generate a candidate path of $L$ tokens. Based on a verification algorithm involving target and draft model probabilities, a prefix of the candidate sequence is accepted, and an additional correction token is sampled from a residual distribution to ensure that the final output adheres to the target distribution. While standard speculative decoding uses a verification algorithm that is applied independently at each token on the path, a recent extension called block verification uses a joint condition involving all sampled on-path probabilities. Block verification (BV) was shown to be optimal over all verification algorithms which use only on-path probabilities, improving on standard speculative decoding. In this work, we first show that block verification is optimal even over verification algorithms that use off-path probabilities, by constructing an information-agnostic linear program (LP). Further, we can extend our LP to the setting where the draft model samples multiple candidate paths, and use it to construct a natural class of multi-path block verification generalizations. While computing the optimal algorithm in this class is not tractable, by considering a stricter class of greedy algorithms, we can formulate an efficient method called greedy multi-path block verification (GBV). Empirically, GBV can improve block efficiency by over 30% and reduce decoding wall-clock times by over 15% relative to BV. On Llama-3 70B, GBV can improve the end-to-end decoding throughput over SOTA multi-path verification methods by more than 15%.
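The standard (token-independent) verification rule that BV improves on can be sketched as follows; `p` and `q` are stand-in target and draft distributions over a toy vocabulary (names and numbers are illustrative, not from the paper):

```python
import numpy as np

def verify_token(p, q, x, rng):
    """Standard speculative verification for one draft token x ~ q.
    Accept with prob min(1, p[x]/q[x]); on rejection, resample from
    the residual distribution max(p - q, 0), normalized."""
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    residual = np.maximum(p - q, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p), p=residual), False

# Empirical check: the accept/resample rule reproduces the target p exactly.
rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.2])   # target model probabilities
q = np.array([0.2, 0.3, 0.5])   # draft model probabilities
counts = np.zeros(3)
for _ in range(200_000):
    x = rng.choice(3, p=q)      # draft proposes
    y, _ = verify_token(p, q, x, rng)
    counts[y] += 1
print(np.round(counts / counts.sum(), 2))  # ≈ [0.5 0.3 0.2], i.e. the target p
```

The joint (block) condition of BV replaces the per-token coin flips above with a single acceptance event over the whole path, which is what the LP in the abstract certifies as optimal.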



Communication over a quantum multiple access channel (MAC) is considered with classical feedback. Since the no-cloning theorem prohibits universal copying of arbitrary quantum states, classical feedback is generated through measurement. An achievable rate region is derived using partial information decoding at each transmitter. Our region generalizes both the classical Cover-Leung region and the generalized feedback region. As an example, we show that the qubit SWAP channel can benefit from feedback.



A new framework is introduced for examining and evaluating the fundamental limits of lossless data compression, that emphasizes genuinely non-asymptotic results. The {\em sample complexity} of compressing a given source is defined as the smallest blocklength at which it is possible to compress that source at a specified rate and to within a specified excess-rate probability. This formulation parallels corresponding developments in statistics and computer science, and it facilitates the use of existing results on the sample complexity of various hypothesis testing problems. For arbitrary sources, the sample complexity of general variable-length compressors is shown to be tightly coupled with the sample complexity of prefix-free codes and fixed-length codes. For memoryless sources, it is shown that the sample complexity is characterized not by the source entropy, but by its Rényi entropy of order~$1/2$. Non-asymptotic bounds on the sample complexity are obtained, with explicit constants. Generalizations to Markov sources are established, showing that the sample complexity is determined by the source's Rényi entropy rate of order~$1/2$. Finally, bounds on the sample complexity of universal data compression are developed for arbitrary families of memoryless sources. There, the sample complexity is characterized by the minimum Rényi divergence of order~$1/2$ between elements of the family and the uniform distribution. The connection of this problem with identity testing and with the associated separation rates is explored and discussed.
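The order-$1/2$ Rényi entropy that appears in this characterization is simple to compute from the standard definition $H_\alpha(P) = \frac{1}{1-\alpha}\log\sum_i p_i^\alpha$; a small sketch (the sample-complexity characterization itself is the paper's result and is not reproduced here):

```python
import numpy as np

def renyi_entropy(p, alpha, base=2.0):
    """Rényi entropy of order alpha, in units of log `base`."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(alpha, 1.0):                  # Shannon limit as alpha -> 1
        return float(-np.sum(p * np.log(p)) / np.log(base))
    return float(np.log(np.sum(p ** alpha)) / ((1 - alpha) * np.log(base)))

uniform = np.ones(8) / 8
skewed = np.array([0.7, 0.1, 0.1, 0.05, 0.05])
print(round(renyi_entropy(uniform, 0.5), 6))   # 3.0 bits: all orders agree on uniform
print(renyi_entropy(skewed, 0.5) > renyi_entropy(skewed, 1.0))  # True
```

The second line illustrates why the order-$1/2$ quantity can exceed the Shannon entropy: Rényi entropy is nonincreasing in the order, so $H_{1/2} \ge H_1$ with equality only for uniform distributions.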



Quantum information decoupling is a fundamental primitive in quantum information theory, underlying various applications in quantum physics. We prove a novel one-shot decoupling theorem formulated in terms of quantum relative entropy distance, with the decoupling error bounded by two sandwiched Rényi conditional entropies. In the asymptotic i.i.d. setting of standard information decoupling via partial trace, we show that this bound is ensemble-tight in quantum relative entropy distance and thereby yields a characterization of the associated decoupling error exponent in the low-cost-rate regime. Leveraging this framework, we derive several operational applications formulated in terms of purified distance: (i) a single-letter expression for the exact error exponent of quantum state merging in terms of Petz-Rényi conditional entropies, and (ii) regularized expressions for the achievable error exponent of entanglement distillation and quantum channel coding in terms of Petz-Rényi coherent informations. We further prove that these achievable bounds are tight for maximally correlated states and generalized dephasing channels, respectively, for the high distillation-rate/coding-rate regimes.
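For reference, the sandwiched Rényi divergence underlying these conditional entropies has the following standard form (standard definitions only; conventions for the conditional entropy vary, and the optimized variant is shown):

```latex
% Sandwiched Renyi divergence of order alpha, alpha in (0,1) U (1,infty):
\widetilde{D}_\alpha(\rho \,\|\, \sigma)
  = \frac{1}{\alpha - 1}
    \log \mathrm{Tr}\!\left[
      \left( \sigma^{\frac{1-\alpha}{2\alpha}} \, \rho \,
             \sigma^{\frac{1-\alpha}{2\alpha}} \right)^{\!\alpha}
    \right],
\qquad
\widetilde{H}_\alpha(A|B)_\rho
  = - \inf_{\sigma_B} \widetilde{D}_\alpha\!\left(\rho_{AB} \,\|\,
      \mathbb{1}_A \otimes \sigma_B \right).
```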



Envelope extraction in nuclear magnetic resonance (NMR) is a fundamental step for processing the data space generated by this technique. Envelope-detection accuracy improves as the number of sampling points increases; however, we propose a novel transform that enables acceptable envelope extraction with significantly fewer sampling points, even without meeting the Nyquist rate. In this paper, we challenge the traditional scale definition and demonstrate that classic scaling lacks a physical referent in all situations. To achieve this aim, we introduce a scale based on the variations of space-invariant states, rather than the observable characteristics of matter and energy. According to this definition of the scale, we distinguish two kinds of observers: scale-variant and scale-invariant. We demonstrate that converting a scale-variant observer to a scale-invariant observer is equivalent to envelope extraction. To analyse and study the theories presented in the paper, we have designed and implemented an Earth-field NMR setup and used real data generated by it to evaluate the performance of the proposed envelope-detection transform. We compare the output of the proposed transform with that of classic and state-of-the-art methods for parameter recovery of NMR signals.
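As a point of reference, the classic envelope-detection baseline on a well-sampled FID-like signal can be sketched with the analytic-signal (Hilbert) method; the paper's transform targets the sub-Nyquist regime this baseline cannot handle (toy signal, not the paper's data; the assumption that the "classic" comparator is the Hilbert envelope is ours):

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000                                     # well above Nyquist for a 50 Hz carrier
t = np.arange(0, 1, 1 / fs)
envelope_true = np.exp(-2 * t)                # decaying FID-like envelope
signal = envelope_true * np.cos(2 * np.pi * 50 * t)

envelope_est = np.abs(hilbert(signal))        # analytic-signal magnitude
err = np.max(np.abs(envelope_est - envelope_true)[100:-100])  # skip edge effects
print(err < 0.05)                             # True: accurate when well sampled
```

The FFT-based Hilbert method degrades sharply once the carrier is undersampled, which is the regime the proposed transform is designed for.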



Quantum communication holds the potential to revolutionize information transmission by enabling secure data exchange that exceeds the limits of classical systems. One of the key performance metrics in quantum information theory, namely the Holevo bound, quantifies the amount of classical information that can be transmitted reliably over a quantum channel. However, computing and optimizing the Holevo bound remains a challenging task due to its dependence on both the quantum input ensemble and the quantum channel. In order to maximize the Holevo bound, we propose a unified projected gradient ascent algorithm to optimize the quantum channel given a fixed input ensemble. We provide a detailed complexity analysis for the proposed algorithm. Simulation results demonstrate that the proposed quantum channel optimization yields higher Holevo bounds than input ensemble optimization.
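The Holevo quantity for a fixed ensemble and channel output is straightforward to evaluate numerically from its definition $\chi = S(\sum_i p_i \rho_i) - \sum_i p_i S(\rho_i)$; a minimal sketch (the paper's contribution, the projected-gradient channel optimization, is not shown):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed via eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def holevo_quantity(probs, states):
    """chi = S(sum_i p_i rho_i) - sum_i p_i S(rho_i)."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(rho) for p, rho in zip(probs, states))

# Two orthogonal pure qubit states with equal priors: chi = 1 bit.
rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
rho1 = np.array([[0.0, 0.0], [0.0, 1.0]])
print(holevo_quantity([0.5, 0.5], [rho0, rho1]))  # 1.0
```

For non-orthogonal or mixed output states the quantity drops below 1 bit, which is what makes maximizing it over the channel a nontrivial optimization.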



The recovery of unknown signals from quadratic measurements finds extensive applications in fields such as phase retrieval, power system state estimation, and unlabeled distance geometry. This paper investigates the finite sample properties of weakly convex--concave regularized estimators in high-dimensional quadratic measurement models. By employing a weakly convex--concave penalized least squares approach, we establish support recovery and $\ell_2$-error bounds for the local minimizer. To solve the corresponding optimization problem, we adopt two proximal gradient strategies, where the proximal step is computed either in closed form or via a weighted $\ell_1$ approximation, depending on the regularization function. Numerical examples demonstrate the efficacy of the proposed method.
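The closed-form-prox branch can be illustrated with the soft-thresholding operator inside a plain proximal-gradient loop; for simplicity this sketch uses a linear least-squares loss and the convex $\ell_1$ penalty rather than the paper's quadratic measurements and weakly convex--concave regularizers (all data and parameters are toy choices):

```python
import numpy as np

def soft_threshold(z, t):
    """Closed-form prox of t * ||.||_1; per-entry t gives the weighted-l1 variant."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_gradient(A, y, lam, step, iters=1000):
    """Proximal gradient (ISTA) on 0.5 * ||Ax - y||^2 + lam * ||x||_1."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)               # gradient of the smooth part
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 20))
x_true = np.zeros(20)
x_true[[2, 7, 15]] = [3.0, -2.0, 1.5]          # sparse ground truth
y = A @ x_true
step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1 / Lipschitz constant
x_hat = prox_gradient(A, y, lam=0.1, step=step)
print(np.nonzero(np.abs(x_hat) > 0.5)[0])      # support recovered
```

Swapping the loss for a quadratic-measurement residual and the penalty for a weakly convex one changes only the `grad` line and the prox, which is the modularity the two strategies in the abstract exploit.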



We characterize information as risk reduction between knowledge states represented by partitions of the underlying probability space. Entropy corresponds to risk reduction from no (or partial) knowledge to full knowledge about a random variable, while information corresponds to risk reduction from no (or partial) knowledge to partial knowledge. This applies to any information measure that is based on expected loss minimization, such as Bregman information, with Shannon information and variance as prominent examples. In each case, fundamental properties like the chain rule, non-negativity, and the relationship between information and divergence are preserved. Because partitions form a lattice under refinement, our general treatment reveals how information can be decomposed into redundant, unique, and synergistic contributions, a question important in applications from neuroscience to machine learning, yet one for which existing formulations lack consensus on foundational definitions and can violate basic properties such as the chain rule or non-negativity. Redundancy corresponds to Aumann's common knowledge, synergy to the gap between separately and jointly observed sources, and unique information is necessarily path-dependent, taking different values depending on what is already known. The resulting partial information decomposition is grounded directly in probability theory, avoids treating scalar information quantities as primitive compositional objects, and yields non-negative terms by construction.
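With variance as the expected loss, "information as risk reduction" reduces to the law of total variance: the information a partition carries about $X$ is $\mathrm{Var}(X) - \mathbb{E}[\mathrm{Var}(X \mid \text{cell})] = \mathrm{Var}(\mathbb{E}[X \mid \text{cell}])$. A small numerical sketch (illustrative distributions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
# X depends on a binary source S; the partition {S=0, S=1} carries information.
S = rng.integers(0, 2, size=100_000)
X = np.where(S == 0,
             rng.normal(0.0, 1.0, S.size),
             rng.normal(3.0, 1.0, S.size))

total_risk = np.var(X)                 # loss of the best constant predictor
cells = [X[S == 0], X[S == 1]]
residual_risk = sum(len(c) / len(X) * np.var(c) for c in cells)
info = total_risk - residual_risk      # variance-based information of S about X
print(info)                            # ≈ 2.25 = Var(E[X|S]) for means 0, 3, equal mass
```

Replacing squared loss with log loss turns the same risk-reduction template into Shannon mutual information, which is the Bregman-information unification the abstract refers to.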



This two-part paper aims to develop an environment-aware network-level design framework for generalized pinching-antenna systems to overcome the limitations of conventional link-level optimization, which is tightly coupled to instantaneous user geometry and thus sensitive to user mobility and localization errors. Part I investigates the traffic-aware case, where user presence is characterized statistically by a spatial traffic map and deployments are optimized using traffic-aware network-level metrics. Part II complements Part I by developing geometry-aware, blockage-aware network optimization for pinching-antenna systems in obstacle-rich environments. We introduce a grid-level average signal-to-noise-ratio (SNR) model with a deterministic line-of-sight (LoS) visibility indicator and a discrete activation architecture, where the geometry-dependent terms are computed offline in advance. Building on this model, we formulate two network-level activation problems: (i) average-SNR-threshold coverage maximization and (ii) fairness-oriented worst-grid average-SNR maximization. On the algorithmic side, we prove the coverage problem is NP-hard and derive an equivalent mixed-integer linear programming reformulation through binary coverage variables and linear SNR linking constraints. To achieve scalability, we further develop a structure-exploiting coordinate-ascent method that updates one waveguide at a time using precomputed per-candidate SNR contributions. For the worst-grid objective, we adopt an epigraph reformulation and leverage the resulting monotone feasibility in the target SNR, enabling an efficient bisection-based solver with low-complexity feasibility checks over the discrete candidate set. Simulation results validate the proposed designs and quantify their gains under different environments and system parameters.
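The bisection step for the worst-grid objective relies only on monotone feasibility in the target SNR; a generic sketch (the feasibility oracle here is a toy stand-in for the paper's discrete activation check):

```python
def max_feasible_target(feasible, lo, hi, tol=1e-6):
    """Bisection for the largest target t with feasible(t) True, assuming
    monotonicity: if a target is feasible, every smaller target is too."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid        # mid achievable: push the target up
        else:
            hi = mid        # mid unachievable: back off
    return lo

# Toy check: an oracle whose feasibility threshold sits at t = 3.7.
best = max_feasible_target(lambda t: t <= 3.7, 0.0, 10.0)
print(round(best, 3))  # 3.7
```

Each oracle call in the paper's setting is a low-complexity check over the discrete candidate set, so the overall cost is the per-check cost times $\log_2((\text{hi}-\text{lo})/\text{tol})$ iterations.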

