Distributed tensor decomposition (DTD) is a fundamental data-analytics technique that extracts important latent properties from high-dimensional, multi-attribute datasets distributed over edge devices. Conventionally, its wireless implementation follows a one-shot approach: devices first compute local results from their local data, and these results are then aggregated at a server for global computation using communication-efficient techniques such as over-the-air computation (AirComp). Such an implementation suffers from devices' limited storage and computation capacities and from link interruptions, which motivates the framework of on-the-fly communication-and-computing (FlyCom$^2$) proposed in this work. The framework enables low-complexity streaming computation by leveraging random sketching, and achieves progressive global aggregation by integrating progressive uploading with multiple-input-multiple-output (MIMO) AirComp. To develop FlyCom$^2$, an on-the-fly subspace estimator is designed that takes the real-time sketches accumulated at the server and generates online estimates of the decomposition. Its performance is evaluated by deriving both deterministic and probabilistic error bounds using perturbation theory and concentration of measure. Both results reveal that the decomposition error is inversely proportional to the number of sketching observations received by the server. To further rein in the effect of noise on the error, we propose a threshold-based scheme that selects a subset of sufficiently reliable received sketches for DTD at the server. Experimental results validate the performance gain of the proposed selection algorithm and show that, compared with its one-shot counterparts, FlyCom$^2$ achieves comparable (and, for large eigen-gaps, even better) decomposition accuracy while dramatically reducing devices' complexity costs.
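The streaming idea summarized above can be illustrated with a minimal NumPy sketch. It is a toy model under stated assumptions, not the paper's algorithm: the data are taken to be an unfolded tensor split column-wise over devices, each device projects its block onto a fresh Gaussian test vector per round, AirComp is modeled as an ideal sum plus additive Gaussian noise (no MIMO channel or beamforming), and the server re-estimates the principal subspace from all sketches received so far. All function names, dimensions, and noise levels are hypothetical choices made for illustration, and the threshold-based sketch selection is omitted.

```python
import numpy as np


def make_device_blocks(d, n_per_device, num_devices, rank, rng):
    """Synthetic low-rank-plus-residual data, split column-wise over devices (assumed layout)."""
    basis, _ = np.linalg.qr(rng.standard_normal((d, rank)))
    blocks = []
    for _ in range(num_devices):
        coeffs = 10.0 * rng.standard_normal((rank, n_per_device))
        residual = 0.1 * rng.standard_normal((d, n_per_device))
        blocks.append(basis @ coeffs + residual)
    return blocks


def true_subspace(blocks, rank):
    """Reference answer: top-`rank` left singular vectors of the full data matrix."""
    U, _, _ = np.linalg.svd(np.hstack(blocks), full_matrices=False)
    return U[:, :rank]


def aircomp_round(blocks, rng, noise_std):
    """One round: every device sketches its block; the channel adds the results plus noise."""
    d = blocks[0].shape[0]
    aggregate = np.zeros(d)
    for X_k in blocks:
        omega_k = rng.standard_normal(X_k.shape[1])  # fresh Gaussian test vector
        aggregate += X_k @ omega_k                   # local cost: one matrix-vector product
    return aggregate + noise_std * rng.standard_normal(d)


def estimate_subspace(sketches, rank):
    """Server side: top-`rank` left singular vectors of the sketches stacked as columns."""
    U_hat, _, _ = np.linalg.svd(np.column_stack(sketches), full_matrices=False)
    return U_hat[:, :rank]


def subspace_error(U, U_hat):
    """Spectral-norm distance between the two orthogonal projectors (lies in [0, 1])."""
    return np.linalg.norm(U @ U.T - U_hat @ U_hat.T, 2)


def run_demo(num_rounds=200, checkpoints=(5, 50, 200), rank=3, seed=0):
    rng = np.random.default_rng(seed)
    blocks = make_device_blocks(d=30, n_per_device=50, num_devices=8, rank=rank, rng=rng)
    U = true_subspace(blocks, rank)
    sketches, errors = [], {}
    for t in range(1, num_rounds + 1):
        sketches.append(aircomp_round(blocks, rng, noise_std=1.0))
        if t in checkpoints:
            # Online estimate from whatever has arrived so far.
            errors[t] = subspace_error(U, estimate_subspace(sketches, rank))
    return errors


if __name__ == "__main__":
    for t, err in run_demo().items():
        print(f"sketches received: {t:3d}  subspace error: {err:.4f}")
```

In this toy setting the estimate can be refreshed after every round, and its error shrinks as more sketches accumulate. The sketch is not meant to reproduce the error bounds stated in the abstract, whose exact form depends on the estimator and channel model used in the paper.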