Distributed tensor decomposition (DTD) is a fundamental data-analytics technique that extracts important latent properties from high-dimensional, multi-attribute datasets distributed over edge devices. Conventionally, its wireless implementation follows a one-shot approach: devices first compute local results from their local data, and these results are then aggregated at a server for global computation using communication-efficient techniques such as over-the-air computation (AirComp). Such an implementation is confronted with limited storage and computation capacities at devices and with link interruptions, which motivates us to propose a framework of on-the-fly communication-and-computing (FlyCom$^2$) in this work. The proposed framework enables low-complexity streaming computation by leveraging a random sketching technique, and achieves progressive global aggregation by integrating progressive uploading with multiple-input-multiple-output (MIMO) AirComp. To develop FlyCom$^2$, an on-the-fly subspace estimator is designed that takes the real-time sketches accumulated at the server and generates online estimates of the decomposition. Its performance is evaluated by deriving both deterministic and probabilistic error bounds using perturbation theory and concentration of measure. Both results reveal that the decomposition error is inversely proportional to the number of sketching observations received by the server. To further rein in the effect of noise on the error, we propose a threshold-based scheme that selects a subset of sufficiently reliable received sketches for DTD at the server. Experimental results validate the performance gain of the proposed selection algorithm and show that, compared with its one-shot counterparts, the proposed FlyCom$^2$ achieves comparable (and, in the case of large eigen-gaps, even better) decomposition accuracy while dramatically reducing devices' complexity costs.
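The idea of streaming sketches that are summed over the air and refined progressively at the server can be illustrated with a small NumPy simulation. This is only a minimal sketch under assumptions that are not taken from the paper: the data are a matrix (for example, one unfolding of the tensor) whose columns are split across devices, each slot uses a shared Gaussian test vector, AirComp is idealized as an exact sum plus additive Gaussian noise, and the server estimates the principal subspace with a plain SVD of the accumulated sketches. All function names and parameters are hypothetical, and the threshold-based sketch selection is not modeled.

```python
import numpy as np


def local_sketch(X_k, omega):
    """One device's sketch for one slot: (X_k X_k^T) omega, without forming X_k X_k^T."""
    return X_k @ (X_k.T @ omega)


def aircomp_sum(sketches, noise_std, rng):
    """Idealized AirComp (assumption): the server receives the sum of the sketches plus Gaussian noise."""
    total = np.sum(sketches, axis=0)
    return total + noise_std * rng.standard_normal(total.shape)


def estimate_subspace(received, rank):
    """Online estimate: top-`rank` left singular vectors of the sketches accumulated so far."""
    Y = np.column_stack(received)
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :rank]


def subspace_error(U_true, U_est):
    """Squared projection distance between two orthonormal bases of equal rank."""
    overlap = np.linalg.norm(U_true.T @ U_est, "fro") ** 2
    return max(U_true.shape[1] - overlap, 0.0)


def run_demo(dim=30, rank=3, num_devices=4, samples_per_device=50,
             num_slots=12, noise_std=0.01, seed=0):
    rng = np.random.default_rng(seed)
    # Ground-truth subspace and exactly low-rank local datasets (synthetic, for illustration only).
    U_true, _ = np.linalg.qr(rng.standard_normal((dim, rank)))
    device_data = [U_true @ rng.standard_normal((rank, samples_per_device))
                   for _ in range(num_devices)]

    received, errors = [], []
    for _ in range(num_slots):
        omega = rng.standard_normal(dim)  # shared random test vector for this slot
        sketches = [local_sketch(X_k, omega) for X_k in device_data]
        received.append(aircomp_sum(sketches, noise_std, rng))
        if len(received) >= rank:  # need at least `rank` sketches to span the subspace
            errors.append(subspace_error(U_true, estimate_subspace(received, rank)))
    return errors


if __name__ == "__main__":
    for n, err in enumerate(run_demo(), start=3):
        print(f"sketches received: {n:2d}  subspace error: {err:.3e}")
```

Each device only ever computes matrix-vector products per slot, which is the sense in which the per-device cost stays low in this toy model, and the server can output an estimate after any slot, so an interrupted link still leaves a usable result.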