We propose a new method for data visualization based on attraction-repulsion swarming (ARS) dynamics, which we call ARS visualization. ARS is a generalized framework based on viewing the t-distributed stochastic neighbor embedding (t-SNE) visualization technique as a swarm of interacting agents driven by attraction and repulsion. Motivated by recent developments in swarming, we modify the t-SNE dynamics to include a normalization by the \emph{total influence}. This yields better-posed dynamics in which we can use a time step that is independent of the data size ($h=1$) and a simple iteration, without the array of optimization tricks employed in t-SNE. ARS also allows the attraction and repulsion kernels to be tuned separately, which gives the user control over the tightness within clusters and the spacing between them in the visualization. In contrast to t-SNE, our proposed ARS data visualization method is not gradient descent on the Kullback-Leibler divergence, and can be viewed solely as an interacting particle system driven by attraction and repulsion forces. We provide theoretical results illustrating how the choice of interaction kernel affects the dynamics, and experimental results that validate our method and compare it with t-SNE on the MNIST and CIFAR-10 data sets.
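To make the idea of normalizing by the total influence concrete, the following is a minimal sketch of one explicit Euler step of an attraction-repulsion update. It is an illustration under stated assumptions, not the paper's exact equations: the function name `ars_step`, the parameters `attract_power` and `repel_power`, the Student-t style kernel, and the specific form of the normalization are all hypothetical choices made here. `P` is assumed to be a nonnegative affinity matrix with zero diagonal, as in t-SNE.

```python
import numpy as np


def ars_step(Y, P, h=1.0, attract_power=1.0, repel_power=2.0):
    """One explicit Euler step of a hypothetical normalized
    attraction-repulsion update (illustrative, not the paper's formula).

    Y: (n, d) array of embedding points.
    P: (n, n) nonnegative affinity matrix with zero diagonal.
    """
    diff = Y[None, :, :] - Y[:, None, :]      # diff[i, j] = y_j - y_i
    sq_dist = np.sum(diff ** 2, axis=-1)      # squared pairwise distances
    w = 1.0 / (1.0 + sq_dist)                 # Student-t style kernel (assumed)
    np.fill_diagonal(w, 0.0)                  # no self-interaction

    # Separately tunable interaction weights (assumed power-law tuning).
    attract = P * w ** attract_power
    repel = w ** repel_power

    # Divide each force by its total influence, so each force is a
    # weighted average of displacements and stays bounded for any n.
    eps = 1e-12                               # guards against division by zero
    attract_force = np.einsum("ij,ijk->ik", attract, diff)
    attract_force /= attract.sum(axis=1, keepdims=True) + eps
    repel_force = np.einsum("ij,ijk->ik", repel, diff)
    repel_force /= repel.sum(axis=1, keepdims=True) + eps

    return Y + h * (attract_force - repel_force)
```

Because each normalized force is a weighted average of the displacements $y_j - y_i$, its magnitude cannot exceed the diameter of the point cloud, whatever the number of points. In this sketch, that boundedness is what makes a fixed step such as $h=1$ plausible without per-data-set tuning.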