A subset of points in a metric space is said to resolve it if each point in the space is uniquely determined by its distances to the points in the subset. In particular, resolving sets can be used to represent points in abstract metric spaces as Euclidean vectors. Importantly, due to the triangle inequality, points that are close in the space are represented as vectors with similar coordinates, which may be useful for classifying symbolic objects under suitably chosen metrics. In this manuscript, we address the resolvability of Jaccard spaces, i.e., metric spaces of the form $(2^X,\text{Jac})$, where $2^X$ is the power set of a finite set $X$, and $\text{Jac}$ is the Jaccard distance between subsets of $X$. Specifically, for distinct $a,b\in 2^X$, $\text{Jac}(a,b)=|a\Delta b|/|a\cup b|$, where $|\cdot|$ denotes size (i.e., cardinality) and $\Delta$ denotes the symmetric difference of sets. We combine probabilistic and linear-algebraic arguments to construct subsets of $2^X$ that resolve $(2^X,\text{Jac})$ with high probability and are nearly optimal, i.e., of nearly minimal size. In particular, we show that the metric dimension of $(2^X,\text{Jac})$, i.e., the minimum size of a resolving set of this space, is $\Theta(|X|/\ln|X|)$. In addition, we show that a much smaller subset of $2^X$ suffices to resolve, with high probability, all pairs of distinct subsets of $X$ of cardinality at most $\sqrt{|X|}/\ln|X|$, up to a factor.
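To make the definitions concrete, the following is a minimal sketch, not the construction of the manuscript, that computes the Jaccard distance and checks by brute force whether a candidate family of subsets resolves $2^X$ for a small $X$. The function names (`jaccard_distance`, `power_set`, `distance_vector`, `resolves`) are illustrative assumptions, and the convention $\text{Jac}(a,a)=0$ is assumed for identical sets. The exhaustive check takes time exponential in $|X|$ and is only meant for toy examples.

```python
from fractions import Fraction
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance |a Δ b| / |a ∪ b|; taken to be 0 when a == b (assumed convention)."""
    a, b = frozenset(a), frozenset(b)
    if a == b:
        return Fraction(0)
    # a != b implies the union is non-empty, so the division is well defined.
    return Fraction(len(a ^ b), len(a | b))


def power_set(x):
    """All subsets of x, as frozensets."""
    items = sorted(x)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def distance_vector(point, landmarks):
    """Vector of exact distances from `point` to each landmark, in order."""
    return tuple(jaccard_distance(point, r) for r in landmarks)


def resolves(landmarks, points):
    """True if no two points share the same distance vector to the landmarks."""
    seen = set()
    for p in points:
        v = distance_vector(p, landmarks)
        if v in seen:
            return False
        seen.add(v)
    return True


if __name__ == "__main__":
    X = {1, 2, 3}
    singletons = [frozenset({i}) for i in sorted(X)]
    print(resolves(singletons, power_set(X)))
    print(resolves(singletons[:1], power_set(X)))
```

For $X=\{1,2,3\}$, the three singletons give eight distinct distance vectors, so they resolve $2^X$; the single landmark $\{1\}$ does not, since $\emptyset$ and $\{2\}$ are both at distance $1$ from it. Exact rational arithmetic (`Fraction`) is used so that equal distances compare equal without floating-point error.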