A subset of points in a metric space is said to resolve the space if each point in the space is uniquely determined by its distances to the points of the subset. In particular, resolving sets can be used to represent points in abstract metric spaces as Euclidean vectors. Importantly, due to the triangle inequality, nearby points in the space are represented as vectors with similar coordinates, which may find applications in classification problems of symbolic objects under suitably chosen metrics. In this manuscript, we address the resolvability of Jaccard spaces, i.e., metric spaces of the form $(2^X,\text{Jac})$, where $2^X$ is the power set of a finite set $X$, and $\text{Jac}$ is the Jaccard distance between subsets of $X$. Specifically, for distinct $a,b\in 2^X$, $\text{Jac}(a,b)=\frac{|a\Delta b|}{|a\cup b|}$, where $|\cdot|$ denotes cardinality and $\Delta$ denotes the symmetric difference of sets. We combine probabilistic and linear algebra arguments to construct subsets of $2^X$ that resolve $(2^X,\text{Jac})$ with high probability and are nearly optimal, i.e., of nearly minimal size. In particular, we show that the metric dimension of $(2^X,\text{Jac})$, i.e., the minimum size of a resolving set of this space, is $\Theta(|X|/\ln|X|)$.
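To make the definitions concrete, the following is a minimal brute-force sketch in Python that checks whether a given collection of subsets resolves $(2^X,\text{Jac})$ for a small ground set $X$. The function names (`jaccard_distance`, `power_set`, `resolves`) are hypothetical and chosen only for illustration; the check enumerates all of $2^X$, so it is exponential in $|X|$ and is not the probabilistic construction referred to above. It assumes the usual convention $\text{Jac}(a,a)=0$, which also covers $a=b=\emptyset$.

```python
from fractions import Fraction
from itertools import combinations


def jaccard_distance(a, b):
    """Return |a Δ b| / |a ∪ b| as an exact fraction, with Jac(a, a) = 0."""
    if a == b:
        # Covers the case a = b = ∅, where the ratio would be 0/0.
        return Fraction(0)
    return Fraction(len(a ^ b), len(a | b))


def power_set(ground):
    """List all subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def resolves(candidate, ground):
    """Check whether `candidate` (a list of subsets) resolves (2^ground, Jac).

    Each subset is mapped to its vector of distances to the candidate
    sets; the candidate resolves the space iff all vectors are distinct.
    """
    seen = set()
    for point in power_set(ground):
        vector = tuple(jaccard_distance(point, r) for r in candidate)
        if vector in seen:
            return False
        seen.add(vector)
    return True


X = {1, 2, 3}
singletons = [frozenset({x}) for x in sorted(X)]
print(resolves(singletons, X))      # the singletons separate all 8 subsets
print(resolves([frozenset(X)], X))  # distance to X depends only on |a|
```

Exact `Fraction` arithmetic is used so that distance vectors can be compared without floating-point rounding. In this small case the singletons give a resolving set of size $|X|$, whereas the single set $X$ does not resolve the space because its distance to a subset $a$ depends only on $|a|$.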