We initiate a study of a new model of property testing that is a hybrid of testing properties of distributions and testing properties of strings. Specifically, the new model refers to testing properties of distributions, but these are distributions over huge objects (i.e., very long strings). Accordingly, the model accounts for the total number of local probes into these objects (resp., queries to the strings) as well as for the distance between objects (resp., strings), and the distance between distributions is defined as the earth mover's distance with respect to the relative Hamming distance between strings. We study the query complexity of testing in this new model, focusing on three directions. First, we try to relate the query complexity of testing properties in the new model to the sample complexity of testing these properties in the standard distribution testing model. Second, we consider the complexity of testing properties that arise naturally in the new model (e.g., distributions that capture random variations of fixed strings). Third, we consider the complexity of testing properties that were extensively studied in the standard distribution testing model; two such cases are uniform distributions and pairs of identical distributions.
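To make the distance measure concrete, the following is a minimal sketch of the earth mover's distance taken with respect to relative Hamming distance. It is restricted, as a simplifying assumption, to two uniform distributions over equally many strings of equal length; in that special case an optimal transport plan can be taken to be a perfect matching, which the sketch finds by brute force. The function names `relative_hamming` and `emd_uniform` are illustrative and do not come from the paper.

```python
from itertools import permutations


def relative_hamming(x: str, y: str) -> float:
    """Fraction of positions on which two equal-length strings differ."""
    if len(x) != len(y) or not x:
        raise ValueError("strings must be non-empty and of equal length")
    return sum(a != b for a, b in zip(x, y)) / len(x)


def emd_uniform(xs: list[str], ys: list[str]) -> float:
    """Earth mover's distance between the uniform distributions over the
    multisets xs and ys, with relative Hamming distance as ground metric.

    Assumption: len(xs) == len(ys), so some optimal transport plan is a
    perfect matching. All matchings are enumerated, which is feasible
    only for very small supports.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("supports must be non-empty and of equal size")
    m = len(xs)
    best = min(
        sum(relative_hamming(x, ys[j]) for x, j in zip(xs, perm))
        for perm in permutations(range(m))
    )
    return best / m


if __name__ == "__main__":
    # Each string carries probability mass 1/2; the cheapest matching pairs
    # "0000" with "0001" (distance 1/4) and "1111" with "1111" (distance 0).
    print(emd_uniform(["0000", "1111"], ["0001", "1111"]))
```

A tester in this model would not compute the quantity above, since it can afford neither to read whole strings nor to learn the full distribution; the sketch only illustrates what being far from a property means.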