We propose a new analysis framework for clustering $M$ items into an unknown number $K$ of distinct groups using noisy, actively collected responses. At each time step, an agent queries pairs of items and observes binary bandit feedback. If the two items in a pair belong to the same (resp.\ different) cluster, the observed feedback is $1$ with probability $p>1/2$ (resp.\ $q<1/2$). Using a change-of-measure argument, we establish a fundamental lower bound on the expected number of queries needed to achieve a desired confidence in the clustering accuracy, formulated as a sup-inf optimization problem. Building on this lower bound, we design an asymptotically optimal algorithm whose stopping criterion compares an empirical version of the inner infimum -- the Generalized Likelihood Ratio (GLR) statistic -- to a threshold. We also develop a computationally feasible variant of the GLR statistic and show that its performance gap to the lower bound can be estimated accurately from data and remains within a constant multiple of the lower bound.
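The feedback model described in the abstract can be illustrated with a minimal simulation sketch. The function names `make_oracle` and `empirical_means`, the `labels` list encoding the ground-truth clustering, and the uniform sampling of each pair are assumptions made for illustration only; they are not taken from the paper and do not implement its adaptive sampling rule or GLR stopping criterion.

```python
import random


def make_oracle(labels, p, q, seed=0):
    """Return a query function for the noisy pairwise feedback model.

    labels[i] is the (hidden) cluster of item i. A query on (i, j) returns 1
    with probability p if the items share a cluster and with probability q
    otherwise. All names here are hypothetical.
    """
    rng = random.Random(seed)

    def query(i, j):
        prob_one = p if labels[i] == labels[j] else q
        return 1 if rng.random() < prob_one else 0

    return query


def empirical_means(query, pairs, n_queries):
    """Query each pair n_queries times and return the fraction of 1s per pair."""
    return {
        pair: sum(query(*pair) for _ in range(n_queries)) / n_queries
        for pair in pairs
    }


if __name__ == "__main__":
    # Three items: items 0 and 1 share a cluster, item 2 is alone.
    oracle = make_oracle(labels=[0, 0, 1], p=0.8, q=0.2, seed=1)
    means = empirical_means(oracle, pairs=[(0, 1), (0, 2), (1, 2)], n_queries=2000)
    for pair, mean in means.items():
        guess = "same" if mean > 0.5 else "different"
        print(pair, round(mean, 3), guess)
```

Because $p>1/2>q$, thresholding each pair's empirical mean at $1/2$ recovers the same/different relation once enough samples are collected. The framework in the abstract concerns how few queries suffice when they are allocated adaptively, which this uniform-sampling sketch does not attempt.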