Level set estimation (LSE), the problem of identifying the set of input points at which a function takes values above (or below) a given threshold, is important in practical applications. For black-box functions that are expensive to evaluate, the \textit{straddle} algorithm, a representative heuristic for LSE based on Gaussian process models, has been developed, along with extensions that have theoretical guarantees. However, many existing methods include a confidence parameter $\beta^{1/2}_t$ that must be specified by the user, and methods that choose $\beta^{1/2}_t$ heuristically do not provide theoretical guarantees. Conversely, theoretically guaranteed values of $\beta^{1/2}_t$ must grow with the number of iterations and the number of candidate points, which makes them conservative and hurts practical performance. In this study, we propose a novel method, the \textit{randomized straddle} algorithm, in which $\beta_t$ in the straddle algorithm is replaced by a random sample from the chi-squared distribution with two degrees of freedom. The confidence parameter in the proposed method requires no tuning, does not depend on the number of iterations or candidate points, and is not conservative. Furthermore, we show that the proposed method has theoretical guarantees that depend on the sample complexity and the number of iterations. Finally, we confirm the usefulness of the proposed method through numerical experiments on synthetic and real data.
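To make the idea concrete, the following is a minimal sketch of one possible randomized straddle loop on a one-dimensional candidate grid. It is an illustration under stated assumptions, not the authors' implementation: the score is assumed to take the classic straddle form $\beta^{1/2}_t \sigma(x) - |\mu(x) - \theta|$, the Gaussian process uses a zero mean and a unit-variance RBF kernel, and all names (`rbf_kernel`, `gp_posterior`, `randomized_straddle_step`, `run`) and default values are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Unit-variance RBF kernel between 1-D point sets a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_cand, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP on x_cand."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_cand)
    solved = np.linalg.solve(K, K_s)            # K^{-1} K_s, shape (n, m)
    mean = solved.T @ y_train
    var = 1.0 - np.sum(K_s * solved, axis=0)    # prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))


def randomized_straddle_step(mean, std, threshold, rng):
    """Pick the next candidate index using a randomly drawn beta_t.

    beta_t is sampled from the chi-squared distribution with two degrees
    of freedom; the score form below is an assumption of this sketch.
    """
    beta_t = rng.chisquare(df=2)
    score = np.sqrt(beta_t) * std - np.abs(mean - threshold)
    return int(np.argmax(score)), beta_t


def run(f, x_cand, threshold, n_iter=30, seed=0):
    """Run the loop and return the estimated super-level set as a boolean mask."""
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(x_cand)))]      # random initial point
    for _ in range(n_iter):
        mean, std = gp_posterior(x_cand[idx], f(x_cand[idx]), x_cand)
        nxt, _ = randomized_straddle_step(mean, std, threshold, rng)
        idx.append(nxt)
    mean, _ = gp_posterior(x_cand[idx], f(x_cand[idx]), x_cand)
    return mean > threshold, idx


if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 100)
    estimate, queried = run(lambda x: np.sin(6.0 * x), grid, threshold=0.0)
    print("points classified above threshold:", int(estimate.sum()))
```

Because $\beta_t$ is redrawn at every iteration, the sketch has no confidence parameter to tune and none that grows with the iteration count or the number of candidate points; only the kernel and noise settings, which are assumptions here, remain to be chosen.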