Most linear experimental design problems assume homogeneous variance, although heteroskedastic noise is present in many realistic settings. Let a learner have access to a finite set of measurement vectors $\mathcal{X}\subset \mathbb{R}^d$ that can be probed to receive noisy linear responses of the form $y=x^{\top}\theta^{\ast}+\eta$. Here $\theta^{\ast}\in \mathbb{R}^d$ is an unknown parameter vector, and $\eta$ is independent, mean-zero, $\sigma_x^2$-sub-Gaussian noise governed by a flexible heteroskedastic variance model, $\sigma_x^2 = x^{\top}\Sigma^{\ast}x$. Assuming that $\Sigma^{\ast}\in \mathbb{R}^{d\times d}$ is an unknown matrix, we propose, analyze, and empirically evaluate a novel design for uniformly bounding the estimation error of the variance parameters $\sigma_x^2$. We demonstrate the benefits of this method on two adaptive experimental design problems under heteroskedastic noise, fixed-confidence transductive best-arm identification and level-set identification, and prove the first instance-dependent lower bounds in these settings. Lastly, we construct near-optimal algorithms and empirically demonstrate the large improvements in sample complexity gained from accounting for heteroskedastic variance in these designs.
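To make the observation model concrete, the following is a minimal simulation sketch, not the design proposed in this work. It draws responses $y=x^{\top}\theta^{\ast}+\eta$ with noise variance $\sigma_x^2 = x^{\top}\Sigma^{\ast}x$ and then fits $\theta^{\ast}$ by weighted least squares. Several things here are assumptions made only for illustration: the noise is taken to be Gaussian (one particular sub-Gaussian choice), the variances are treated as known when fitting (in the setting above $\Sigma^{\ast}$ is unknown and must be estimated), and the names `noise_variances`, `sample_responses`, `weighted_least_squares`, `arms`, `theta_star`, and `Sigma_star` are hypothetical.

```python
import numpy as np


def noise_variances(X, Sigma):
    """Return sigma_x^2 = x^T Sigma x for every row x of X."""
    return np.einsum("id,de,ie->i", X, Sigma, X)


def sample_responses(X, theta, Sigma, rng):
    """Draw y = x^T theta + eta per row, with eta ~ N(0, x^T Sigma x).

    Gaussian noise is an illustrative assumption; the model only
    requires mean-zero sigma_x^2-sub-Gaussian noise.
    """
    sigma = np.sqrt(noise_variances(X, Sigma))
    return X @ theta + sigma * rng.standard_normal(X.shape[0])


def weighted_least_squares(X, y, variances):
    """Estimate theta, weighting each observation by 1 / sigma_x^2."""
    w = 1.0 / variances
    A = X.T @ (w[:, None] * X)   # sum_i x_i x_i^T / sigma_i^2
    b = X.T @ (w * y)            # sum_i x_i y_i / sigma_i^2
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
d = 3

# Hypothetical finite measurement set and ground-truth parameters.
arms = rng.standard_normal((20, d))
theta_star = np.array([1.0, -0.5, 2.0])
L = rng.standard_normal((d, d))
Sigma_star = L @ L.T + 0.1 * np.eye(d)   # positive definite by construction

# Probe arms uniformly at random (a non-adaptive stand-in for a design).
X = arms[rng.integers(0, arms.shape[0], size=5000)]
y = sample_responses(X, theta_star, Sigma_star, rng)

# Variances are assumed known here purely for illustration.
theta_hat = weighted_least_squares(X, y, noise_variances(X, Sigma_star))
print("estimation error:", np.linalg.norm(theta_hat - theta_star))
```

The weighting step is where heteroskedasticity matters: measurements with small $x^{\top}\Sigma^{\ast}x$ carry more information about $\theta^{\ast}$, so an allocation that ignores the variance structure can spend samples on uninformative, noisy directions.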