The independence of noise and covariates is a standard assumption in the online linear regression and linear bandit literature. This assumption, and the analyses that rely on it, break down under endogeneity, i.e., when the noise and the covariates are correlated. In this paper, we study the online setting of instrumental variable (IV) regression, which is widely used in economics to tackle endogeneity. Specifically, we analyse and upper bound the regret of the Two-Stage Least Squares (2SLS) approach to IV regression in the online setting. Our analysis shows that Online 2SLS (O2SLS) achieves $O(d^2 \log^2 T)$ regret after $T$ interactions, where $d$ is the dimension of the covariates. We then use O2SLS as an oracle to design OFUL-IV, a linear bandit algorithm that can tackle endogeneity and achieves $O(d \sqrt{T} \log T)$ regret. For datasets with endogeneity, we experimentally demonstrate that O2SLS and OFUL-IV incur lower regret than the state-of-the-art algorithms for the online linear regression and linear bandit settings, respectively.
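To make the setting concrete, the following is a minimal sketch of a two-stage least squares estimator updated one sample at a time, compared against ordinary least squares on synthetic endogenous data. It is an illustration under stated assumptions, not the O2SLS algorithm as specified in the paper: the data-generating process, the ridge term `lam`, and the function names `simulate`, `online_2sls`, and `ols` are all hypothetical choices made for this example.

```python
import numpy as np


def simulate(T, seed=0):
    """Synthetic endogenous data (assumed model, for illustration only).

    First stage:  x_t = Theta^T z_t + eps_t
    Second stage: y_t = beta^T x_t + eta_t, with eta_t correlated with eps_t.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -0.5])
    Theta = np.array([[1.0, 0.5], [-0.5, 1.0]])
    Z = rng.normal(size=(T, 2))            # instruments
    eps = rng.normal(size=(T, 2))          # first-stage noise
    # The second-stage noise shares a component with eps: this is the endogeneity.
    eta = 0.8 * eps.sum(axis=1) + 0.3 * rng.normal(size=T)
    X = Z @ Theta + eps                    # covariates, correlated with eta
    y = X @ beta + eta
    return Z, X, y, beta


def online_2sls(Z, X, y, lam=1.0):
    """Regularised 2SLS maintained from running sums, one sample per round.

    Returns the final estimate and the predictions made before each update.
    """
    T, p = Z.shape
    d = X.shape[1]
    S_zz = np.zeros((p, p))                # sum of z z^T
    S_zx = np.zeros((p, d))                # sum of z x^T
    S_zy = np.zeros(p)                     # sum of z y
    beta_hat = np.zeros(d)
    preds = np.empty(T)
    for t in range(T):
        preds[t] = X[t] @ beta_hat         # predict with the previous estimate
        S_zz += np.outer(Z[t], Z[t])
        S_zx += np.outer(Z[t], X[t])
        S_zy += y[t] * Z[t]
        # Stage 1: ridge regression of covariates on instruments.
        theta_hat = np.linalg.solve(S_zz + lam * np.eye(p), S_zx)
        # Stage 2: ridge regression of y on the fitted covariates Z theta_hat,
        # expressed through the same running sums.
        A = theta_hat.T @ S_zz @ theta_hat + lam * np.eye(d)
        beta_hat = np.linalg.solve(A, theta_hat.T @ S_zy)
    return beta_hat, preds


def ols(X, y):
    """Ordinary least squares, which ignores the instruments."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


Z, X, y, beta = simulate(T=4000)
beta_iv, _ = online_2sls(Z, X, y)
err_iv = np.linalg.norm(beta_iv - beta)
err_ols = np.linalg.norm(ols(X, y) - beta)
print(f"2SLS error: {err_iv:.3f}, OLS error: {err_ols:.3f}")
```

In this simulated model the OLS estimate stays biased however many samples arrive, because the covariates are correlated with the noise, while the two-stage estimate uses only the variation in the covariates that is explained by the instruments.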