In this paper, we develop a novel high-dimensional coefficient estimation procedure based on high-frequency data. Unlike standard high-dimensional regression procedures such as LASSO, our procedure additionally handles the heavy-tailedness of high-frequency observations and the time variation of the coefficient processes. Specifically, we employ the Huber loss and a truncation scheme to handle heavy-tailed observations, while $\ell_{1}$-regularization is adopted to overcome the curse of dimensionality under a sparse coefficient structure. To account for the time-varying coefficients, we estimate local high-dimensional coefficients, whose estimators are biased due to the $\ell_{1}$-regularization. Thus, when estimating integrated coefficients, we propose a debiasing scheme to benefit from the law of large numbers and employ a thresholding scheme to further accommodate the sparsity of the coefficients. We call the resulting estimator the Robust thrEsholding Debiased LASSO (RED-LASSO) estimator. We show that the RED-LASSO estimator can achieve a near-optimal convergence rate under only a finite $\gamma$th moment condition for any $\gamma>2$. In the empirical study, we apply the RED-LASSO procedure to high-dimensional integrated coefficient estimation using high-frequency trading data.
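As a rough illustration of three ingredients named in the abstract (truncation, a Huber loss with an $\ell_{1}$ penalty, and a final thresholding step), the following minimal Python sketch fits a static sparse regression with heavy-tailed noise. It is not the RED-LASSO estimator itself: the local (time-varying) estimation and the debiasing step are omitted, the proximal-gradient solver is our choice, and all function names, tuning values (`lam`, `tau`, `clip_level`, `h`) and the simulated data are assumptions made only for this example.

```python
import numpy as np


def huber_grad(r, tau):
    """Derivative of the Huber loss: identity on [-tau, tau], clipped outside."""
    return np.clip(r, -tau, tau)


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (the l1-regularization step)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def hard_threshold(beta, h):
    """Set entries with magnitude at most h to zero."""
    return np.where(np.abs(beta) > h, beta, 0.0)


def huber_lasso(X, y, lam, tau, n_iter=500):
    """Minimise (1/n) * sum_i huber_tau(y_i - x_i'b) + lam * ||b||_1
    by proximal gradient descent (an assumed solver, not the paper's)."""
    n, p = X.shape
    # The Huber derivative is 1-Lipschitz, so the gradient of the smooth part
    # is Lipschitz with constant at most the largest eigenvalue of X'X / n.
    step = n / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ huber_grad(y - X @ beta, tau) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


# Simulated data (hypothetical): sparse coefficients, heavy-tailed inputs and noise.
rng = np.random.default_rng(0)
n, p = 400, 50
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
X_raw = rng.standard_t(5, size=(n, p))
noise = rng.standard_t(3, size=n)  # finite gamma-th moment only for gamma < 3
y = X_raw @ beta_true + noise

clip_level = 4.0  # assumed truncation level for the covariates
X = np.clip(X_raw, -clip_level, clip_level)

beta_lasso = huber_lasso(X, y, lam=0.2, tau=1.5)
beta_hat = hard_threshold(beta_lasso, h=0.3)
support = set(np.flatnonzero(beta_hat).tolist())
```

The $\ell_{1}$ penalty shrinks the nonzero coefficients toward zero, which is the bias that the paper's debiasing scheme is designed to remove before local estimates are aggregated into integrated coefficients; the final hard threshold here only cleans up small spurious entries.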