Covariance matrix estimation is a fundamental task in high-dimensional data analysis, and interval-valued data are no exception. However, there is no unified definition of the covariance matrix for interval-valued data, let alone established estimation methods for high-dimensional settings. This paper presents a novel approach to estimating covariance matrices for high-dimensional interval-valued data while ensuring positive definiteness. We begin by assuming that the upper and lower bounds of interval-valued variables share the same dependency structure. Based on this assumption, we extend the classical soft-thresholding covariance matrix estimator to the interval-valued setting, yielding the Interval-valued Soft-Thresholding (IST) estimator. To guarantee that the estimate is positive definite, we then impose a positive definiteness constraint on the IST estimator. We derive an alternating direction method to solve the resulting constrained problem and establish its convergence. Under mild conditions, we develop a non-asymptotic statistical theory for the proposed estimator. Simulation studies and an application to high-frequency financial data from the CSI 300 Index demonstrate the effectiveness of the proposed estimator.
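To make the pipeline described in the abstract concrete, the following is a minimal sketch rather than the paper's actual algorithm. It assumes that the shared dependency structure can be estimated by averaging the sample covariances of the lower and upper bounds (the helper `pooled_interval_cov` is hypothetical), and it uses a standard alternating direction scheme for the classical problem of minimizing a Frobenius loss plus an off-diagonal L1 penalty subject to a minimum-eigenvalue constraint. The function names and the parameters `lam`, `eps`, and `mu` are illustrative choices, not taken from the paper.

```python
import numpy as np


def soft_threshold_offdiag(a, tau):
    """Soft-threshold the off-diagonal entries of a; keep the diagonal as is."""
    out = np.sign(a) * np.maximum(np.abs(a) - tau, 0.0)
    np.fill_diagonal(out, np.diag(a))
    return out


def project_eig_floor(a, eps):
    """Project a symmetric matrix onto {M : eigenvalues of M >= eps}."""
    a = (a + a.T) / 2.0
    w, v = np.linalg.eigh(a)
    return (v * np.maximum(w, eps)) @ v.T


def pooled_interval_cov(lower, upper):
    """ASSUMPTION: pool the bounds by averaging their sample covariances.

    The paper's own definition is not given in the abstract; this is only
    one simple way to use the 'same dependency structure' assumption.
    """
    return 0.5 * (np.cov(lower, rowvar=False) + np.cov(upper, rowvar=False))


def pd_soft_threshold(s, lam, eps=1e-4, mu=2.0, n_iter=500, tol=1e-7):
    """Alternating direction sketch for
        min_Sigma 0.5 * ||Sigma - s||_F^2 + lam * |Sigma|_{1,off}
        s.t.      Sigma >= eps * I  (in the positive semidefinite order).
    Returns (sigma, theta): the sparse iterate and the positive definite
    iterate, which agree at convergence.
    """
    sigma = soft_threshold_offdiag(s, lam)
    dual = np.zeros_like(s)
    for _ in range(n_iter):
        # Theta-step: eigenvalue projection enforces positive definiteness.
        theta = project_eig_floor(sigma + mu * dual, eps)
        # Sigma-step: closed-form soft-thresholding update.
        sigma_new = soft_threshold_offdiag(
            (mu * (s - dual) + theta) / (1.0 + mu), lam * mu / (1.0 + mu)
        )
        # Dual update on the constraint theta = sigma.
        dual = dual - (theta - sigma_new) / mu
        done = np.linalg.norm(sigma_new - sigma) <= tol * max(1.0, np.linalg.norm(sigma))
        sigma = sigma_new
        if done:
            break
    return sigma, theta


# Synthetic interval data: n = 50 observations of p = 20 interval variables.
rng = np.random.default_rng(0)
centers = rng.normal(size=(50, 20))
radii = rng.uniform(0.1, 0.5, size=(50, 20))
lower, upper = centers - radii, centers + radii

s_pooled = pooled_interval_cov(lower, upper)
sigma_hat, theta_hat = pd_soft_threshold(s_pooled, lam=0.2, eps=1e-4)
```

In this sketch `theta_hat` has all eigenvalues at or above `eps` by construction, while `sigma_hat` carries the exact zeros produced by thresholding; the two coincide only once the iteration has converged, so a practical implementation would monitor their difference as well.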