In large samples, a small number of atypical individuals can distort basic statistical indicators such as the mean. Detecting these individuals automatically is difficult, so an alternative strategy is to use robust approaches. This paper focuses on estimating the geometric median of a random variable, which is a robust indicator of central tendency. To deal with large samples of data arriving sequentially, we introduce online stochastic Newton algorithms for estimating the geometric median and give their rates of convergence. Since the estimates of the median and of the Hessian matrix can be updated recursively, we also determine confidence intervals for the median in any designated direction and perform online statistical tests.
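To make the idea of a recursive Newton-type update concrete, the following is a minimal sketch, not the algorithm analysed in the paper. It assumes the standard characterisation of the geometric median as the minimiser of E[||X - h||], whose gradient at h is -E[(X - h)/||X - h||] and whose Hessian is E[(I - uu^T)/||X - h||] with u = (X - h)/||X - h||. The function name `online_newton_geometric_median`, the 1/(k+1) step size, the identity initialisation of the Hessian sum, and the use of a direct linear solve at each step are all illustrative assumptions.

```python
import numpy as np


def online_newton_geometric_median(samples):
    """Illustrative one-pass Newton-type estimate of the geometric median.

    `samples` is an (n, d) array with n >= 1, processed one row at a time.
    This is a simplified sketch under the assumptions stated above,
    not the estimator studied in the paper.
    """
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    identity = np.eye(d)

    m = samples[0].copy()        # current median estimate
    hessian_sum = identity.copy()  # identity start keeps the average invertible

    for k in range(1, n):
        diff = samples[k] - m
        r = np.linalg.norm(diff)
        if r < 1e-12:
            # The gradient is undefined when the sample coincides with m.
            continue
        u = diff / r

        # Accumulate the Hessian of h -> ||x - h|| evaluated at the current m.
        hessian_sum += (identity - np.outer(u, u)) / r
        hessian_avg = hessian_sum / (k + 1)

        # Newton-type step: running-average Hessian applied to the
        # stochastic descent direction u, with a 1/(k+1) step size.
        m = m + np.linalg.solve(hessian_avg, u) / (k + 1)

    return m
```

Each observation is used once and then discarded, so memory is O(d^2) regardless of the sample size. The direct solve costs O(d^3) per step; a recursive update of the inverse would be cheaper in high dimension, but is left out here to keep the sketch short.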