Density deconvolution is the problem of estimating the probability density function $f$ of a random signal from $n\geq1$ observations contaminated by independent additive random noise with known distribution. This is a classical problem in statistics, for which frequentist and Bayesian nonparametric approaches are available to estimate $f$ in static or batch settings. In this paper, we consider density deconvolution in a streaming or online setting, and develop a principled sequential approach to estimate $f$. By relying on a quasi-Bayesian sequential (learning) model for the data, often referred to as Newton's algorithm, we obtain a sequential deconvolution estimate $f_{n}$ of $f$ that is easy to evaluate, computationally efficient, and whose computational cost remains constant as data accumulate, which is desirable for streaming data. In particular, we establish local and uniform Gaussian central limit theorems for $f_{n}$, leading to asymptotic credible intervals and bands for $f$, respectively. We provide the sequential deconvolution estimate $f_{n}$ with large-sample asymptotic guarantees both under the quasi-Bayesian sequential model for the data, proving a merging with respect to the direct density estimation problem, and under a ``true'' frequentist model for the data, proving consistency. We validate our methods empirically on synthetic and real data, including comparisons with a kernel approach and a Bayesian nonparametric approach with a Dirichlet process mixture prior.
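To make the sequential update concrete, the following is a minimal sketch of a Newton-type recursion for deconvolution, not the paper's implementation. It assumes the estimate is stored on a fixed, equally spaced grid, that integrals are approximated by Riemann sums, and that the learning weights are $1/(n+1)$. The names `newton_deconvolution`, `noise_pdf`, and `weight` are hypothetical and chosen for illustration only.

```python
import numpy as np


def newton_deconvolution(y, grid, noise_pdf, f0=None,
                         weight=lambda n: 1.0 / (n + 1.0)):
    """Recursive (Newton-type) estimate of the signal density on a grid.

    Assumptions of this sketch: `grid` is equally spaced, `noise_pdf` is the
    known noise density (vectorized), and integrals are Riemann sums.
    """
    grid = np.asarray(grid, dtype=float)
    dx = grid[1] - grid[0]

    # Initial guess f_0: uniform on the grid unless the caller supplies one.
    f = np.ones_like(grid) if f0 is None else np.asarray(f0, dtype=float).copy()
    f = f / (f.sum() * dx)  # normalize so the Riemann sum equals 1

    for n, y_n in enumerate(y, start=1):
        # Likelihood of y_n given signal value theta: noise density at y_n - theta.
        lik = noise_pdf(y_n - grid)
        post = lik * f
        norm = post.sum() * dx
        if norm <= 0.0:
            continue  # observation carries no mass on the grid; skip it
        a = weight(n)
        # Convex combination of the previous estimate and the one-step "posterior".
        f = (1.0 - a) * f + a * post / norm

    return f
```

Each update touches only the grid values, so the cost per observation depends on the grid size and not on the number of observations seen so far. The credible intervals and bands mentioned above are not part of this sketch.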