Smoothing splines are twice differentiable by construction, so they cannot capture potential discontinuities in the underlying signal. In this work, we consider a special case of the weak rod model of Blake and Zisserman (1987) that allows for discontinuities and penalizes their number by a linear term. The corresponding estimates are cubic smoothing splines with discontinuities (CSSD), which serve as representations of piecewise smooth signals and facilitate exploratory data analysis. However, computing the estimates requires solving a non-convex optimization problem. So far, efficient and exact solvers exist only for a discrete approximation based on equidistantly sampled data. Here, we propose an efficient solver for the continuous minimization problem with non-equidistantly sampled data. Its worst-case complexity is quadratic in the number of data points, and if the number of detected discontinuities scales linearly with the signal length, we observe linear growth in runtime. This efficiency makes it possible to select the hyperparameters automatically by cross-validation within a reasonable time frame on standard hardware. We provide a reference implementation and supplementary material. We demonstrate the applicability of the approach to the aforementioned tasks using both simulated and real data.
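To make the structure of the problem concrete, the following is a minimal sketch of a generic dynamic program that places discontinuities by trading segment fit against a penalty gamma per jump. It is not the solver proposed in this work. The function names (`line_fit_cost`, `segment_with_jumps`) are hypothetical, and the per-segment cost is a deliberately simplified stand-in: the residual of a least-squares line instead of the cubic smoothing spline energy. The sketch only illustrates why on the order of N^2 candidate segments arise for N data points and how non-equidistant sample locations enter through the segment cost.

```python
from typing import Callable, List, Sequence, Tuple


def line_fit_cost(x: Sequence[float], y: Sequence[float], lo: int, hi: int) -> float:
    """Residual sum of squares of a least-squares line on points lo..hi-1.

    ASSUMPTION: stand-in for the smoothing spline energy of one segment.
    """
    n = hi - lo
    if n <= 2:
        return 0.0  # a line passes exactly through one or two points
    xs, ys = x[lo:hi], y[lo:hi]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0.0:
        return syy
    return max(syy - sxy * sxy / sxx, 0.0)


def segment_with_jumps(
    x: Sequence[float],
    y: Sequence[float],
    gamma: float,
    cost: Callable[[Sequence[float], Sequence[float], int, int], float] = line_fit_cost,
) -> Tuple[float, List[int]]:
    """Minimise (sum of segment costs) + gamma * (number of jumps).

    Returns the optimal value and the indices at which a new segment starts.
    """
    n = len(y)
    best = [0.0] * (n + 1)  # best[r]: optimal value for the first r points
    prev = [0] * (n + 1)    # prev[r]: start index of the last segment
    best[0] = -gamma        # the first segment does not pay for a jump
    for r in range(1, n + 1):
        best_val, best_l = float("inf"), 0
        for l in range(r):  # O(N^2) candidate segments (l, r) in total
            val = best[l] + gamma + cost(x, y, l, r)
            if val < best_val:
                best_val, best_l = val, l
        best[r], prev[r] = best_val, best_l
    # Backtrack the segment starts; index 0 is not a jump.
    jumps: List[int] = []
    r = n
    while r > 0:
        l = prev[r]
        if l > 0:
            jumps.append(l)
        r = l
    jumps.reverse()
    return best[n], jumps


if __name__ == "__main__":
    # Non-equidistant sample locations with one step in the signal.
    x = [0.0, 0.5, 1.5, 2.0, 4.0, 5.0, 5.2, 6.0, 8.0, 9.0]
    y = [0.0] * 5 + [5.0] * 5
    print(segment_with_jumps(x, y, gamma=1.0))
```

This sketch recomputes every segment cost from scratch, so its runtime is worse than the quadratic bound quoted above; matching that bound would require updating segment costs incrementally as a segment grows. Larger values of gamma yield fewer discontinuities, which is why gamma is a natural candidate for selection by cross-validation.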