Hybrid Smoothing for Anomaly Detection in Time Series

Many industrial and engineering processes monitored as time series have smooth trends that indicate normal behavior and occasional anomalous patterns that can indicate a problem. This kind of behavior can be modeled by a smooth trend, such as a spline or Gaussian process, plus a disruption with a sparse representation. Our approach is to expand the process signal into two sets of basis functions: one set uses $L_2$ penalties on the coefficients and the other set uses $L_1$ penalties to control sparsity. From a frequentist perspective, this results in a hybrid smoother that combines cubic smoothing splines and the LASSO. As a Bayesian hierarchical model (BHM), it is equivalent to a Gaussian process prior for the trend and Laplace priors for the anomaly coefficients. For the hybrid smoother we propose two new ways of determining the penalty parameters that use effective degrees of freedom, and we contrast this with the BHM, which uses loosely informative inverse gamma priors. Several reformulations are used to make sampling the BHM posterior more efficient, including some novel features in orthogonalizing and regularizing the model basis functions. This methodology is motivated by a substantive application: monitoring the water treatment process for the Denver metropolitan area. We also test the methods with a Monte Carlo study designed around the kind of anomalies expected in this application. The hybrid smoother and the full BHM give comparable results, with small false positive and false negative rates. Besides being successful in the water treatment application, this work can be easily extended to other Gaussian process models and other features that represent process disruptions.
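
To make the two-penalty idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest possible bases: the trend $f$ is represented at each time point with a second-difference roughness penalty (a discrete stand-in for the cubic smoothing spline), and the anomaly basis is the identity, so each coefficient $d_i$ is a possible spike at one time point. Under these assumptions the objective is $\|y - f - d\|^2 + \lambda_2 \|D_2 f\|^2 + \lambda_1 \|d\|_1$, which can be minimized by alternating a linear smoothing step and a soft-thresholding step. The function names, the fixed penalty values, and the synthetic data are all hypothetical; the effective-degrees-of-freedom selection of the penalties described above is not reproduced here.

```python
import numpy as np


def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of t * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def hybrid_smooth(y, lam2=50.0, lam1=1.0, n_iter=200, tol=1e-10):
    """Hypothetical helper: alternate an L2 (smoothing) and an L1 (spike) step.

    Minimizes ||y - f - d||^2 + lam2 * ||D2 f||^2 + lam1 * ||d||_1,
    where D2 is the second-difference operator. This is an assumed,
    simplified stand-in for a spline basis plus an anomaly basis.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # (n - 2) x n matrix whose rows are the stencil [1, -2, 1].
    D2 = np.diff(np.eye(n), n=2, axis=0)
    A = np.eye(n) + lam2 * (D2.T @ D2)
    f = np.zeros(n)
    d = np.zeros(n)
    for _ in range(n_iter):
        # L2 step: penalized least squares fit of the trend to y - d.
        f_new = np.linalg.solve(A, y - d)
        # L1 step: with an identity anomaly basis the LASSO update
        # reduces to soft-thresholding the residual at lam1 / 2.
        d_new = soft_threshold(y - f_new, lam1 / 2.0)
        change = max(np.max(np.abs(f_new - f)), np.max(np.abs(d_new - d)))
        f, d = f_new, d_new
        if change < tol:
            break
    return f, d


# Synthetic illustration (made-up data): smooth signal, small noise, one spike.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.size)
y[100] += 2.0

trend, spikes = hybrid_smooth(y)
print("flagged indices:", np.flatnonzero(spikes))
```

Nonzero entries of `spikes` mark candidate anomalies, while `trend` is the smooth component. Because the $L_1$ penalty shrinks as well as selects, the estimated spike size is biased toward zero, so in this sketch the nonzero pattern is more informative than the magnitudes.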
