Metric-based summary statistics such as the mean and covariance have been introduced in the space of neural spike trains. They properly describe the template and variability in spike train data, but are often sensitive to outliers and expensive to compute. Recent studies have also examined outlier detection and classification methods for point processes. These tools give reasonable and efficient results, but their accuracy remains low in certain cases. In this study, we propose to introduce a well-established notion of statistical depth into the spike train space. This framework naturally defines the median of a set of spike trains, which provides a robust description of the 'center' or 'template' of the observations. It also provides a principled method to identify 'outliers' in the data and to classify data from different categories. We systematically compare the median with the state-of-the-art 'mean spike train' in terms of robustness and efficiency, and we compare the performance of our novel outlier detection and classification tools with previous methods. The results show that the median describes the 'template' better than the mean. Moreover, the proposed outlier detection and classification methods are more accurate than previous methods. These advantages are illustrated with simulations and real data.