A growing number of problems involve functional data, that is, data with one infinite continuous dimension. In this paper, we introduce the funLOCI algorithm, which identifies functional local clusters, or functional loci: subsets of functions that exhibit similar behaviour across the same continuous subset of the domain. The definition of functional local clusters draws on ideas from multivariate clustering, functional clustering, and biclustering, and is based on an additive model that takes the shape of the curves into account. funLOCI is a three-step algorithm based on divisive hierarchical clustering. Dendrograms make it possible to visualize and guide both the search procedure and the selection of cutting thresholds. Because the number of local clusters found can be large, an extra step is implemented to reduce the number of results to a minimum.
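To make the idea of an additive model on a sub-interval concrete, the following is a minimal sketch rather than the funLOCI implementation. It assumes the curves are sampled on a shared grid and scores a candidate local cluster with the mean squared residue familiar from the biclustering literature; the abstract does not state that funLOCI uses this exact score, and the function name `mean_squared_residue` and the toy curves are hypothetical.

```python
import numpy as np


def mean_squared_residue(curves: np.ndarray) -> float:
    """Score how well an additive model fits a candidate local cluster.

    `curves` has shape (n_curves, n_points): each row is one function
    evaluated on the same grid over a single sub-interval of the domain.
    Assumption: the additive model is value = curve effect + point effect,
    so curves that differ only by a vertical shift score (near) zero.
    """
    row_means = curves.mean(axis=1, keepdims=True)   # per-curve level
    col_means = curves.mean(axis=0, keepdims=True)   # common shape over the grid
    overall = curves.mean()                          # grand mean
    residuals = curves - row_means - col_means + overall
    return float(np.mean(residuals ** 2))


# Toy data (hypothetical): 50 grid points on the sub-interval [0, 1].
t = np.linspace(0.0, 1.0, 50)
shape = np.sin(2 * np.pi * t)

# Same shape, different vertical offsets: a good local cluster.
shifted = np.stack([shape + c for c in (-1.0, 0.0, 2.5)])

# One curve has the opposite shape: a poor local cluster.
mixed = np.stack([shape, -shape, shape + 1.0])

score_shifted = mean_squared_residue(shifted)
score_mixed = mean_squared_residue(mixed)
```

Under these assumptions, `score_shifted` is zero up to floating-point error, while `score_mixed` is clearly positive, so a low score flags a group of curves that share the same shape on the chosen sub-interval regardless of their level.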