A common pipeline in functional data analysis is to first convert the discretely observed data to smooth functions, and then represent the functions by a finite-dimensional vector of coefficients summarizing the information. Existing methods for data smoothing and dimension reduction mainly focus on learning linear mappings from the data space to the representation space; however, linear representations alone may not be sufficient. In this study, we propose to learn nonlinear representations of functional data using neural network autoencoders designed to process data in the form in which it is usually collected, without the need for preprocessing. We design the encoder to employ a projection layer that computes the weighted inner product of the functional data and functional weights over the observed timestamps, and the decoder to apply a recovery layer that maps the finite-dimensional vector extracted from the functional data back to the functional space using a set of predetermined basis functions. The developed architecture can accommodate both regularly and irregularly spaced data. Our experiments demonstrate that the proposed method outperforms functional principal component analysis in terms of prediction and classification, and that it offers better smoothing ability and computational efficiency than conventional autoencoders under both linear and nonlinear settings.
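As an illustration of the projection and recovery layers described above, the following is a minimal NumPy sketch rather than the implementation used in this study. It assumes a Fourier basis on [0, 1], trapezoidal quadrature weights for the inner product, and a tanh activation; none of these choices is stated in the abstract, and all function names (`fourier_basis`, `trapezoid_weights`, `projection_layer`, `recovery_layer`) are hypothetical.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate n_basis Fourier basis functions on [0, 1] at times t.
    Returns an array of shape (len(t), n_basis). Assumed basis choice."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2) * np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2) * np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)

def trapezoid_weights(t):
    """Trapezoidal-rule quadrature weights; valid for irregular grids."""
    t = np.asarray(t, dtype=float)
    dt = np.diff(t)
    w = np.zeros_like(t)
    w[:-1] += dt / 2
    w[1:] += dt / 2
    return w

def projection_layer(x, t, weight_coefs):
    """Approximate the inner product of x with each functional weight w_k,
    i.e. the integral of x(t) * w_k(t), using only the observed timestamps.
    weight_coefs has shape (n_basis, n_hidden): each functional weight is
    a linear combination of the basis functions (an assumption)."""
    B = fourier_basis(t, weight_coefs.shape[0])
    W = B @ weight_coefs                      # w_k evaluated at t
    return (trapezoid_weights(t) * x) @ W     # shape (n_hidden,)

def recovery_layer(z, t, out_coefs):
    """Map a finite-dimensional vector z back to function values at t.
    out_coefs has shape (n_hidden, n_basis)."""
    B = fourier_basis(t, out_coefs.shape[1])
    return B @ (z @ out_coefs)

# One curve observed on an irregular grid (synthetic, for illustration only).
rng = np.random.default_rng(0)
t = np.sort(np.concatenate([[0.0, 1.0], rng.uniform(0, 1, 198)]))
x = np.sin(2 * np.pi * t)

enc = rng.normal(size=(5, 2))    # untrained encoder coefficients
dec = rng.normal(size=(2, 5))    # untrained decoder coefficients
z = np.tanh(projection_layer(x, t, enc))   # nonlinear representation
x_hat = recovery_layer(z, t, dec)          # reconstruction at the same t
```

Because the quadrature weights are computed from each curve's own timestamps, the same layer can be applied to regularly and irregularly spaced observations. In a trained model the coefficient matrices would be learned by minimizing a reconstruction loss; here they are random placeholders.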