This paper introduces Feature Selection for Functional Classification (FSFC), a novel methodology that jointly performs feature selection and classification of functional data when the response is categorical and the features are multivariate and longitudinal. FSFC solves a newly defined optimization problem that combines a logistic loss with functional features to identify the variables most relevant for classification. To solve the minimization problem, we employ functional principal components and develop a new adaptive version of the Dual Augmented Lagrangian algorithm. The computational efficiency of FSFC makes it applicable to high-dimensional scenarios in which the number of features may considerably exceed the number of statistical units. Simulation experiments demonstrate that FSFC outperforms other machine learning and deep learning methods in both computational time and classification accuracy. Furthermore, the feature selection capability of FSFC can be leveraged to significantly reduce the dimensionality of the problem and improve the performance of other classification algorithms. The efficacy of FSFC is also demonstrated through a real data application that analyzes the relationships between four chronic diseases and other health and demographic factors.
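To make the general idea concrete, the following is a minimal sketch and not the FSFC algorithm itself. It assumes that each functional feature is observed on a common grid, that each feature is summarized by its first `k` functional principal component scores (computed here by a plain SVD), and that selection is induced by a group-lasso penalty on the score coefficients; the abstract does not state the penalty, so that choice is an assumption. The solver is an ordinary proximal gradient method, swapped in for the adaptive Dual Augmented Lagrangian algorithm of the paper, and all function names are hypothetical.

```python
import numpy as np

def fpc_scores(X, k):
    """First k principal component scores of curves X with shape (n, T)."""
    Xc = X - X.mean(axis=0)                      # center each grid point
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]                      # (n, k) scores

def group_soft_threshold(b, t):
    """Proximal operator of t * ||b||_2 (shrinks the whole group)."""
    norm = np.linalg.norm(b)
    if norm <= t:
        return np.zeros_like(b)
    return (1.0 - t / norm) * b

def fit_group_lasso_logistic(Z, y, lam, n_iter=300):
    """Proximal gradient for mean logistic loss + lam * sum_j ||B_j||_2.

    Z: (n, p, k) scores, y: (n,) labels in {0, 1}.
    Returns the intercept and a (p, k) coefficient matrix.
    """
    n, p, k = Z.shape
    A = Z.reshape(n, p * k)
    A1 = np.hstack([np.ones((n, 1)), A])
    # 1 / Lipschitz constant of the gradient of the mean logistic loss
    step = 4.0 * n / np.linalg.norm(A1, 2) ** 2
    b0, B = 0.0, np.zeros((p, k))
    for _ in range(n_iter):
        eta = np.clip(A @ B.ravel() + b0, -30.0, 30.0)
        r = (1.0 / (1.0 + np.exp(-eta)) - y) / n   # scaled residuals
        grad = (A.T @ r).reshape(p, k)
        b0 -= step * r.sum()                       # intercept is unpenalized
        for j in range(p):
            B[j] = group_soft_threshold(B[j] - step * grad[j], step * lam)
    return b0, B

# Simulated data (illustrative only): 10 functional features, 2 informative.
rng = np.random.default_rng(0)
n, p, T, k = 60, 10, 25, 3
t = np.linspace(0.0, 1.0, T)
a = rng.normal(size=(n, p, 1))
b = rng.normal(size=(n, p, 1))
X = a * np.sin(2 * np.pi * t) + b * np.cos(2 * np.pi * t)
X = X + 0.1 * rng.normal(size=(n, p, T))
y = (a[:, 0, 0] - a[:, 1, 0] > 0).astype(float)

Z = np.stack([fpc_scores(X[:, j, :], k) for j in range(p)], axis=1)
b0, B = fit_group_lasso_logistic(Z, y, lam=0.05)
selected = np.flatnonzero(np.linalg.norm(B, axis=1) > 0)
print("selected features:", selected)
```

A feature counts as selected when its whole coefficient group is nonzero. Increasing `lam` removes more groups, and the surviving features could then be passed to another classifier, in the spirit of the dimensionality reduction described above.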