Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics

Conventional methods for student modeling, which involve predicting grades based on measured activities, struggle to provide accurate results for minority/underrepresented student groups due to data availability biases. In this paper, we propose a Multi-Layer Personalized Federated Learning (MLPFL) methodology that optimizes inference accuracy over different layers of student grouping criteria, such as by course and by demographic subgroups within each course. In our approach, personalized models for individual student subgroups are derived from a global model, which is trained in a distributed fashion via meta-gradient updates that account for subgroup heterogeneity while preserving modeling commonalities that exist across the full dataset. We evaluate the proposed methodology through case studies of two popular downstream student modeling tasks, knowledge tracing and outcome prediction, which leverage multiple modalities of student behavior (e.g., visits to lecture videos and participation on forums) in model training. Experiments on three real-world online course datasets show significant improvements achieved by our approach over existing student modeling benchmarks, as evidenced by an increased average prediction quality and decreased variance across different student subgroups. Visual analysis of the resulting students' knowledge state embeddings confirms that our personalization methodology extracts activity patterns clustered into different student subgroups, consistent with the performance enhancements we obtain over the baselines.
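
As a rough illustration of the idea of deriving subgroup models from a meta-trained global model, the sketch below uses a first-order meta-gradient update in the style of MAML-type personalized federated learning. It is not the implementation evaluated in the paper: the linear model, the synthetic subgroup data, the single layer of grouping, the unweighted averaging, and all function names and learning rates are assumptions made for illustration only.

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of the mean squared error of a linear model y ~ X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def personalize(w_global, X, y, lr=0.05, steps=1):
    """Derive a subgroup model by a few local gradient steps from the global model."""
    w = w_global.copy()
    for _ in range(steps):
        w = w - lr * grad_mse(w, X, y)
    return w

def meta_round(w_global, groups, inner_lr=0.05, outer_lr=0.05):
    """One round: every subgroup reports a first-order meta-gradient, i.e. its
    loss gradient evaluated at its own personalized model (an approximation
    that drops second-order terms)."""
    meta_grads = []
    for X, y in groups:
        w_local = personalize(w_global, X, y, lr=inner_lr)
        meta_grads.append(grad_mse(w_local, X, y))
    # Assumption: unweighted mean, so small subgroups count as much as large ones.
    return w_global - outer_lr * np.mean(meta_grads, axis=0)

def make_groups(seed=0):
    """Synthetic stand-in for student subgroups of very different sizes."""
    rng = np.random.default_rng(seed)
    shared = np.array([1.0, -2.0, 0.5])          # structure common to all subgroups
    groups = []
    for n, shift in [(200, 0.0), (40, 0.8), (15, -0.8)]:
        X = rng.normal(size=(n, 3))
        w_true = shared + shift * np.array([1.0, 0.0, -1.0])  # subgroup heterogeneity
        y = X @ w_true + 0.1 * rng.normal(size=n)
        groups.append((X, y))
    return groups

groups = make_groups()
w = np.zeros(3)
for _ in range(200):
    w = meta_round(w, groups)

# Loss of the shared global model vs. the personalized model, per subgroup.
global_losses = [mse(w, X, y) for X, y in groups]
personal_losses = [mse(personalize(w, X, y), X, y) for X, y in groups]
print("mean / variance (global):      ", np.mean(global_losses), np.var(global_losses))
print("mean / variance (personalized):", np.mean(personal_losses), np.var(personal_losses))
```

The two printed lines mirror the kind of comparison described above: an average quality figure and its variance across subgroups, here computed on training loss for a toy problem rather than on held-out student data.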

