Mycotoxin contamination poses a significant risk to cereal crop quality, food safety, and agricultural productivity. Accurate prediction of mycotoxin levels can support early intervention strategies and reduce economic losses. This study investigates the use of neural networks and transfer learning models to predict mycotoxin contamination in Irish oat crops, framed as a multi-response prediction task. Our dataset comprises oat samples collected in Ireland, with a mix of environmental, agronomic, and geographical predictors. Five modelling approaches were evaluated: a baseline multilayer perceptron (MLP), an MLP with pre-training, and three transfer learning models: TabPFN, TabNet, and FT-Transformer. Model performance was evaluated using regression (RMSE, $R^2$) and classification (AUC, F1) metrics, with results reported per toxin and averaged across toxins. Additionally, permutation-based variable importance analysis was conducted to identify the most influential predictors in both the regression and classification tasks. The transfer learning approach TabPFN provided the best overall performance, followed by the baseline MLP. Our variable importance analysis revealed that weather history patterns in the 90-day pre-harvest period were the most important predictors, alongside seed moisture content.
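The abstract does not describe how the permutation-based variable importance was computed, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes importance is measured as the mean increase in RMSE (pooled over all responses) when one predictor column is shuffled; the function names, the toy model `toy_predict`, and the synthetic data are all hypothetical and stand in for a fitted model and the oat dataset.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error pooled over all samples and all responses."""
    sq = [(t - p) ** 2 for yt, yp in zip(y_true, y_pred) for t, p in zip(yt, yp)]
    return math.sqrt(sum(sq) / len(sq))


def permutation_importance(predict, X, Y, n_repeats=10, seed=0):
    """Mean increase in RMSE when each predictor column is shuffled.

    predict is a hypothetical callable mapping a list of rows to a list of
    response vectors (one value per toxin in a multi-response setting).
    """
    rng = random.Random(seed)
    baseline = rmse(Y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the responses.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(rmse(Y, predict(X_perm)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(X):
    """Toy stand-in for a fitted model: two responses, third predictor unused."""
    return [[2.0 * row[0], row[0] + 0.1 * row[1]] for row in X]


# Synthetic data (assumption): 200 samples, 3 predictors, 2 responses.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(200)]
Y = toy_predict(X)

scores = permutation_importance(toy_predict, X, Y)
```

In this toy setup the first predictor drives both responses, the second has a weak effect on one response, and the third is ignored by the model, so the scores should rank them in that order. For the classification task, the same procedure could be applied with a drop in AUC or F1 in place of the increase in RMSE.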