In this study, we apply explainable machine learning (ML) techniques to predict mussel toxicity caused by harmful algal blooms in the Gulf of Trieste (Adriatic Sea). Using a newly compiled 28-year dataset of toxic phytoplankton records from mussel farming areas and toxin concentrations in mussels (Mytilus galloprovincialis), we train and evaluate ML models for predicting diarrhetic shellfish poisoning (DSP) events. Judged by F1 score, the random forest model gave the best prediction of positive toxicity results. Explainability methods such as permutation importance and SHAP identified key species (Dinophysis fortii and D. caudata) and environmental factors (salinity, river discharge and precipitation) as the best predictors of DSP outbreaks. These findings can help improve early warning systems and support sustainable aquaculture practices.
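To make the evaluation and explainability steps concrete, the following is a minimal sketch of how an F1 score and a permutation importance can be computed. It is not the study's pipeline: the data are synthetic, the feature names are hypothetical, and a simple threshold rule (`toy_model`) stands in for the random forest so that the example needs only the Python standard library.

```python
import random


def f1_score(y_true, y_pred):
    """F1 score for the positive class (1 = toxic sample)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in F1 when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = f1_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle feature j across samples, leaving other features intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = f1_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic, hypothetical data: [dinophysis_abundance, unrelated_noise].
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1000), data_rng.uniform(0, 1)] for _ in range(200)]
y = [1 if row[0] > 600 else 0 for row in X]  # assumed toy labelling rule


def toy_model(row):
    """Stand-in for a trained classifier; uses only the first feature."""
    return 1 if row[0] > 600 else 0


baseline, importances = permutation_importance(toy_model, X, y)
print("baseline F1:", baseline)
print("importance per feature:", importances)
```

In this toy setting the first feature receives a positive importance because shuffling it degrades the F1 score, while the unused noise feature receives zero. With a real model the same idea applies, although the scores should be computed on held-out data rather than on the training set.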