Harmful algal blooms (HABs) are episodes of high concentrations of algae that are potentially toxic to human consumers. Mollusc farming is affected by HABs because molluscs, as filter feeders, can accumulate high concentrations of marine biotoxins in their tissues. To protect consumers, harvesting is prohibited when toxicity is detected. At present, the closure of production areas relies on expert knowledge; a predictive model would help when conditions are complex and sampling is not possible. Although the toxin concentration in mollusc meat is the measure most commonly used by experts to control shellfish production areas, automatic prediction models rarely use it as a target, largely because the established sampling programs yield irregular data. As an alternative, the activity status of production areas has been proposed as a target variable, defined by whether the toxicity of mollusc meat is below or above the legal limit. This option most closely resembles how shellfish production areas are actually controlled. We therefore compare hybrid machine learning models, such as Neural-Network-Adding Bootstrap (BAGNET) and Discriminative Nearest Neighbor Classification (SVM-KNN), for estimating the state of production areas. The study covers several estuaries whose algal bloom episodes differ in complexity, in order to demonstrate the generalization capacity of the models in bloom detection. With an average recall of 93.41%, and never dropping below 90% in any estuary, BAGNET outperforms the other models in both results and robustness.
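As an illustration of the target variable and the reported metric, the sketch below derives a binary area status from toxin measurements and computes recall for a set of predictions. It is a minimal example under stated assumptions, not the implementation used in the study: the function names, the sample values, and the limit of 160.0 are hypothetical placeholders, a measurement strictly above the limit is assumed to mean "closed", and "closed" is assumed to be the positive class.

```python
from typing import Sequence


def area_status(toxin_levels: Sequence[float], legal_limit: float) -> list[int]:
    """Map toxin measurements to a binary status.

    Assumption: 1 = closed (toxicity strictly above the legal limit),
    0 = open (toxicity at or below the limit).
    """
    return [1 if level > legal_limit else 0 for level in toxin_levels]


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Recall = TP / (TP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("recall is undefined: no positive samples in y_true")
    return tp / (tp + fn)


# Hypothetical measurements and limit, for illustration only.
measured = [20.0, 180.0, 150.0, 300.0, 90.0]
LEGAL_LIMIT = 160.0

y_true = area_status(measured, LEGAL_LIMIT)  # two of five samples are closed
y_pred = [0, 1, 0, 0, 0]                     # hypothetical model output

print(recall(y_true, y_pred))
```

Recall is a natural choice for this setting because a missed closure (a false negative) means potentially toxic shellfish could be harvested, whereas a false alarm only delays harvesting.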