Hybrid Machine Learning techniques in the management of harmful algal blooms impact

Harmful algal blooms (HABs) are episodes of high concentrations of algae that are potentially toxic for human consumption. Mollusc farming is affected by HABs because molluscs, as filter feeders, can accumulate high concentrations of marine biotoxins in their tissues. To protect consumers, harvesting is prohibited when toxicity is detected. At present, the closure of production areas is based on expert knowledge; a predictive model would help when conditions are complex and sampling is not possible. Although the concentration of toxin in mollusc meat is the measure most commonly used by experts to control shellfish production areas, it is rarely used as a target by automatic prediction models, largely because the established sampling programs produce irregular data. As an alternative, the activity status of a production area has been proposed as the target variable, defined by whether the toxicity of mollusc meat is below or above the legal limit. This option most closely mirrors how shellfish production areas are actually controlled. We therefore compare hybrid machine learning models, such as Neural-Network-Adding Bootstrap (BAGNET) and Discriminative Nearest Neighbor Classification (SVM-KNN), for estimating the state of production areas. The study was carried out in several estuaries whose algal bloom episodes differ in complexity, in order to demonstrate how well the models generalize in bloom detection. With an average recall of 93.41%, and without dropping below 90% in any of the estuaries, BAGNET outperforms the other models in terms of both results and robustness.
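To make the target variable and the reported metric concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a hypothetical legal limit (`LEGAL_LIMIT`, with a placeholder value), treats "closed" as the positive class, and counts a measurement exactly at the limit as "open"; none of these choices is stated in the abstract. The helper names `area_status`, `recall`, and `summarize` are illustrative only.

```python
from statistics import mean

# Hypothetical legal limit, for illustration only: the abstract does not
# state the toxin, the units, or the threshold value.
LEGAL_LIMIT = 160.0


def area_status(toxin_concentration, legal_limit=LEGAL_LIMIT):
    """Return 1 (area closed) if toxicity exceeds the limit, else 0 (open).

    Treating a value exactly at the limit as 'open' is an assumption.
    """
    return 1 if toxin_concentration > legal_limit else 0


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def summarize(per_estuary):
    """Compute per-estuary recall, plus the average and the worst case.

    per_estuary maps an estuary name to a (y_true, y_pred) pair of label lists.
    """
    recalls = {
        name: recall(y_true, y_pred)
        for name, (y_true, y_pred) in per_estuary.items()
    }
    values = list(recalls.values())
    return mean(values), min(values), recalls
```

Reporting the minimum alongside the average reflects the abstract's robustness claim: a model is only considered robust here if no single estuary falls far below the mean.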

