Machine learning classification problems are widespread in bioinformatics, but the technical knowledge required to perform model training, optimization, and inference can prevent researchers from utilizing this technology. This article presents an automated tool for machine learning classification problems that simplifies the process of training models and producing results while providing informative visualizations and insights into the data. The tool supports both binary and multiclass classification problems and provides access to a variety of models and methods. Synthetic data can be generated within the interface to fill missing values, balance class labels, or generate entirely new datasets. The tool also supports feature evaluation and generates explainability scores that indicate which features most influence the output. We present CLASSify, an open-source tool that simplifies the user experience of solving classification problems without requiring knowledge of machine learning.