In recent years, Reinforcement Learning (RL) has emerged as a powerful tool for a wide range of problems, from general decision-making to genomics. Over the past two decades, the exponential growth of raw genomic data has outpaced what manual analysis can handle, driving interest in automated data analysis and processing. RL algorithms learn from experience with minimal human supervision, which makes them well suited to analyzing and interpreting genomic data. A key benefit of RL is that it reduces the cost of collecting the labeled training data that supervised learning requires. While numerous studies have examined the applications of Machine Learning (ML) in genomics, this survey focuses exclusively on the use of RL in several genomics research fields, including gene regulatory networks (GRNs), genome assembly, and sequence alignment. We present a comprehensive technical overview of existing studies that apply RL in genomics, highlighting the strengths and limitations of these approaches. We then discuss research directions that merit future exploration: the development of more sophisticated reward functions, since RL depends heavily on the accuracy of the reward function; the integration of RL with other machine learning techniques; and the application of RL to new and emerging areas of genomics research. Finally, we present our findings and conclude by summarizing the current state of the field and the outlook for RL in genomics.