Deep learning approaches are becoming increasingly attractive for equation discovery. We examine the advantages and disadvantages of neural-guided equation discovery through an overview of recent papers and through experiments with our modular equation discovery system MGMT ($\textbf{M}$ulti-Task $\textbf{G}$rammar-Guided $\textbf{M}$onte-Carlo $\textbf{T}$ree Search for Equation Discovery). The system uses neural-guided Monte-Carlo Tree Search (MCTS) and supports both supervised and reinforcement learning, with a search space defined by a context-free grammar. We summarize seven desirable properties of equation discovery systems and emphasize the importance of embedding tabular datasets for such learning approaches. Using the modular structure of MGMT, we compare seven architectures (among them RNNs, CNNs, and Transformers) for embedding tabular datasets, evaluated on the auxiliary task of contrastive learning for tabular datasets in an equation discovery setting. For almost all combinations of modules, supervised learning outperforms reinforcement learning. Moreover, our experiments indicate an advantage of using grammar rules rather than tokens as the action space. Two adaptations of MCTS, risk-seeking MCTS and AmEx-MCTS, can improve equation discovery with MCTS-based search.
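To make the idea of grammar rules as an action space concrete, the following is a minimal sketch and not the MGMT implementation. The toy grammar, the function names (`leftmost_nonterminal`, `apply_rule`, `random_rollout`), and the uniform random choice of rules are all assumptions made for illustration; in a neural-guided system the random choice would be replaced by a learned policy inside MCTS. Each action expands the leftmost nonterminal with one production rule, so every finished rollout is a syntactically valid expression.

```python
import random

# Hypothetical toy grammar (an assumption, not the grammar used by MGMT):
# a single nonterminal "E" with five production rules.
GRAMMAR = {
    "E": [
        ["E", "+", "E"],
        ["E", "*", "E"],
        ["sin", "(", "E", ")"],
        ["x"],
        ["c"],
    ],
}


def leftmost_nonterminal(symbols):
    """Return the index of the first nonterminal, or None if none is left."""
    for i, symbol in enumerate(symbols):
        if symbol in GRAMMAR:
            return i
    return None


def apply_rule(symbols, rule_index):
    """One action: expand the leftmost nonterminal with the chosen rule."""
    i = leftmost_nonterminal(symbols)
    if i is None:
        raise ValueError("expression is already complete")
    rhs = GRAMMAR[symbols[i]][rule_index]
    return symbols[:i] + rhs + symbols[i + 1:]


def random_rollout(rng, max_steps=20):
    """Sample rules uniformly at random until no nonterminal remains.

    Uniform sampling is a stand-in for a learned policy. After max_steps,
    only rules without nonterminals are allowed, so the rollout terminates.
    """
    symbols = ["E"]
    steps = 0
    while (i := leftmost_nonterminal(symbols)) is not None:
        rules = GRAMMAR[symbols[i]]
        if steps >= max_steps:
            choices = [k for k, rhs in enumerate(rules)
                       if not any(s in GRAMMAR for s in rhs)]
        else:
            choices = list(range(len(rules)))
        symbols = apply_rule(symbols, rng.choice(choices))
        steps += 1
    return " ".join(symbols)


if __name__ == "__main__":
    print(random_rollout(random.Random(0)))
```

With a token-level action space, by contrast, a policy could emit sequences such as `+ * )` that do not parse, so part of the search budget would be spent on invalid candidates.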