This study presents AutoOpt-11k, a unique image dataset of over 11,000 handwritten and printed mathematical optimization models. The models cover single-objective, multi-objective, multi-level, and stochastic optimization problems and exhibit several kinds of complexity, including non-linearity, non-convexity, non-differentiability, discontinuity, and high dimensionality. Every image is labeled with its LaTeX representation, and a subset of images is additionally labeled with a modeling language representation. The dataset was created by 25 experts following ethical data creation guidelines and was verified in two phases to avoid errors. Further, we develop the AutoOpt framework, a machine learning based automated approach for solving optimization problems, in which the user only needs to provide an image of the formulation and AutoOpt solves it efficiently without any further human intervention. The AutoOpt framework consists of three modules: (i) M1 (Image_to_Text), in which a deep learning model performs the Mathematical Expression Recognition (MER) task to generate the LaTeX code corresponding to the optimization formulation in the image; (ii) M2 (Text_to_Text), in which a small-scale fine-tuned LLM generates the PYOMO script (optimization modeling language) from the LaTeX code; and (iii) M3 (Optimization), in which a Bilevel Optimization based Decomposition (BOBD) method solves the optimization formulation described in the PYOMO script. We use the AutoOpt-11k dataset to train and test the deep learning models employed in AutoOpt. The deep learning model for the MER task (M1) outperforms ChatGPT, Gemini, and Nougat on the BLEU score metric. The BOBD method (M3), which is a hybrid approach, yields better results on complex test problems than common approaches such as the interior-point algorithm and the genetic algorithm.
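The abstract compares M1 against other models on the BLEU score. As an illustration only, the following is a minimal sketch of an unsmoothed, sentence-level BLEU computed over LaTeX tokens. The tokenizer (`tokenize_latex`), the choice of up to 4-grams, and the example formulas are assumptions made for this sketch; the abstract does not specify how the reported scores were tokenized, smoothed, or aggregated.

```python
import math
import re
from collections import Counter


def tokenize_latex(source):
    """Split LaTeX into tokens: commands such as \\min, else single characters.

    This tokenization is an assumption for illustration, not the paper's.
    """
    return re.findall(r"\\[A-Za-z]+|\S", source)


def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference, candidate, max_n=4):
    """Unsmoothed sentence-level BLEU of a candidate against one reference."""
    ref = tokenize_latex(reference)
    cand = tokenize_latex(candidate)
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, one empty precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    truth = r"\min_{x} x^2 + y^2"
    predicted = r"\min_{x} x^2 + y^3"  # hypothetical M1 output with one wrong token
    print(bleu(truth, truth))
    print(bleu(truth, predicted))
```

An exact match scores 1.0, and a prediction with a single wrong token scores strictly between 0 and 1. In practice an established implementation with smoothing and corpus-level aggregation would likely be preferable to this hand-written version.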