On Solving the Rubik's Cube with Domain-Independent Planners Using Standard Representations

Rubik's Cube (RC) is a well-known and computationally challenging puzzle that has motivated AI researchers to explore efficient alternative representations and problem-solving methods. Ideally, a planning problem would be expressed in a standard notation and solved optimally and efficiently by a general-purpose solver with general-purpose heuristics. The fastest RC solver today is DeepCubeA, which uses a custom representation; another approach uses the Scorpion planner with the State-Action-Space+ (SAS+) representation. In this paper, we present the first RC representation in the popular PDDL language, which makes the domain more accessible to PDDL planners, competitions, and knowledge engineering tools, and more human-readable. We then bridge the existing approaches and compare their performance. In one comparable experiment, DeepCubeA (trained with 12 RC actions) solves all problems of varying complexity, although only 78.5% of its plans are optimal. On the same problem set, Scorpion with the SAS+ representation and pattern database heuristics solves 61.50% of the problems optimally, while FastDownward with the PDDL representation and the FF heuristic solves 56.50% of the problems, and 79.64% of the plans it generates are optimal. Our study provides insights into the trade-offs between representational choice and plan optimality that can help researchers design future strategies for challenging domains by combining general-purpose solving methods (planning, reinforcement learning), heuristics, and representations (standard or custom).
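
The optimality figures above are reported on different bases: some are shares of all problems, others are shares of the plans a solver produced. A minimal sketch of how they could be put on a common base (share of all problems solved optimally) follows. The dictionary layout and the function name `optimal_share` are illustrative only, and the sketch assumes that Scorpion's 61.50% already counts optimally solved problems over the whole set.

```python
# Figures taken from the abstract, expressed as fractions.
# "solved" is the share of all problems solved; "optimal_among_solved" is the
# share of those plans that are optimal.
# Assumption: Scorpion's 61.50% is read as "solved, and every plan optimal".
reported = {
    "DeepCubeA (custom representation)": {"solved": 1.0, "optimal_among_solved": 0.785},
    "Scorpion (SAS+, pattern databases)": {"solved": 0.615, "optimal_among_solved": 1.0},
    "FastDownward (PDDL, FF)": {"solved": 0.565, "optimal_among_solved": 0.7964},
}


def optimal_share(entry):
    """Share of ALL problems for which an optimal plan was found."""
    return entry["solved"] * entry["optimal_among_solved"]


if __name__ == "__main__":
    for name, entry in reported.items():
        print(f"{name}: {optimal_share(entry):.1%} of all problems solved optimally")
```

Under these assumptions, FastDownward's two figures combine to roughly 45% of all problems solved optimally (0.5650 × 0.7964 ≈ 0.450), which can be compared directly with 78.5% for DeepCubeA and 61.50% for Scorpion.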
