A Configurable Library for Generating and Manipulating Maze Datasets

Michael Igorevich Ivanitskiy, Rusheb Shah, Alex F. Spies, Tilman Räuker, Dan Valentine, Can Rager, Lucia Quirke, Chris Mathwin, Guillaume Corlouer, Cecilia Diniz Behn, Samy Wu Fung

From arXiv. 9 pages, 5 figures, 1 table. Corresponding author: Michael Ivanitskiy ([email protected]). Code available at https://github.com/understanding-search/maze-dataset

Understanding how machine learning models respond to distributional shifts is a key research challenge. Mazes serve as an excellent testbed because their varied generation algorithms make it possible to simulate both subtle and pronounced distributional shifts. To enable systematic investigations of model behavior on out-of-distribution data, we present $\texttt{maze-dataset}$, a comprehensive library for generating, processing, and visualizing datasets consisting of maze-solving tasks. With this library, researchers can easily create datasets with extensive control over the generation algorithm used, the parameters fed to that algorithm, and the filters that generated mazes must satisfy. Furthermore, it supports multiple output formats, including rasterized and text-based formats, which suit convolutional neural networks and autoregressive transformer models, respectively. These formats, along with tools for visualizing and converting between them, ensure versatility and adaptability in research applications.
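The workflow the abstract describes (pick a generation algorithm, set its parameters, filter the results, then export to a rasterized or text format) can be illustrated with a small standalone sketch. This is not the $\texttt{maze-dataset}$ API: every function name, the token vocabulary, and the choice of randomized depth-first search as the generator are assumptions made for illustration, and only the Python standard library is used.

```python
import random
from collections import deque

STEPS = ((0, 1), (1, 0), (0, -1), (-1, 0))

def gen_dfs_maze(n, seed=0):
    """Carve an n x n lattice maze with randomized depth-first search.
    Returns a set of passages, each a frozenset of two adjacent cells."""
    rng = random.Random(seed)
    visited, stack, passages = {(0, 0)}, [(0, 0)], set()
    while stack:
        r, c = stack[-1]
        unvisited = [(r + dr, c + dc) for dr, dc in STEPS
                     if 0 <= r + dr < n and 0 <= c + dc < n
                     and (r + dr, c + dc) not in visited]
        if not unvisited:
            stack.pop()  # dead end: backtrack
            continue
        nxt = rng.choice(unvisited)
        passages.add(frozenset(((r, c), nxt)))
        visited.add(nxt)
        stack.append(nxt)
    return passages

def solve(passages, start, goal):
    """Breadth-first search; returns the list of cells from start to goal."""
    prev, queue = {start: None}, deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        for dr, dc in STEPS:
            nb = (cell[0] + dr, cell[1] + dc)
            if nb not in prev and frozenset((cell, nb)) in passages:
                prev[nb] = cell
                queue.append(nb)
    path, cell = [], goal
    while cell is not None:
        path.append(cell)
        cell = prev[cell]
    return path[::-1]

def to_raster(passages, n):
    """Rasterized format: (2n+1) x (2n+1) grid, 1 = wall, 0 = open."""
    grid = [[1] * (2 * n + 1) for _ in range(2 * n + 1)]
    for r in range(n):
        for c in range(n):
            grid[2 * r + 1][2 * c + 1] = 0
    for (r1, c1), (r2, c2) in (tuple(p) for p in passages):
        grid[r1 + r2 + 1][c1 + c2 + 1] = 0  # pixel between the two cells
    return grid

def to_tokens(passages, path):
    """Text format: adjacency list followed by the solution path.
    The token names are illustrative, not the library's vocabulary."""
    tokens = ["<ADJLIST>"]
    for a, b in sorted(tuple(sorted(p)) for p in passages):
        tokens += [str(a), "<-->", str(b), ";"]
    tokens += ["<PATH>"] + [str(cell) for cell in path]
    return tokens

def make_dataset(n, n_mazes, min_path_len, seed=0):
    """Generate n_mazes mazes and keep those whose solution is long enough."""
    dataset = []
    for i in range(n_mazes):
        maze = gen_dfs_maze(n, seed=seed + i)
        path = solve(maze, (0, 0), (n - 1, n - 1))
        if len(path) >= min_path_len:  # the filter step
            dataset.append((maze, path))
    return dataset
```

Swapping `gen_dfs_maze` for a different generator, or changing `n` and `min_path_len`, is one simple way to produce the kind of controlled distributional shift the abstract refers to, while `to_raster` and `to_tokens` give the same maze in a form suited to a convolutional or an autoregressive model.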

