We present a generalist position control policy capable of controlling arbitrary multirotor configurations with a given rotor count (e.g., hexarotors or quadrotors) using a single set of network weights. The policy is conditioned on a physics-grounded embodiment descriptor: a mass- and inertia-normalized control allocation matrix that captures how mass-normalized motor thrusts generate linear and angular accelerations in the body frame. To train the policy, we sample from a broad distribution of multirotor configurations, including non-planar and asymmetric systems, and optimize a single, compact network using Proximal Policy Optimization. Training takes only five minutes on an RTX 3090 GPU with a custom dynamics simulator built on NVIDIA Warp. Through extensive simulation experiments, we show that embodiment conditioning enables robust generalist control across arbitrary morphologies. We demonstrate zero-shot real-world transfer of this generalist policy on three diverse hexarotor systems: a planar robot, a partially symmetric non-planar system, and a random asymmetric, non-planar configuration.
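To make the embodiment descriptor concrete, the following is a minimal NumPy sketch of one way a mass- and inertia-normalized control allocation matrix could be assembled. It is an illustration under stated assumptions, not the formulation used in this work: the function name `normalized_allocation_matrix`, the rotor-drag model (reaction torque proportional to thrust through a hypothetical `torque_coeff`), and the example hexarotor parameters are all assumptions introduced here. The input is taken to be mass-normalized thrust u_i = f_i / m, so the linear block reduces to the rotor thrust axes and the angular block becomes m J^{-1} times the per-rotor torque directions.

```python
import numpy as np


def normalized_allocation_matrix(positions, axes, spin_dirs, mass, inertia,
                                 torque_coeff=0.01):
    """Map mass-normalized rotor thrusts (f_i / m) to body-frame accelerations.

    Hypothetical helper: returns a (6, N) matrix whose top three rows give
    linear acceleration and whose bottom three rows give angular acceleration.
    """
    positions = np.asarray(positions, dtype=float)   # (N, 3) rotor positions, body frame
    axes = np.asarray(axes, dtype=float)             # (N, 3) thrust directions
    axes = axes / np.linalg.norm(axes, axis=1, keepdims=True)
    spin_dirs = np.asarray(spin_dirs, dtype=float)   # (N,) +1 or -1 per rotor
    inertia = np.asarray(inertia, dtype=float)       # (3, 3) body inertia tensor

    # Force per unit thrust is the thrust axis itself.
    force = axes.T                                   # (3, N)
    # Torque per unit thrust: lever arm plus an assumed drag (reaction) torque.
    torque = (np.cross(positions, axes)
              + torque_coeff * spin_dirs[:, None] * axes).T  # (3, N)

    # With u = f / m as input: a = force @ u, alpha = m * J^{-1} @ torque @ u.
    linear_block = force
    angular_block = mass * np.linalg.solve(inertia, torque)
    return np.vstack([linear_block, angular_block])


# Example: a planar hexarotor with assumed (illustrative) parameters.
angles = np.deg2rad(np.arange(6) * 60.0)
arm = 0.25                                           # arm length in metres (assumed)
positions = np.stack([arm * np.cos(angles),
                      arm * np.sin(angles),
                      np.zeros(6)], axis=1)
axes = np.tile([0.0, 0.0, 1.0], (6, 1))              # all rotors thrust along body z
spins = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])  # alternating spin directions
mass = 1.5                                           # kg (assumed)
inertia = np.diag([0.02, 0.02, 0.04])                # kg m^2 (assumed)

A = normalized_allocation_matrix(positions, axes, spins, mass, inertia)
```

For a non-planar or asymmetric vehicle, only `positions`, `axes`, `mass`, and `inertia` change; the matrix shape stays fixed for a given rotor count, which is what would allow a single network to consume it as a conditioning input.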