AirDreamer: Generalist Drone Navigation with World Models

Abstract (EN)

Navigating a drone through unseen, cluttered environments requires reliable generalization to new scene layouts and an understanding of environmental structure relative to the robot's capabilities. Previous methods, which assume the same environment configuration, often rely heavily on human-designed perception pipelines and predefined rules to guide the robot toward the target. Such pipelines are environment-dependent and generalize poorly across environments. Inspired by animal navigation behavior, we design a framework in which a reinforcement-learning-based policy navigates on top of a world-model-based understanding of the environment to overcome these issues. In addition, we design a sparse reward function without hand-crafted shaping terms to avoid local-minimum traps and encourage yaw control behaviors. In simulation and on real drones, our method exhibits emergent capabilities for navigating complex, unseen environments and escaping local optima where other methods fail. On challenging maps, it achieves a 5.3% higher navigation success rate than the best baseline. Furthermore, the proposed framework achieves effective sim-to-real transfer without any tuning during deployment. The code will be made publicly available.
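
To make the idea of a sparse reward without shaping terms concrete, the following is a minimal illustrative sketch, not the reward used in this work. The function name `sparse_reward`, the `goal_radius` parameter, and the specific values (+1 on reaching the goal, -1 on collision, 0 otherwise) are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def sparse_reward(position, goal, collided, goal_radius=0.5):
    """Hypothetical sparse navigation reward (illustrative assumption only).

    No distance-to-goal or progress shaping term is used: the agent only
    receives a signal at terminal events.
    """
    if collided:
        # Terminal penalty for hitting an obstacle (assumed value).
        return -1.0
    if math.dist(position, goal) <= goal_radius:
        # Terminal bonus for arriving within goal_radius of the target (assumed value).
        return 1.0
    # Every other step carries no reward signal.
    return 0.0
```

Because such a reward carries no dense distance term, it does not pull the policy toward the straight-line direction to the goal, which is one plausible reading of how a sparse design avoids local-minimum traps behind obstacles.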

