IDEA: Insensitive to Dynamics Mismatch via Effect Alignment for Sim-to-Real Transfer in Multi-Agent Control

Abstract (EN)

Complex multi-agent control tasks remain challenging for traditional rule-based and model-based approaches, which has motivated the adoption of learning-based methods. However, learning-based methods often struggle with sim-to-real transfer: they rely on accurate dynamics modeling or system identification, and they learn policies in low-level control spaces that are highly sensitive to dynamics mismatch, which makes them costly and fragile in complex environments. To address this issue, we propose a sim-to-real method for multi-agent control that is insensitive to dynamics mismatch through effect alignment. Our method combines random environmental structure with discrete semantic actions through closed-loop control, elevating policy learning to a semantic abstraction level. We also develop an action synchronization mechanism that mitigates timing mismatches between agents' actions and so improves the temporal consistency of the system. Experiments on four multi-agent navigation tasks show that our method substantially improves training efficiency over mainstream transfer methods and achieves higher success rates in real-world scenarios, improving the robustness and deployment stability of multi-agent systems under dynamics mismatch.

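As an illustration of the effect-alignment idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a toy 1-D agent whose actuator gain differs between "simulation" and "reality", a proportional feedback loop that realizes one discrete semantic action ("advance by a fixed displacement"), and a simple barrier that waits for the slowest agent. The names `Agent`, `execute_semantic_action`, `synchronized_step`, and all parameter values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    x: float     # position along a 1-D track
    gain: float  # hypothetical actuator gain; stands in for dynamics mismatch


def execute_semantic_action(agent, displacement, kp=2.0, dt=0.05,
                            tol=0.01, max_steps=1000):
    """Move the agent by `displacement` using proportional feedback.

    The controller never sees `agent.gain`; it only observes position,
    so the *effect* (final displacement) is the same across gains.
    Returns the number of low-level control steps that were needed.
    """
    target = agent.x + displacement
    for step in range(max_steps):
        error = target - agent.x
        if abs(error) < tol:
            return step
        u = kp * error                  # velocity command from feedback
        agent.x += agent.gain * u * dt  # plant response under unknown gain
    return max_steps


def synchronized_step(agents, displacements):
    """Run one semantic action per agent, then wait for the slowest one.

    Returns the number of low-level steps the whole team waits, which is
    an assumed stand-in for an action synchronization barrier.
    """
    steps = [execute_semantic_action(a, d)
             for a, d in zip(agents, displacements)]
    return max(steps)


if __name__ == "__main__":
    team = [Agent(x=0.0, gain=g) for g in (0.5, 1.0, 1.5)]
    waited = synchronized_step(team, [1.0, 1.0, 1.0])
    print("final positions:", [round(a.x, 3) for a in team])
    print("low-level steps until all agents finished:", waited)
```

In this sketch, agents with different gains need different numbers of low-level steps, yet each ends within the tolerance of the same target, so a policy that selects semantic actions sees a consistent effect regardless of the gain. The barrier in `synchronized_step` keeps the agents' next decisions aligned in time.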
