VibeAct: Vibration to Actions for Contact-Rich Reactive Robot Dexterity

Abstract (EN)

Dexterous manipulation depends on contact events that are fast, local, and often visually occluded. Piezoelectric microphones offer a compact and high-bandwidth way to sense these interactions, but the resulting vibro-acoustic signals are difficult to simulate faithfully enough for end-to-end sim-to-real policy learning on dexterous robot hands. We propose VibeAct, a framework that bridges real vibrotactile sensing and simulation-based reinforcement learning through a shared physical representation of contact and slip. In the real world, we embed piezoelectric microphones into a dexterous robot hand and collect vibro-acoustic data through teleoperation, then replay the recordings in a calibrated digital clone to automatically label per-finger contact and slip. A tactile estimator learns to predict contact and slip from real microphone waveforms, while manipulation policies are trained in simulation on the same representation computed directly from simulated contacts. This decoupling lets policies exploit rapid tactile feedback without simulating raw audio. Across five contact-rich tasks spanning regrasping, in-hand reorientation, and insertion, VibeAct consistently outperforms a proprioception-and-point-cloud baseline in simulation, with the largest gains on tasks requiring sustained reactive control, where the continuous slip-magnitude channel proves the most informative observation. The learned policies transfer to a physical dexterous hand-arm platform, improving success rates on deployed tasks. Project videos and additional details are at https://vibeact.github.io/.
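As a rough illustration of the shared contact-and-slip representation described above, the sketch below computes a per-finger contact flag, slip flag, and continuous slip magnitude from simulated contact quantities. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `FingerContact`, `TactileFeature`, and `tactile_features`, the two thresholds, and the choice of tangential sliding speed as the slip magnitude are all hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Sequence, Tuple

# Hypothetical thresholds; the abstract gives no numeric values.
FORCE_THRESHOLD_N = 0.05       # minimum normal force (N) to count as contact
SLIP_SPEED_THRESHOLD = 0.002   # minimum tangential speed (m/s) to count as slip


@dataclass
class FingerContact:
    """Assumed per-finger quantities read from a simulator's contact solver."""
    normal_force: float                       # N, along the contact normal
    tangential_velocity: Tuple[float, float]  # m/s, in the contact tangent plane


@dataclass
class TactileFeature:
    """Assumed form of the shared representation: contact, slip, slip magnitude."""
    in_contact: bool
    slipping: bool
    slip_magnitude: float  # continuous channel, 0.0 when not in contact


def tactile_features(contacts: Sequence[FingerContact]) -> List[TactileFeature]:
    """Map simulated per-finger contacts to contact/slip features."""
    features = []
    for c in contacts:
        in_contact = c.normal_force > FORCE_THRESHOLD_N
        # Slip is only defined while the finger touches the object.
        speed = hypot(*c.tangential_velocity) if in_contact else 0.0
        slipping = in_contact and speed > SLIP_SPEED_THRESHOLD
        features.append(TactileFeature(in_contact, slipping, speed))
    return features


if __name__ == "__main__":
    fingers = [
        FingerContact(1.0, (0.03, 0.04)),  # touching and sliding
        FingerContact(0.0, (1.0, 0.0)),    # not touching
        FingerContact(0.5, (0.0, 0.0)),    # touching, stationary
    ]
    for i, f in enumerate(tactile_features(fingers)):
        print(i, f)
```

Under this reading, the simulated policy observes only these features, and a real-world estimator trained on microphone waveforms would be expected to output the same three quantities per finger, so raw audio never needs to be simulated.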

