A-RMA Results
A-RMA Method

[Figure: overview of the three training phases of A-RMA]

The first two phases are the same as in RMA. In the first phase, the base policy takes as input the current state, the previous action, and the extrinsics vector, a compressed version of the privileged environmental factors. The base policy is trained in simulation using model-free RL. In the second phase, the adaptation module is trained via supervised learning on on-policy data to predict the extrinsics vector from the history of states and actions. We add a third phase in which the base policy is fine-tuned with PPO while keeping the adaptation module fixed, to account for the imperfect estimation of the extrinsics. We found this third phase to be critical for reliable performance in the real world.
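To make the three-phase structure concrete, here is a minimal PyTorch sketch of the pipeline described above. All module definitions, layer sizes, dimensions, and the history horizon are illustrative assumptions, not the authors' released code; the RL rollouts and the PPO update itself are omitted.

```python
# Hypothetical sketch of the A-RMA training phases. Sizes and names are
# illustrative; the actual architecture and PPO machinery are not shown.
import torch
import torch.nn as nn
import torch.nn.functional as F

X_DIM, A_DIM, E_DIM, Z_DIM, HORIZON = 30, 12, 17, 8, 50  # assumed sizes

class EnvFactorEncoder(nn.Module):
    """Phase 1: compress privileged environment factors e_t into extrinsics z_t."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(E_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, Z_DIM))

    def forward(self, e):
        return self.net(e)

class BasePolicy(nn.Module):
    """Acts on the current state x_t, previous action a_{t-1}, and extrinsics z_t."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(X_DIM + A_DIM + Z_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, A_DIM))

    def forward(self, x, a_prev, z):
        return self.net(torch.cat([x, a_prev, z], dim=-1))

class AdaptationModule(nn.Module):
    """Phase 2: regress z_t from a recent state-action history, no privileged info."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(HORIZON * (X_DIM + A_DIM), 256), nn.ReLU(),
                                 nn.Linear(256, Z_DIM))

    def forward(self, hist):  # hist: (batch, HORIZON * (X_DIM + A_DIM))
        return self.net(hist)

encoder, policy, adapter = EnvFactorEncoder(), BasePolicy(), AdaptationModule()

# --- Phase 2: supervised learning on on-policy data (dummy tensors here) ----
batch = 64
e = torch.randn(batch, E_DIM)                        # privileged factors (sim only)
hist = torch.randn(batch, HORIZON * (X_DIM + A_DIM)) # rolled-out state-action history
z_target = encoder(e).detach()                       # ground-truth extrinsics
adapt_loss = F.mse_loss(adapter(hist), z_target)
adapt_loss.backward()                                # step an optimizer in practice

# --- Phase 3: fine-tune the base policy while the adapter stays frozen ------
for p in adapter.parameters():
    p.requires_grad_(False)
x, a_prev = torch.randn(batch, X_DIM), torch.randn(batch, A_DIM)
action = policy(x, a_prev, adapter(hist))  # would feed a PPO update on policy params
print(action.shape)                        # torch.Size([64, 12])
```

The key design point, as stated above, is that phase 3 optimizes only the base policy against the adaptation module's imperfect extrinsics estimates, so the policy learns to be robust to exactly the estimation errors it will see at deployment.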
RMA: Rapid Motor Adaptation for Legged Robots Ashish Kumar, Zipeng Fu, Deepak Pathak, Jitendra Malik RSS 2021 Pdf | Video | Project Page
Minimizing Energy Consumption Leads to the Emergence of Gaits in Legged Robots Zipeng Fu, Ashish Kumar, Jitendra Malik, Deepak Pathak CoRL 2021 Pdf | Video | Project Page
Coupling Vision and Proprioception for Navigation of Legged Robots Zipeng Fu*, Ashish Kumar*, Ananye Agarwal, Haozhi Qi, Jitendra Malik, Deepak Pathak CVPR 2022 Pdf | Video | Project Page