MS-PPO: Morphological-Symmetry-Equivariant Policy for Legged Robot Locomotion
Submitted to ICRA 2026
Sizhe Wei*,
Xulin Chen*, Fengze Xie, Garrett E. Katz, Zhenyu Gan, Lu Gan.
[PDF] | [DEMO]
Abstract:
Reinforcement learning has recently enabled impressive locomotion performance on quadrupeds and other articulated robots, yet most policy architectures remain morphology- and symmetry-agnostic, leading to inefficient training and weak generalization. This work introduces MS-PPO, a morphological-symmetry-equivariant policy learning framework that encodes the robot's kinematic structure and morphological symmetries directly into the policy network. We construct a graph-based neural architecture that is provably equivariant to morphological group actions, ensuring consistent responses under sagittal-plane reflections while maintaining invariance in value estimation. This design eliminates the need for the reward shaping or data augmentation typically used to enforce symmetry. We evaluate MS-PPO in simulation on the Unitree Go2 and Xiaomi Cyberdog2 robots across multiple locomotion tasks, including trotting, pronking, slope walking, and bipedal turning, and further deploy the policies on hardware. Extensive experiments show that MS-PPO achieves superior training stability, command generalization, and sample efficiency on challenging tasks compared to state-of-the-art baselines. These findings demonstrate that embedding both kinematic structure and morphological symmetry into policy learning provides a powerful inductive bias for legged robot locomotion control.
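To make the equivariance/invariance property concrete, the following is a minimal sketch, not the paper's graph-based architecture: it builds a policy that is exactly equivariant to a sagittal (left/right) reflection by group-averaging a plain MLP, together with a reflection-invariant value head. The toy observation layout, joint ordering, and network sizes are all made-up assumptions for illustration.

```python
# Hypothetical sketch -- NOT the MS-PPO architecture from the paper.
# It only illustrates Z2 (sagittal-reflection) equivariance via
# group-averaging, with an invented joint ordering and toy MLP.
import numpy as np

rng = np.random.default_rng(0)

# Toy quadruped: obs = 8 joint angles (left/right pairs interleaved)
# plus one lateral body velocity; action = 8 joint torques.
OBS, ACT = 9, 8

def swap_lr(n):
    """Permutation swapping each (left, right) joint pair."""
    P = np.zeros((n, n))
    for i in range(0, n, 2):
        P[i, i + 1] = P[i + 1, i] = 1.0
    return P

# Sagittal reflection: swap left/right joints, negate lateral velocity.
rho_obs = np.zeros((OBS, OBS))
rho_obs[:8, :8] = swap_lr(8)
rho_obs[8, 8] = -1.0          # lateral velocity flips sign
rho_act = swap_lr(8)          # actions mirror the same way

# A plain (symmetry-agnostic) two-layer MLP.
W1 = rng.normal(size=(32, OBS)); b1 = rng.normal(size=32)
W2 = rng.normal(size=(ACT, 32)); b2 = rng.normal(size=ACT)
def f(x):
    return W2 @ np.tanh(W1 @ x + b1) + b2

# Group-averaged policy: equivariant by construction
# (rho_act is an involution, so it is its own inverse).
def pi(x):
    return 0.5 * (f(x) + rho_act @ f(rho_obs @ x))

# Invariant value estimate from a scalar head g(x) = wv @ x.
wv = rng.normal(size=OBS)
def V(x):
    return 0.5 * (wv @ x + wv @ (rho_obs @ x))

x = rng.normal(size=OBS)
assert np.allclose(pi(rho_obs @ x), rho_act @ pi(x))   # equivariance
assert np.isclose(V(rho_obs @ x), V(x))                # invariance
```

Group-averaging is the simplest way to obtain exact Z2 symmetry from an unconstrained network; the paper instead bakes the symmetry into a graph architecture, which avoids the extra forward pass per group element.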