Motion Control
Differentiable Contact Unlocks Real-World Locomotion Control
By smoothing hard contact dynamics inside a fully differentiable simulation, robust locomotion controllers can be learned efficiently without sacrificing physical fidelity. The resulting policies generate stable gaits, track velocity commands, and resist disturbances, and they transfer zero-shot from simulation to real robots. The approach shows how contact-aware, gradient-based learning can produce deployable motion control for legged systems.
Transcript
00:00:00 In this work, we learn a deployable locomotion controller entirely within a differentiable simulation. We achieve this by analytically smoothing a hard contact simulation. To understand why this is crucial, let's review what happens when training with two common contact modeling approaches, starting with a hard contact model. While the model is physically accurate,
00:00:20 gradient-based learning results in highly suboptimal locomotion behaviors. This is due to misleading gradients caused by discontinuous contact dynamics. A soft contact model, used in recent works, provides smooth gradients and thus enables successful learning. However, the learned behaviors do not transfer to the physically more realistic hard
00:00:41 contact scenario, making the investigated soft contact model unsuitable for deployment. Training with the proposed smooth contact model results in effective locomotion gaits while at the same time transferring to the hard contact simulation without any noticeable difference in behavior. With this, we are able to transfer a
00:01:02 learned policy zero-shot to the real world. To the best of our knowledge, this is the first successful sim-to-real transfer of a locomotion policy learned within a fully differentiable simulation. The learned controller is able to track arbitrary velocity commands while being robust to external disturbances. During training, the short-horizon actor-critic algorithm makes use of the
00:01:24 gradients provided by the simulation to increase sample efficiency by over an order of magnitude compared to standard reinforcement learning approaches. The increased sample efficiency could enable learning high-dimensional tasks, such as learning directly from vision. To summarize, our contact formulation enables successful learning and transfer by combining informative gradients and
00:01:45 physical fidelity.
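The transcript credits a short-horizon actor-critic algorithm with exploiting the simulator's gradients. A minimal sketch of the underlying mechanism, differentiating a short rollout with respect to policy parameters and descending that gradient, is shown below on a toy one-dimensional system. All names and constants are illustrative, and the real algorithm additionally trains a critic to bootstrap value beyond the rollout horizon.

```python
def rollout_and_grad(theta, s0=1.0, dt=0.05, horizon=16):
    """Roll out a toy 1-D system under a linear policy a = theta * s and return
    the rollout loss plus its exact gradient w.r.t. theta, propagated through
    the dynamics (forward-mode sensitivity, standing in for the simulator's
    autodiff)."""
    s, ds = s0, 0.0          # state and its sensitivity ds/dtheta
    loss, dloss = 0.0, 0.0
    for _ in range(horizon):
        s_new = s + dt * theta * s                     # differentiable dynamics step
        ds_new = ds * (1.0 + dt * theta) + dt * s      # chain rule through the step
        loss += s_new * s_new                          # penalize deviation from zero
        dloss += 2.0 * s_new * ds_new
        s, ds = s_new, ds_new
    return loss, dloss

# Plain gradient descent on the policy parameter using the rollout gradient.
theta, lr = 0.0, 0.5
initial_loss, _ = rollout_and_grad(theta)
for _ in range(500):
    _, g = rollout_and_grad(theta)
    theta -= lr * g
final_loss, _ = rollout_and_grad(theta)
```

Because each update uses an exact gradient through the dynamics rather than a sampled policy-gradient estimate, far fewer rollouts are needed, which is the sample-efficiency advantage the video describes; the short horizon keeps these backpropagated gradients from exploding or vanishing over long trajectories.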

