The latest release of Upkie's software brings a working reinforcement learning pipeline with sim-to-real transfer. It is based on Stable Baselines3 and uses standard sim-to-real tricks that could very well work on other wheeled biped robots. Training runs on the Gymnasium environments in upkie.envs (pip-installable from PyPI), and the pipeline itself is implemented in the PPO balancer.
Here is a policy trained in Bullet and running on a real Upkie:
There is also a usage video showing how to run the pipeline:
Hoping this helps newcomers get started with reinforcement learning on real robots!