Reinforcement learning for a custom MuJoCo Hopper with PPO, REINFORCE, and Actor-Critic, featuring domain randomization, curriculum learning, and entropy scheduling for robust locomotion under uncertain dynamics
python reinforcement-learning reinforce actor-critic mujoco ppo domain-randomization robust-rl custom-environment sim-to-real stable-baselines3 robot-locomotion hopper-environment policy-gradient-methods curriculum-domain-randomization
Updated Mar 19, 2026 - Python