Neural network learns to land a rocket using PyTorch, Unity's ML-Agents and PPO.
Gfycat link, higher quality video
A little project I did to learn the basics of ML-Agents. I still need to clean up the code a lot and upload the Unity executable used for training, which runs 8 agents at the same time. I used OpenAI's awesome Proximal Policy Optimization (PPO), implemented in PyTorch, and trained the model on Google Cloud. It took approximately 12 hours of training (around 15 million steps) to reach the point shown in the gif.
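For anyone new to PPO, here is a minimal sketch of its clipped surrogate objective in plain Python. This is not the code used in this project: the function name, the argument names and the `clip_eps=0.2` default are assumptions chosen for illustration, and a real implementation would compute this on PyTorch tensors over a batch.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   clipping range (0.2 is an assumed, commonly used value)
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped_r = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

The clipping removes the incentive to move the policy far from the one that collected the data: once the ratio leaves the allowed range in the direction the advantage favors, the objective stops increasing.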