Simple World Models for Policy Evolution

Humans build an internal mental model of the world and routinely use simulation to update its state. The work in World Models uses RNNs and Reinforcement Learning (RL) to combine, in a single agent, a substantial model of the world and a much smaller controller model that makes the decisions.
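
To make the split between a large world model and a small controller concrete, here is a minimal sketch of what such a controller might look like. It assumes the world model hands the controller a latent observation vector z and a recurrent hidden state h. The sizes (32, 256, 3), the linear form, and the tanh squashing are illustrative assumptions rather than values taken from this handout, and the function names are hypothetical.

```python
import numpy as np

# Assumed sizes for illustration only: latent code, RNN hidden state, action vector.
Z_DIM, H_DIM, ACTION_DIM = 32, 256, 3


def num_params():
    """Number of controller parameters: one weight matrix plus one bias vector."""
    return ACTION_DIM * (Z_DIM + H_DIM) + ACTION_DIM


def controller_action(params, z, h):
    """Map world-model features [z, h] to an action with a single linear layer.

    params is a flat vector, which is convenient when the controller is
    tuned by a black-box or evolutionary optimizer (not shown here).
    """
    features = np.concatenate([z, h])
    n_w = ACTION_DIM * features.size
    W = params[:n_w].reshape(ACTION_DIM, features.size)
    b = params[n_w:n_w + ACTION_DIM]
    # tanh keeps every action component in [-1, 1] (an assumed convention).
    return np.tanh(W @ features + b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = rng.normal(scale=0.1, size=num_params())
    action = controller_action(params, rng.normal(size=Z_DIM), rng.normal(size=H_DIM))
    print(num_params(), action.shape)
```

With these assumed sizes the controller has only 867 parameters, which illustrates why the decision-making part can stay small while the world model carries most of the capacity.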

Task 1 (50 points)

Read the paper and write your own 4-page report on the technique, model-based RL, in a tutorial-like fashion so that computer scientists can understand it.

Task 2 (50 points)

In this task you are asked to reproduce the results for the car racing environment. Adopt this repo and use it to replicate those results.

Comment on the speed of learning a policy that drives the car around the track (10 points)
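
One possible way to quantify learning speed is to record the return of every evaluation episode and report how many episodes pass before the rolling average return first reaches a target. The sketch below assumes a target of 900 averaged over 100 consecutive episodes, which is a commonly cited criterion for the car racing environment. Treat both numbers, and the helper name episodes_to_threshold, as assumptions to adjust for your own setup.

```python
import numpy as np


def episodes_to_threshold(returns, threshold=900.0, window=100):
    """Return the number of episodes after which the rolling mean return
    over `window` consecutive episodes first reaches `threshold`,
    or None if it never does."""
    returns = np.asarray(returns, dtype=float)
    if returns.size < window:
        return None
    # Rolling mean via cumulative sums: means[i] averages returns[i : i + window].
    csum = np.cumsum(np.insert(returns, 0, 0.0))
    means = (csum[window:] - csum[:-window]) / window
    hits = np.nonzero(means >= threshold)[0]
    if hits.size == 0:
        return None
    # The window starting at hits[0] ends after hits[0] + window episodes.
    return int(hits[0]) + window


if __name__ == "__main__":
    # Toy data only: five poor episodes followed by five good ones.
    toy_returns = [0, 0, 0, 0, 0, 10, 10, 10, 10, 10]
    print(episodes_to_threshold(toy_returns, threshold=10.0, window=3))
```

Reporting this count alongside a plot of return against episodes (or generations) gives a reader both a single number and the shape of the learning curve.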