\(\epsilon\)-greedy Monte-Carlo (MC) Control
In this section we outline methods that can yield optimal policies when the MDP is unknown and its underlying functions / models must be learned - also known as the mode…
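To make the setting concrete, here is a minimal sketch of on-policy, first-visit \(\epsilon\)-greedy MC control. Everything in it is an assumption made for illustration and does not come from the text above: the 5-state corridor environment, the names `step`, `epsilon_greedy` and `mc_control`, and the values of `gamma`, `epsilon` and `max_steps`. The agent never reads the transition or reward functions directly. It only samples episodes, averages the observed returns into \(Q(s, a)\), and acts \(\epsilon\)-greedily with respect to the current estimate.

```python
import random
from collections import defaultdict

# Hypothetical 5-state corridor: states 0 and 4 are terminal,
# entering state 4 pays +1, every other transition pays 0.
LEFT, RIGHT = 0, 1
ACTIONS = (LEFT, RIGHT)
TERMINALS = (0, 4)
NON_TERMINALS = (1, 2, 3)


def step(state, action):
    """Environment dynamics; the agent only sees sampled transitions."""
    next_state = state - 1 if action == LEFT else state + 1
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward, next_state in TERMINALS


def epsilon_greedy(Q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily w.r.t. Q."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])


def mc_control(num_episodes=5000, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)  # action-value estimates, initialised to 0
    N = defaultdict(int)    # number of first-visit returns averaged per (s, a)

    for _ in range(num_episodes):
        # 1) Sample one episode with the current epsilon-greedy policy.
        #    max_steps is only a safety cap on episode length.
        episode = []
        state = rng.choice(NON_TERMINALS)
        for _step in range(max_steps):
            action = epsilon_greedy(Q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            episode.append((state, action, reward))
            state = next_state
            if done:
                break

        # 2) Walk backwards to accumulate the discounted return G_t.
        #    Overwriting keeps the return of the FIRST visit of each pair.
        G = 0.0
        first_visit_return = {}
        for s, a, r in reversed(episode):
            G = gamma * G + r
            first_visit_return[(s, a)] = G

        # 3) Policy evaluation: incremental sample mean of observed returns.
        for sa, ret in first_visit_return.items():
            N[sa] += 1
            Q[sa] += (ret - Q[sa]) / N[sa]
        # Policy improvement is implicit: epsilon_greedy reads the updated Q.

    policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in NON_TERMINALS}
    return Q, policy


Q, policy = mc_control()
print({s: "right" if a == RIGHT else "left" for s, a in policy.items()})
```

In this toy corridor the only reward lies at the right end, so the learned greedy policy should move right from every non-terminal state. A fixed \(\epsilon\) keeps the behaviour policy exploratory throughout; decaying \(\epsilon\) over episodes is a common variation that this sketch leaves out.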