List of exercises

Below you will find links to download the notebooks for the exercises (which you have to fill in) and their solutions (which you can look at after you have finished the exercise). It is recommended not to look at the solution while doing the exercise unless you are lost.

Alternatively, you can run the notebooks directly on Colab (https://colab.research.google.com/) if you have a Google account.

For instructions on how to install a Python distribution on your computer, check this page.

You will also find videos presenting the exercises and commenting on their solutions.

The solution of each exercise is rendered on the following pages.

Introduction to Python

This exercise is an introduction to Python for absolute beginners. If you already know Python, you can safely skip it.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Numpy and Matplotlib

The goal of this exercise is to present the basics of the numerical library numpy as well as the visualization library matplotlib.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Sampling

The goal of this exercise is to understand how to sample rewards from an n-armed bandit and to understand the central limit theorem.
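As a rough illustration of what sampling from a bandit can look like, here is a minimal sketch. It assumes a bandit whose arms return their true mean plus uniform noise; the number of arms, the noise model and the function names are hypothetical and are not taken from the notebook. It shows that the spread of the empirical mean shrinks roughly as 1/sqrt(N), which is the scaling predicted by the central limit theorem.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, used only for reproducibility

n_arms = 5  # hypothetical number of arms
# Assumption: the true mean reward of each arm is drawn once from N(0, 1).
true_means = rng.normal(0.0, 1.0, size=n_arms)

def sample_reward(arm):
    """Draw one reward: the arm's true mean plus uniform noise in [-1, 1]."""
    return true_means[arm] + rng.uniform(-1.0, 1.0)

def sample_means(arm, n_samples, n_repeats=1000):
    """Return n_repeats empirical means, each averaged over n_samples rewards."""
    noise = rng.uniform(-1.0, 1.0, size=(n_repeats, n_samples))
    return (true_means[arm] + noise).mean(axis=1)

# The standard deviation of the empirical mean should shrink as 1 / sqrt(n).
for n in (10, 100, 1000):
    print(n, sample_means(0, n).std())
```

Plotting a histogram of `sample_means(0, n)` for increasing `n` should show a bell shape that narrows around the true mean, even though the individual rewards are not Gaussian.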

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Bandits - part 1

The goal of this exercise is to implement simple action selection mechanisms for the n-armed bandit:

  • Greedy action selection
  • ε-greedy action selection
  • Softmax action selection
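The three mechanisms listed above can be sketched as follows. This is only one possible formulation, not the solution of the notebook: the function names, the value estimates `Q` and the temperature parameter `tau` are assumptions. In this variant, the exploratory action of ε-greedy is drawn uniformly among all actions, including the greedy one.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, used only for reproducibility

def greedy(Q):
    """Pick the action with the highest estimated value (ties: first index)."""
    return int(np.argmax(Q))

def epsilon_greedy(Q, epsilon=0.1):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(Q)))
    return greedy(Q)

def softmax(Q, tau=1.0):
    """Sample an action with probability proportional to exp(Q / tau)."""
    logits = (Q - np.max(Q)) / tau  # subtracting the max avoids overflow
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(Q), p=probs))

Q = np.array([0.2, 1.5, -0.3, 0.9])  # hypothetical value estimates
print(greedy(Q), epsilon_greedy(Q), softmax(Q, tau=0.5))
```

A small `tau` makes softmax behave almost greedily, while a large `tau` makes the choice close to uniform.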

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Bandits - part 2

The goal of this exercise is to further investigate the properties of the action selection algorithms for the n-armed bandit.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Dynamic programming

The goal of this exercise is to apply policy iteration and value iteration to the recycling robot MDP.
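To give an idea of what value iteration looks like on this problem, here is a minimal sketch. The structure of the MDP follows the recycling robot example of Sutton and Barto, but the numerical values of `alpha`, `beta`, `r_search`, `r_wait` and `gamma` are hypothetical and may differ from those used in the notebook.

```python
# Hypothetical parameter values; the exercise may use different ones.
alpha, beta = 0.3, 0.2        # probabilities of staying in high / low when searching
r_search, r_wait = 6.0, 2.0   # rewards for searching and for waiting
gamma = 0.7                   # discount factor

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "high": {
        "search": [(alpha, "high", r_search), (1 - alpha, "low", r_search)],
        "wait": [(1.0, "high", r_wait)],
    },
    "low": {
        "search": [(beta, "low", r_search), (1 - beta, "high", -3.0)],
        "wait": [(1.0, "low", r_wait)],
        "recharge": [(1.0, "high", 0.0)],
    },
}

def q_value(V, state, action):
    """Expected return of taking `action` in `state`, then following values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in transitions[state][action])

def value_iteration(tol=1e-8):
    """Apply the Bellman optimality update until the values stop changing."""
    V = {s: 0.0 for s in transitions}
    while True:
        new_V = {s: max(q_value(V, s, a) for a in transitions[s]) for s in transitions}
        delta = max(abs(new_V[s] - V[s]) for s in transitions)
        V = new_V
        if delta < tol:
            return V

V = value_iteration()
# The greedy policy with respect to the converged values.
policy = {s: max(transitions[s], key=lambda a: q_value(V, s, a)) for s in transitions}
print(V, policy)
```

Policy iteration would instead alternate a full evaluation of the current policy with a greedy improvement step; both methods should agree on the resulting policy.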

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Gym

The goal of this exercise is to install gym and learn how to use its interface.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Monte Carlo control

The goal of this exercise is to implement on-policy Monte-Carlo control on the Taxi gym environment.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Temporal difference

The goal of this exercise is to implement Q-learning on the Taxi gym environment.
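The core of tabular Q-learning is a single update rule, sketched below without any dependency on gym. The function name and the values of the learning rate `alpha` and discount factor `gamma` are assumptions; only the table size (500 states, 6 actions) comes from the Taxi environment.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the TD error."""
    # Off-policy target: bootstrap on the best action in the next state,
    # except when the episode has terminated.
    target = r if done else r + gamma * np.max(Q[s_next])
    td_error = target - Q[s, a]
    Q[s, a] += alpha * td_error
    return td_error

# The Taxi environment has 500 discrete states and 6 actions.
Q = np.zeros((500, 6))

# One hypothetical transition: state 0, action 2, reward -1, next state 1.
q_learning_update(Q, s=0, a=2, r=-1.0, s_next=1, done=False)
print(Q[0, 2])  # -0.1
```

In a full agent, this update would be called after every step of the environment, with actions typically chosen by an ε-greedy policy over the current row `Q[s]`.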

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Eligibility traces

The goal of this exercise is to implement Q-learning with eligibility traces on the Gridworld environment.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

Keras

The goal of this exercise is to quickly discover keras and to understand why neural networks (and SGD) need i.i.d. samples.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

DQN

The goal of this exercise is to implement DQN on the Cartpole balancing problem.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.

PPO

The goal of this exercise is to use the tianshou library to implement PPO on the Cartpole balancing problem.

Notebook: download .ipynb or run on colab.

Solution: download .ipynb or run on colab.