# List of exercises

You will find below links to download the notebooks for the exercises (which you are expected to complete) and their solutions (which you can look at after you have finished the exercise). It is recommended not to look at the solutions while doing the exercises unless you are stuck.

Alternatively, you can run the notebooks directly on Colab (https://colab.research.google.com/) if you have a Google account.

For instructions on how to install a Python distribution on your computer, check this page.

You will also find videos presenting the exercises and commenting on their solutions.

The solution of each exercise is rendered in the following pages.

## Introduction to Python

This exercise is an introduction to Python for absolute beginners. If you already know Python, you can safely skip it.
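As a small taste of what the exercise covers (variables, lists, loops, and functions), here is a minimal sketch; it is an illustration, not part of the exercise itself:

```python
# Functions are defined with `def`; indentation delimits the body.
def square(x):
    return x * x

numbers = [1, 2, 3, 4]

# List comprehensions build a new list from an existing one.
squares = [square(n) for n in numbers]
print(squares)   # [1, 4, 9, 16]

# A plain for-loop accumulating a sum.
total = 0
for n in numbers:
    total += n
print(total)     # 10
```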

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Numpy and Matplotlib

The goal of this exercise is to present the basics of the numerical library `numpy` as well as the visualization library `matplotlib`.
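As a preview, a minimal sketch of `numpy` array creation and vectorized operations (the exercise itself also covers `matplotlib` plotting, which is omitted here):

```python
import numpy as np

# Create a vector of 5 evenly spaced points and apply vectorized operations.
x = np.linspace(0.0, 1.0, 5)   # array([0.  , 0.25, 0.5 , 0.75, 1.  ])
y = x ** 2                     # element-wise square, no explicit loop

# Basic aggregations.
print(x.sum())                 # 2.5
print(y.max())                 # 1.0

# 2D arrays: shapes and matrix products.
A = np.arange(6).reshape(2, 3) # 2x3 matrix [[0, 1, 2], [3, 4, 5]]
print(A.shape)                 # (2, 3)
print(A.T @ A)                 # 3x3 matrix product of A's transpose with A
```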

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Sampling

The goal of this exercise is to understand how to sample rewards from an n-armed bandit and to understand the central limit theorem.
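As an illustration (not the exercise's solution), a minimal pure-Python sketch of sampling rewards from a single arm and observing the central limit theorem; the Gaussian reward distribution with mean 0.5 is a hypothetical choice:

```python
import random
import statistics

random.seed(42)

# One arm of a bandit: rewards drawn from a fixed distribution
# (here, hypothetically, Gaussian with mean 0.5 and std 1.0).
def sample_reward():
    return random.gauss(0.5, 1.0)

# Central limit theorem: the mean of N samples is itself a random variable
# whose distribution approaches a Gaussian with std = 1.0 / sqrt(N).
def sample_mean(n):
    return sum(sample_reward() for _ in range(n)) / n

means = [sample_mean(100) for _ in range(1000)]
print(statistics.mean(means))   # close to the true mean 0.5
print(statistics.stdev(means))  # close to 1.0 / sqrt(100) = 0.1
```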

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Bandits - part 1

The goal of this exercise is to implement simple action selection mechanisms for the n-armed bandit:

- Greedy action selection
- $\epsilon$-greedy action selection
- Softmax action selection
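A minimal pure-Python sketch of the three mechanisms above, assuming hypothetical value estimates `Q` for three arms (the exercise uses learned estimates instead):

```python
import math
import random

random.seed(0)

Q = [0.2, 0.8, 0.5]  # hypothetical estimated values of three arms

def greedy(Q):
    # Always pick the arm with the highest estimated value.
    return max(range(len(Q)), key=lambda a: Q[a])

def epsilon_greedy(Q, epsilon=0.1):
    # Explore a random arm with probability epsilon, otherwise exploit.
    if random.random() < epsilon:
        return random.randrange(len(Q))
    return greedy(Q)

def softmax(Q, tau=1.0):
    # Sample an arm with probability proportional to exp(Q[a] / tau).
    exps = [math.exp(q / tau) for q in Q]
    total = sum(exps)
    r, acc = random.random(), 0.0
    for a, e in enumerate(exps):
        acc += e / total
        if r < acc:
            return a
    return len(Q) - 1

print(greedy(Q))   # 1, the arm with the highest estimate
```

Lower temperatures `tau` make softmax behave more greedily; higher values make it closer to uniform exploration.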

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Bandits - part 2

The goal of this exercise is to further investigate the properties of the action selection algorithms for the n-armed bandit.
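One property worth investigating is how exploration affects long-run average reward. A hedged pure-Python sketch, on a hypothetical 3-armed bandit with Gaussian rewards (the arm means below are made up for illustration):

```python
import random

random.seed(3)

# Hypothetical 3-armed bandit; arm 2 (mean 0.9) is the best.
true_means = [0.1, 0.5, 0.9]

def run(epsilon, steps=2000):
    Q = [0.0] * 3   # estimated values
    N = [0] * 3     # pull counts
    rewards = []
    for _ in range(steps):
        if random.random() < epsilon:
            a = random.randrange(3)
        else:
            a = max(range(3), key=lambda i: Q[i])
        r = random.gauss(true_means[a], 1.0)
        N[a] += 1
        Q[a] += (r - Q[a]) / N[a]   # incremental mean estimate
        rewards.append(r)
    return sum(rewards) / len(rewards)

# Pure greedy selection can lock onto a suboptimal arm;
# a little exploration usually earns more on average.
avg_greedy = run(0.0)
avg_eps = run(0.1)
print(avg_greedy, avg_eps)
```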

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Dynamic programming

The goal of this exercise is to apply policy iteration and value iteration on the recycling robot MDP.
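To show the shape of the value iteration loop, a minimal sketch on a tiny made-up MDP; the two states, two actions, and transition probabilities below are hypothetical, not the actual recycling robot parameters used in the exercise:

```python
gamma = 0.9

# transitions[s][a] = list of (probability, next_state, reward).
transitions = {
    0: {"a": [(1.0, 0, 1.0)], "b": [(0.5, 0, 0.0), (0.5, 1, 2.0)]},
    1: {"a": [(1.0, 0, 0.0)], "b": [(1.0, 1, 1.5)]},
}

# Value iteration: repeatedly apply the Bellman optimality backup.
V = {0: 0.0, 1: 0.0}
for _ in range(200):  # enough sweeps to converge numerically
    V = {
        s: max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in transitions[s].values()
        )
        for s in transitions
    }

# Extract the greedy policy with respect to the converged values.
policy = {
    s: max(
        transitions[s],
        key=lambda a: sum(p * (r + gamma * V[s2]) for p, s2, r in transitions[s][a]),
    )
    for s in transitions
}
print(V, policy)
```

Policy iteration alternates full policy evaluation with greedy improvement instead of folding the max into every backup.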

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Gym

The goal of this exercise is to install `gym` and learn how to use its interface.

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Monte Carlo control

The goal of this exercise is to implement on-policy Monte Carlo control on the Taxi gym environment.
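The core of the method is the first-visit Monte Carlo update of the action values from a complete episode. A minimal sketch, assuming a hypothetical recorded episode of `(state, action, reward)` tuples (the state and action names are made up, not Taxi's):

```python
from collections import defaultdict

gamma = 0.9

# Hypothetical recorded episode: (state, action, reward) at each step.
episode = [("s0", "right", 0.0), ("s1", "right", 0.0), ("s2", "pick", 1.0)]

# Walk the episode backwards, accumulating the discounted return G.
# Earlier time steps overwrite later ones, so the first visit wins.
G = 0.0
first_visit_return = {}
for s, a, r in reversed(episode):
    G = r + gamma * G
    first_visit_return[(s, a)] = G

# Average each first-visit return into the running estimate Q.
Q = defaultdict(float)
counts = defaultdict(int)
for sa, g in first_visit_return.items():
    counts[sa] += 1
    Q[sa] += (g - Q[sa]) / counts[sa]   # incremental mean

print(dict(Q))
```

In the full on-policy algorithm, episodes are generated with an $\epsilon$-greedy policy over `Q`, and this update is repeated after every episode.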

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Temporal difference

The goal of this exercise is to implement Q-learning on the Taxi gym environment.
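The heart of Q-learning is a single temporal-difference update per transition. A minimal sketch, with hypothetical state and action names (not Taxi's) and made-up hyperparameters:

```python
from collections import defaultdict

alpha, gamma = 0.5, 0.9
actions = ["left", "right"]
Q = defaultdict(float)

def q_update(s, a, r, s_next, done):
    # Q-learning bootstraps from the greedy value of the next state,
    # regardless of which action the behavior policy actually takes.
    target = r if done else r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

# One hypothetical transition: taking "right" in s0 leads to s1 with reward 1.
q_update("s0", "right", 1.0, "s1", done=False)
print(Q[("s0", "right")])   # 0.5 * (1.0 + 0.9 * 0.0) = 0.5
```

In the full agent, this update is applied after every step of an $\epsilon$-greedy behavior policy.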

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Eligibility traces

The goal of this exercise is to implement Q-learning with eligibility traces on the Gridworld environment.
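Eligibility traces spread each TD error backwards over recently visited state-action pairs. A simplified sketch of the Q($\lambda$) update with accumulating traces (hypothetical names and hyperparameters; the full Watkins' version also cuts traces after non-greedy actions):

```python
from collections import defaultdict

alpha, gamma, lam = 0.1, 0.9, 0.8
actions = ["up", "down"]
Q = defaultdict(float)
E = defaultdict(float)   # eligibility traces

def q_lambda_update(s, a, r, s_next):
    # TD error from the greedy value of the next state.
    delta = r + gamma * max(Q[(s_next, b)] for b in actions) - Q[(s, a)]
    E[(s, a)] += 1.0                      # accumulating trace
    for sa in list(E):
        Q[sa] += alpha * delta * E[sa]    # credit all recently visited pairs
        E[sa] *= gamma * lam              # decay the traces

# Two hypothetical transitions; the reward at the second step also
# propagates back to (s0, up) through its decayed trace.
q_lambda_update("s0", "up", 0.0, "s1")
q_lambda_update("s1", "up", 1.0, "s2")
print(Q[("s0", "up")], Q[("s1", "up")])
```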

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## Keras

The goal of this exercise is to get a quick overview of Keras and to understand why neural networks (and SGD) need i.i.d. samples.

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**

## DQN

The goal of this exercise is to implement DQN on the Cartpole balancing problem.

**Notebook:** download .ipynb or run on colab.

**Solution:** download .ipynb or run on colab.

**Presentation**

**Commented solution**