1.2. Introduction To Python

Python is a powerful, flexible programming language widely used for scientific computing, web development, desktop graphical user interfaces (GUIs), games, and much more. It has become the de facto standard for machine learning, with a huge variety of specialized libraries.

Python is a high-level, interpreted, object-oriented language. Its reference implementation (CPython) is written in C. Being interpreted means that the code is compiled to bytecode on the fly and executed at run time, without a separate compilation step. Its syntax is loosely similar to C, but without type declarations (whether a variable is an integer or a string is determined automatically from the context). It can be executed either directly in an interpreter (à la Matlab), in a script or in a notebook (as here).

The documentation on Python can be found at http://docs.python.org.

Many resources to learn Python exist on the Web.

This notebook only introduces you to the basics, so feel free to study additional resources if you want to master Python programming.

1.2.1. Installation

Python is usually already installed if you use Linux, may only be present as a very old version if you use macOS, and is probably not installed at all under Windows. Moreover, Python 2.7 reached its end of life on January 1, 2020, but is still the default on some distributions.

For these reasons, we strongly recommend installing Python 3 using the Anaconda distribution:

https://www.anaconda.com/products/individual

Anaconda offers all the major Python packages in one place, with a focus on data science and machine learning. To install it, simply download the installer / script for your OS and follow the instructions. Beware, the installation takes quite a lot of space on the disk (around 1 GB), so choose the installation path wisely.

To install packages (for example tensorflow), you just have to type in a terminal:

conda install tensorflow

Refer to the docs (https://docs.anaconda.com/anaconda/) to learn more. If you prefer your local Python installation, the pip utility can also install virtually any Python package:

pip install tensorflow

Another option is to run the notebooks in the cloud, for example on Google Colab:

https://colab.research.google.com/

Colab has all major ML packages already installed, so you do not have to set anything up. Under certain conditions, you can also use a GPU for free (but for at most 24 hours in a row).

1.2.2. Working With Python

There are basically three ways to program in Python: the interpreter for small commands, scripts for longer programs and notebooks (as here) for interactive programming.

1.2.2.1. Python Interpreter

To start the Python interpreter, simply type its name in a terminal under Linux:

user@machine ~ $ python
Python 3.7.4 (default, Jul 16 2019, 07:12:58) 
[GCC 9.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

You can then type anything at the prompt, for example a print statement:

>>> print("Hello World!")
Hello World!

To exit Python call the exit() function (or Ctrl+d):

>>> exit()

1.2.2.2. Scripts

Instead of using the interpreter, you can write scripts, whose instructions are executed sequentially. Simply create a text file called MyScript.py containing, for example:

# MyScript.py
# Implements the Hello World example.

text = 'Hello World!' # define a string variable

print(text)

The # character is used for comments. Execute this script by typing in a Terminal:

python MyScript.py

As Python is a scripting language, each instruction in the script is executed from the beginning to the end, except for the bodies of declared functions or classes, which only run when they are used later.

1.2.2.3. Jupyter Notebooks

A third, more recent (but very useful) option is to use Jupyter notebooks (formerly IPython notebooks).

Jupyter notebooks allow you to edit Python code in your browser (but also Julia, R, Scala…) and run it locally.

To launch a Jupyter notebook, type in a terminal:

jupyter notebook

and create a new notebook (Python 3)

When a Jupyter notebook already exists (here 1-Python.ipynb), you can also start it directly:

jupyter notebook 1-Python.ipynb

Alternatively, JupyterLab offers a more modern UI.

The main particularity of notebooks is that code is not executed sequentially from the beginning to the end, but only when a cell is explicitly run with Ctrl + Enter (the active cell stays the same) or Shift + Enter (the next cell becomes active).

To edit a cell, select it and press Enter (or double-click).

Q: In the next cell, run the Hello World! example:

text = 'Hello World!'
print(text)
Hello World!

There are three types of cells:

  • Code cells execute Python code (the default).

  • Markdown cells document the code nicely (code, equations), like the current one.

  • Raw cells are passed to nbconvert unmodified, which matters when you generate HTML or PDF versions of your notebook (not used here).

Beware that the order of execution of the cells matters!

Q: In the next three cells, put the following commands:

  1. text = "Text A"

  2. text = "Text B"

  3. print(text)

and run them in different orders (e.g. 1, 2, 3, 1, 3)

text = "Text A"
text = "Text B"
print(text)
Text B

Executing a cell can therefore influence cells before and after it. If you want to run the notebook sequentially, select Kernel/Restart & Run all.

Take a moment to explore the options in the menu (Insert cells, Run cells, Download as Python, etc).

1.2.3. Basics In Python

1.2.3.2. Data Types

As Python is a dynamically typed language, variables can be assigned without specifying their type: it will be inferred at execution time.

The only thing that counts is how you initialize them and which operations you perform on them.

a = 42          # Integer
b = 3.14159     # Double precision float
c = 'My string' # String
d = False       # Boolean
e = a > b       # Boolean

Q: Print these variables as well as their type:

print(type(a))

a = 42          # Integer
b = 3.14159     # Double precision float
c = 'My string' # String
d = False       # Boolean
e = a > b       # Boolean

print('Value of a is', a,', Type of a is:', type(a))
print('Value of b is', b,', Type of b is:', type(b))
print('Value of c is', c,', Type of c is:', type(c))
print('Value of d is', d,', Type of d is:', type(d))
print('Value of e is', e,', Type of e is:', type(e))
Value of a is 42 , Type of a is: <class 'int'>
Value of b is 3.14159 , Type of b is: <class 'float'>
Value of c is My string , Type of c is: <class 'str'>
Value of d is False , Type of d is: <class 'bool'>
Value of e is True , Type of e is: <class 'bool'>
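As a small illustration of the idea that the type follows the value rather than the name, the sketch below rebinds one variable to values of different types. The names value, first_type and second_type are only chosen for this example.

```python
value = 42                          # the name is bound to an integer
first_type = type(value).__name__   # 'int'

value = 'forty-two'                 # the same name is now bound to a string
second_type = type(value).__name__  # 'str'

print(first_type, second_type)
```

No declaration was needed in either case: the type is attached to the object, not to the variable.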

1.2.3.3. Assignment Statement And Operators

1.2.3.3.1. Assignment Statement

The assignment can be done for a single variable, or for a tuple of variables separated by commas:

m = 5 + 7

x, y = 10, 20

a, b, c, d = 5, 'Text', None, x==y

Q: Try these assignments and print the values of the variables.

m = 5 + 7
x, y = 10, 20
a, b, c, d = 5, 'Text', None, x==y

print(m, x, y, a, b, c, d)
12 10 20 5 Text None False
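One common use of tuple assignment is swapping two variables without a temporary one. This is only a short sketch of the mechanism described above:

```python
x, y = 10, 20   # tuple assignment
x, y = y, x     # the right-hand side is evaluated first, then unpacked

print(x, y)
```

After the second line, x holds the former value of y and vice versa.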

1.2.3.3.2. Operators

Most usual operators are available:

+ , - , * , ** , / , // , %
== , != , > , >= , < , <=
and , or , not

Q: Try them and comment on their behaviour. Observe in particular what happens when you add an integer and a float.

x = 3 + 5.
print(x, type(x))
8.0 <class 'float'>

Q: Notice how division is handled by Python 3 when dividing an integer by either an integer or a float:

print(5/2)
print(5/2.)
2.5
2.5
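To complement this, here is a minimal sketch contrasting the true division / with the floor division // and the modulo % listed among the operators above. The variable names are only illustrative.

```python
true_div = 7 / 2       # true division always returns a float
floor_div = 7 // 2     # floor division of two integers returns an integer
float_floor = 7 // 2.  # with a float operand, the result is a float
remainder = 7 % 2      # remainder of the division
power = 2 ** 10        # exponentiation

print(true_div, floor_div, float_floor, remainder, power)
```

If you need the integer quotient as in Python 2 or C, use // explicitly.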

1.2.3.4. Strings

A string in Python can be surrounded by either single or double quotes (there is no difference as long as they match). Triple double quotes let a string span several lines, the line breaks being kept in the string (equivalent to writing \n).

Q: Use the function print() to see the results of the following statements:

a = 'abc'

b = "def"

c = """aaa
bbb
ccc"""

d = "xxx'yyy"

e = 'mmm"nnn'

f = "aaa\nbbb\nccc"

a = 'abc'
b = "def"
c = """aaa
bbb
ccc"""
d = "xxx'yyy"
e = 'mmm"nnn'
f = "aaa\nbbb\nccc"

print(a)
print(b)
print(c)
print(d)
print(e)
print(f)
abc
def
aaa
bbb
ccc
xxx'yyy
mmm"nnn
aaa
bbb
ccc
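The outputs of c and f look identical. A quick way to check that the two notations really build the same string is to compare them, as in this small sketch (the names multi_line, escaped and same are only illustrative):

```python
multi_line = """aaa
bbb
ccc"""                       # line breaks typed directly in the string
escaped = "aaa\nbbb\nccc"    # line breaks written as \n

same = (multi_line == escaped)
print(same)
print(len(multi_line))       # 9 letters + 2 newline characters
```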

1.2.3.5. Lists

Python knows a number of compound data types, used to group together other values. The most versatile is the list, which can be written as a list of comma-separated values (items) between square brackets []. List items need not all have the same type.

a = ['spam', 'eggs', 100, 1234]

Q: Define a list of various variables and print it:

a = ['spam', 'eggs', 100, 1234]

print(a)
['spam', 'eggs', 100, 1234]

The number of items in a list is available through the len() function applied to the list:

len(a)

Q: Apply len() on the list, as well as on a string:

print("Length of the list:", len(a))
print("Length of the word spam:", len('spam'))
Length of the list: 4
Length of the word spam: 4

To access the elements of the list, indexing and slicing can be used.

  • As in C, indices start at 0, so a[0] is the first element of the list, a[3] is its fourth element.

  • Negative indices start from the end of the list, so a[-1] is the last element, a[-2] the last but one, etc.

  • Slices return a list containing a subset of elements, with the form a[start:stop], stop being excluded. a[1:3] returns the second and third elements. When omitted, start is 0 (a[:2] returns the first two elements) and stop is the length of the list (a[1:] has all elements of a except the first one).

Q: Experiment with indexing and slicing on your list.

print(a)
print("a[0]", a[0])
print("a[3]", a[3])
print("a[-1]", a[-1])
print("a[1:3]", a[1:3])
['spam', 'eggs', 100, 1234]
a[0] spam
a[3] 1234
a[-1] 1234
a[1:3] ['eggs', 100]
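The sketch below illustrates the slices with omitted bounds described above, plus one combination of a negative index with a slice. The variable names are only chosen for the example.

```python
a = ['spam', 'eggs', 100, 1234]

first_two = a[:2]       # start omitted: from index 0
all_but_first = a[1:]   # stop omitted: until the end of the list
last_two = a[-2:]       # negative start: counts from the end

print(first_two)
print(all_but_first)
print(last_two)
```

Note that a slice always returns a new list, even when it contains a single element.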

Copying lists can cause some problems:

a = [1,2,3] # Initial list

b = a # "Copy" the list by reference 

a[0] = 9 # Change one item of the initial list

Q: Now print a and b. What happens?

a = [1,2,3] # Initial list

b = a # "Copy" the list by reference 

a[0] = 9 # Change one item of the initial list

print('a :', a)
print('b :', b)
a : [9, 2, 3]
b : [9, 2, 3]

A: b = a does not make a copy of the content of a, but of its reference (pointer). So a and b both point to the same object.

The solution is to use the built-in copy() method of lists:

b = a.copy()

Q: Try it and observe the difference.

a = [1, 2, 3]
b = a.copy()
a[0] = 9

print(a)
print(b)
[9, 2, 3]
[1, 2, 3]
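One caveat worth knowing: copy() makes a shallow copy, so the same reference issue reappears one level down when the list contains other lists. The sketch below assumes a nested list and uses copy.deepcopy() from the standard library to copy every level. The names nested, shallow and deep are only illustrative.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = nested.copy()        # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)   # inner lists are copied as well

nested[0][0] = 9               # modify an inner list of the original

print(shallow)   # the change is visible through the shallow copy
print(deep)      # the deep copy is unaffected
```

For flat lists of numbers or strings, as in the example above, copy() is sufficient.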

Lists are objects, with a lot of different built-in methods (type help(list) in the interpreter or in a cell):

  • a.append(x): Add an item to the end of the list.

  • a.extend(L): Extend the list by appending all the items in the given list.

  • a.insert(i, x): Insert an item at a given position.

  • a.remove(x): Remove the first item from the list whose value is x.

  • a.pop(i): Remove the item at the given position in the list, and return it.

  • a.index(x): Return the index in the list of the first item whose value is x.

  • a.count(x): Return the number of times x appears in the list.

  • a.sort(): Sort the items of the list, in place.

  • a.reverse(): Reverse the elements of the list, in place.

Q: Quickly try out these methods, in particular append(), which we will use quite often.

a = [1, 2, 3]

a.append(4)

print(a)
[1, 2, 3, 4]
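For reference, here is a short sketch chaining the other methods listed above on a small list, with the state of the list after each call given in the comments. The list content is arbitrary.

```python
items = [3, 1, 2]

items.append(4)            # [3, 1, 2, 4]
items.extend([1, 5])       # [3, 1, 2, 4, 1, 5]
items.insert(0, 7)         # [7, 3, 1, 2, 4, 1, 5]
items.remove(1)            # removes the first 1: [7, 3, 2, 4, 1, 5]
last = items.pop(-1)       # returns 5, list is now [7, 3, 2, 4, 1]
position = items.index(4)  # index of the value 4
ones = items.count(1)      # number of occurrences of 1
items.sort()               # [1, 2, 3, 4, 7]
items.reverse()            # [7, 4, 3, 2, 1]

print(items, last, position, ones)
```

Most of these methods modify the list in place and return None, so writing items = items.sort() would lose the list.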

1.2.3.6. Dictionaries

Another useful data type built into Python is the dictionary. Unlike lists, which are indexed by a range of numbers from 0 to length -1, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys.

Dictionaries can be defined by curly braces {} instead of square brackets. The content is defined by key:item pairs, the item can be of any type:

tel = {
    'jack': 4098, 
    'sape': 4139
}

To retrieve an item, simply use the key:

tel_jack = tel['jack']

To add an entry to the dictionary, just use the key and assign a value to the item. It automatically extends the dictionary (warning: this can be dangerous, as a mistyped key silently creates a new entry!).

tel['guido'] = 4127

Q: Create a dictionary and add elements to it.

tel = {'jack': 4098, 'sape': 4139}
tel_jack = tel['jack']
tel['guido'] = 4127

print(tel)
print(tel_jack)
{'jack': 4098, 'sape': 4139, 'guido': 4127}
4098
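While writing to an unknown key silently creates it, reading an unknown key raises a KeyError. The sketch below shows three possible ways to deal with that: testing membership with in, using the get() method with a default value, or catching the exception. The default value 0 is an arbitrary choice for the example.

```python
tel = {'jack': 4098, 'sape': 4139}

known = 'jack' in tel            # membership test on the keys
unknown = 'guido' in tel

fallback = tel.get('guido', 0)   # returns the default instead of failing

try:
    number = tel['guido']        # raises KeyError: the key does not exist
except KeyError:
    number = None

print(known, unknown, fallback, number)
```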

The keys() method of a dictionary object returns a view of all the keys used in the dictionary, in the order in which you added them (if you want a sorted list, just apply the sorted() function to it).

a = tel.keys()
b = sorted(tel.keys())

values() does the same for the values of the items:

c = tel.values()

Q: Do it on your dictionary.

a = tel.keys()
b = sorted(a)
c = tel.values()

print(a)
print(b)
print(c)
dict_keys(['jack', 'sape', 'guido'])
['guido', 'jack', 'sape']
dict_values([4098, 4139, 4127])
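As the output shows, keys() and values() return dict_keys and dict_values objects, not plain lists. A minimal sketch of the practical difference, assuming you sometimes need a real list: the view follows later changes of the dictionary, while list() takes a snapshot.

```python
tel = {'jack': 4098, 'sape': 4139}

keys_view = tel.keys()         # live view on the keys
keys_list = list(tel.keys())   # independent list, frozen at this point

tel['guido'] = 4127            # extend the dictionary afterwards

print(keys_view)   # the view now also contains 'guido'
print(keys_list)   # the list still has two elements
```

Convert to a list when you need indexing (keys_list[0]) or list methods.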

1.2.3.7. If Statement

Perhaps the most well-known conditional statement type is the if statement. For example:

if x < 0 :
    print('x =', x, 'is negative')
elif x == 0:
    print('x =', x, 'is zero')
else:
    print('x =', x, 'is positive')

Q: Give a value to the variable x and see what this statement does.

x = 5

if x < 0 :
    print('x =', x, 'is negative')
elif x == 0:
    print('x =', x, 'is zero')
else:
    print('x =', x, 'is positive')
x = 5 is positive

Important! The main particularity of the Python syntax is that the scope of the different structures (functions, for, if, while, etc…) is defined by the indentation, not by curly braces {}. As long as the code stays at the same level, it is in the same scope:

if x < 0 :
    print('x =', x, 'is negative')
    x = -x
    print('x =', x, 'is now positive')
elif x == 0:
    print('x =', x, 'is zero')
else:
    print('x =', x, 'is positive')

A reasonable choice is to use four spaces for the indentation instead of tabs (configure your text editor if you are not using Jupyter).

When the scope is terminated, you need to come back to exactly the same level of indentation. Try this misaligned structure and observe what happens:

if x < 0 :
    print('x =', x, 'is negative')
 elif x == 0:
    print('x =', x, 'is zero')
 else:
    print('x =', x, 'is positive')

Jupyter is nice enough to highlight it for you, but not all text editors do that…

if x < 0 :
    print('x =', x, 'is negative')
 elif x == 0:
    print('x =', x, 'is zero')
 else:
    print('x =', x, 'is positive')
  File "<tokenize>", line 3
    elif x == 0:
    ^
IndentationError: unindent does not match any outer indentation level

In an if statement, there can be zero or more elif parts. The block to execute when a condition is true must be indented. The keyword "elif" is a shortened form of "else if", and is useful to avoid excessive indentation. An if ... elif ... elif ... sequence is a substitute for the switch or case statements found in other languages.

The elif and else statements are optional. You can also use the if statement alone:

a = [1, 2, 0]
has_zero = False
if 0 in a:
    has_zero = True

Note the use of the in keyword to know if an element exists in a list.
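The in keyword is not limited to lists. As a short sketch, it also tests for a substring in a string and for a key in a dictionary (the variable names are only illustrative):

```python
a = [1, 2, 0]

has_zero = 0 in a                   # element of a list
has_five = 5 in a
has_ell = 'ell' in 'Hello'          # substring of a string
has_jack = 'jack' in {'jack': 4098} # key of a dictionary

print(has_zero, has_five, has_ell, has_jack)
```

Since 0 in a is already a boolean, the if block above could also be written as has_zero = 0 in a.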

1.2.3.8. For loops

The for statement in Python differs a bit from what you may be used to in C, Java or Pascal.

Rather than always iterating over an arithmetic progression of numbers (as in Pascal), or giving the user the ability to define both the iteration step and halting condition (as in C), Python’s for statement iterates over the items of any sequence (such as a list or a string), in the order they appear in the sequence.

list_words = ['cat', 'window', 'defenestrate']

for word in list_words:
    print(word, len(word))

Q: Iterate over the list you created previously and print each element.

a = ['spam', 'eggs', 100, 1234]

for el in a:
    print(el)
spam
eggs
100
1234

If you do need to iterate over a sequence of numbers, the built-in function range() comes in handy. It generates arithmetic progressions:

for i in range(5):
    print(i)

Q: Try it.

for i in range(5):
    print(i)
0
1
2
3
4

range(N) generates N numbers, from 0 to N-1. In Python 3, it returns a lazy range object rather than a list; wrap it in list() if you need an actual list.

It is possible to specify a start value (0 by default), an end value (excluded) and even a step:

range(5, 10)
range(5, 10, 2)
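To see which values these calls produce, you can convert the range objects to lists, as in the sketch below. The last line with a negative step is an additional illustration, not one of the calls above.

```python
simple = list(range(5, 10))        # start included, stop excluded
stepped = list(range(5, 10, 2))    # every second value
backwards = list(range(10, 0, -3)) # a negative step counts down

print(simple)
print(stepped)
print(backwards)
```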

Q: Print the second and fourth elements of your list (['spam', 'eggs', 100, 1234]) using range().

for i in range(1, 4, 2):
    print(a[i])
eggs
1234

To iterate over all the indices of a list (0, 1, 2, etc), you can combine range() and len() as follows:

for idx in range(len(a)):
    print(idx, a[idx])

The enumerate() function returns the index and the content at the same time:

for i, val in enumerate(a):
    print(i, val)

for i, val in enumerate(a):
    print(i, val)
0 spam
1 eggs
2 100
3 1234

To iterate over the keys and items of a dictionary at the same time, use the items() method of the dictionary:

for key, val in tel.items():
    print(key, val)

Q: Print one by one all keys and values of your dictionary.

tel = {'jack': 4098, 'sape': 4139, 'guido': 4127}

for name, number in tel.items():
    print(name, number)
jack 4098
sape 4139
guido 4127

1.2.3.9. Functions

As in most procedural languages, you can define functions. Functions are defined by the keyword def. Only the parameters of the function are specified (without type), not the return values.

The content of the function has to be indented, as with for loops.

Return values can be specified with the return keyword. It is possible to return several values at the same time, separated by commas.

def say_hello_to(first, second):
    question = 'Hello, I am '+ first + '!'
    answer = 'Hello '+ first + '! I am ' + second + '!'
    return question, answer

To call that function, pass the arguments that you need and retrieve the returned values separated by commas.

question, answer = say_hello_to('Jack', 'Gill')

Q: Test it with different names as arguments.

def say_hello_to(first, second):
    question = 'Hello, I am '+ first + '!'
    answer = 'Hello '+ first + '! I am ' + second + '!'
    return question, answer

question, answer = say_hello_to('Jack', 'Gill')

print(question)
print(answer)
Hello, I am Jack!
Hello Jack! I am Gill!

Q: Redefine the tel dictionary {'jack': 4098, 'sape': 4139, 'guido': 4127} if needed, and create a function that returns a list of names whose number is higher than 4100.

def filter_dict(tel):
    answer = []
    for name, number in tel.items():
        if number > 4100:
            answer.append(name)
    return answer

tel = {'jack': 4098, 'sape': 4139, 'guido': 4127}
names = filter_dict(tel)
print(names)
['sape', 'guido']

Functions can take several arguments (with default values or not). The name of the argument can be specified during the call, so their order won’t matter.

Q: Try these different calls to the say_hello_to() function:

question, answer = say_hello_to('Jack', 'Gill')
question, answer = say_hello_to(first='Jack', second='Gill')
question, answer = say_hello_to(second='Jack', first='Gill')

question, answer = say_hello_to('Jack', 'Gill')
print(question)
print(answer)
question, answer = say_hello_to(first='Jack', second='Gill')
print(question)
print(answer)
question, answer = say_hello_to(second='Jack', first='Gill')
print(question)
print(answer)
Hello, I am Jack!
Hello Jack! I am Gill!
Hello, I am Jack!
Hello Jack! I am Gill!
Hello, I am Gill!
Hello Gill! I am Jack!

Default values can be specified for the last arguments, for example:

def add(a, b=1):
    return a + b

x = add(2, 3) # returns 5
y = add(2) # returns 3
z = add(a=4) # returns 5

Q: Modify say_hello_to() so that the second argument is your own name by default.

def say_hello_to(first, second="Julien"):
    question = 'Hello, I am '+ first + '!'
    answer = 'Hello '+ first + '! I am ' + second + '!'
    return question, answer

question, answer = say_hello_to('Jack', 'Gill')
print(question)
print(answer)

question, answer = say_hello_to('Jack')
print(question)
print(answer)
Hello, I am Jack!
Hello Jack! I am Gill!
Hello, I am Jack!
Hello Jack! I am Julien!

1.2.3.10. Classes

Classes are structures that allow you to:

  1. Store information in an object.

  2. Apply functions on this information.

In a nutshell:

class Foo:
    
    def __init__(self, val):
        self.val = val
        
    def print(self):
        print(self.val)
   
    def set_val(self, val):
        self.val = val
        self.print()
        
a = Foo(42)
a.print()

This defines the class Foo. Its first method is the constructor __init__ (not strictly mandatory, but almost always defined). It determines how the instance a is initialized by the call a = Foo(42). The first argument is self, which represents the current instance of the class. We can specify other arguments to the constructor (here val), which can be processed or stored.

Here we store val as an attribute of the instance with self.val. This data is specific to each created object: if you create b = Foo("deep learning"), the attribute val will have different values in the two instances. As always in Python, the type does not matter: it can be a float, a string, a numpy array, another object…

Attributes are accessible from each object as:

x = a.val

You can set its value by:

a.val = 12

Classes can define methods that can manipulate class attributes like any regular function. The first argument must always be self. With the self object, you can access all attributes (or other methods) of the instance.

With our toy class, a.set_val(34) sets the value just like a.val = 34 (and additionally prints it), and a.print() is the same as print(a.val).

For C++/Java experts: attributes and methods are always public in Python. If you want to mark an attribute as private, prepend its name with an underscore, e.g. self._val. It will then not be considered part of the API of the class (but can be read or written publicly anyway…).
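A minimal sketch of this underscore convention, using a hypothetical Account class (the class and its methods deposit() and get_balance() are invented for this illustration):

```python
class Account:

    def __init__(self, balance):
        self._balance = balance    # leading underscore: private by convention only

    def deposit(self, amount):
        self._balance += amount    # methods of the class use the attribute freely

    def get_balance(self):
        return self._balance       # intended way to read the value from outside


acc = Account(10)
acc.deposit(5)

print(acc.get_balance())   # intended access through the API
print(acc._balance)        # still works: nothing is enforced by Python
```

The underscore is a message to other programmers, not a protection mechanism.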

Q: Play around with this basic class, create different objects with different attributes, print them, change them, etc.

class Foo:
    
    def __init__(self, val):
        self.val = val
    
    def print(self):
        print(self.val)
    
    def set_val(self, val):
        self.val = val
        self.print()

a = Foo(42)
a.print()
print(a.val)
a.set_val(32)
a.print()
42
42
32
32

A major concept in object-oriented programming (OOP) is class inheritance. We will not use it much in these exercises, but let’s talk briefly about it.

Inheriting from a class means creating a new class that takes over the attributes and methods of another class (a kind of “copy” of the definition of the class). You can then add new attributes or methods, or override existing ones.

Example:

class Bar(Foo):
    def add(self, val):
        self.val += val
    def print(self):
        print("val =", self.val)

Bar is a child class of Foo. It inherits all attributes and methods, including __init__, print and set_val. It creates a new method add and overrides print: for instances of the Bar class, the new definition replaces the one from Foo (which still applies to instances of the Foo class). The constructor can also be overridden, for example to add new arguments:

class Bar(Foo):
    def __init__(self, val, val2):
        self.val2 = val2
        super().__init__(val)
    def add(self, val):
        self.val += val
    def print(self):
        print("val =", self.val)

super().__init__(val) calls the constructor of the Foo class (the “super” class of Bar), so it sets the value of self.val.

Q: Play around with inheritance to understand the concept.

class Bar(Foo):
    def __init__(self, val, val2):
        self.val2 = val2
        super().__init__(val)
    def add(self, val):
        self.val += val
    def print(self):
        print("val =", self.val)
        
b = Bar(12, 23)
b.add(30)
b.print()
val = 42
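super() is not limited to the constructor: a redefined method can also reuse the parent's version instead of replacing it entirely. The sketch below assumes a simplified variant of Foo and Bar with a hypothetical describe() method that returns a string (instead of printing), so the result is easy to inspect.

```python
class Foo:

    def __init__(self, val):
        self.val = val

    def describe(self):
        return str(self.val)


class Bar(Foo):

    def describe(self):
        # Reuse the parent's method and extend its result
        return "val = " + super().describe()


a = Foo(42)
b = Bar(42)

print(a.describe())
print(b.describe())
print(isinstance(b, Foo))   # an instance of Bar is also an instance of Foo
```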

1.2.4. Exercise

In cryptography, a Caesar cipher is a very simple encryption technique in which each letter in the plain text is replaced by a letter some fixed number of positions down the alphabet. For example, with a shift of 3, A would be replaced by D, B would become E, and so on. The method is named after Julius Caesar, who used it to communicate with his generals. ROT-13 (“rotate by 13 places”) is a widely used example of a Caesar cipher where the shift is 13. In Python, the key for ROT-13 may be represented by means of the following dictionary:

code = {'a':'n', 'b':'o', 'c':'p', 'd':'q', 'e':'r', 'f':'s',
        'g':'t', 'h':'u', 'i':'v', 'j':'w', 'k':'x', 'l':'y',
        'm':'z', 'n':'a', 'o':'b', 'p':'c', 'q':'d', 'r':'e',
        's':'f', 't':'g', 'u':'h', 'v':'i', 'w':'j', 'x':'k',
        'y':'l', 'z':'m', 'A':'N', 'B':'O', 'C':'P', 'D':'Q',
        'E':'R', 'F':'S', 'G':'T', 'H':'U', 'I':'V', 'J':'W',
        'K':'X', 'L':'Y', 'M':'Z', 'N':'A', 'O':'B', 'P':'C',
        'Q':'D', 'R':'E', 'S':'F', 'T':'G', 'U':'H', 'V':'I', 
        'W':'J', 'X':'K', 'Y':'L', 'Z':'M'}

Q: Your task in this final exercise is to implement an encoder/decoder of ROT-13. Once you’re done, you will be able to read the following secret message:

Jnvg, jung qbrf vg unir gb qb jvgu qrrc yrneavat??

The idea is to write a decode() function taking the message and the code dictionary as inputs, and returning the decoded message. It should iterate over all letters of the message and replace them with the decoded letter. If the letter is not in the dictionary (e.g. punctuation), keep it as it is.

# Function to decode a message
def decode(msg, code):
    result = ""
    for letter in msg:
        if letter in code:
            result += code[letter]
        else:
            result += letter
    return result

# Message to decode
msg = "Jnvg, jung qbrf vg unir gb qb jvgu qrrc yrneavat??"

# Decode the message
decoded = decode(msg, code)
print(decoded)
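As a possible variation, the code dictionary does not have to be typed by hand. The sketch below builds it from the alphabets provided by the standard string module and checks that applying ROT-13 twice gives back the original text, which is why the same function serves as encoder and decoder. The helper build_rot13() is a hypothetical name introduced for this example.

```python
import string


def build_rot13():
    """Build the ROT-13 substitution table for lower- and uppercase letters."""
    code = {}
    for alphabet in (string.ascii_lowercase, string.ascii_uppercase):
        for i, letter in enumerate(alphabet):
            # Shift by 13 positions, wrapping around after 'z' / 'Z'
            code[letter] = alphabet[(i + 13) % 26]
    return code


def decode(msg, code):
    """Replace each letter using the table, keep other characters as they are."""
    result = ""
    for letter in msg:
        if letter in code:
            result += code[letter]
        else:
            result += letter
    return result


code = build_rot13()
secret = decode("Hello World!", code)   # encoding
round_trip = decode(secret, code)       # decoding with the same function

print(secret)
print(round_trip)
```

Because the alphabet has 26 letters, shifting by 13 twice amounts to a full rotation.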