13. Exercises#
13.1. Introduction#
These exercises are meant for you to practice your skills. The sections are numbered according to the chapter numbering so you know where to look for the exercises belonging to a particular chapter.
You can deal with the exercises in several ways. The most convenient is to clone or download this notebook by clicking on the download icon at the top of this page, and work on the notebook on your own machine. You can also clone or download the entire repository by clicking on the github icon (the little cat). The exercises notebook is in the _sources
folder. That has the advantage that all data files are included (the data
folder).
Hints and/or solutions are often included; they can be displayed where it says ”➥ Click to see the solution” or ”➥ Give me a hint”. Unfortunately, rendering of these hints is not flawless: they only get displayed correctly in a hosted notebook environment (not in a static viewer such as nbviewer or github).
Give it a try:
➥ Click to see solution!
first_name = "John"
surname = "Doe"
print(f'good morning, {first_name} {surname}!')
Of course, you should always really try to solve it yourself before going to the easy-peasy zone
With some exercises, solutions can be loaded from file by uncommenting and running the commented line of code that looks like this:
# Uncomment the following line to see the solution or code hint
# %load ./exercise_solutions/exercise_1_1.py
Give it a try in the cell below.
# Uncomment the following line to see the solution or code hint
#%load ./exercise_solutions/exercise_0_0.py
13.2. Getting started#
13.2.1. Simple math#
First enter
import math
and press enter to load the math module.Inspect the value of
math.pi
and try out the functionmath.sqrt()
.Calculate the following (using
math.pi
andmath.sqrt()
where relevant).
\(\frac{4.6 + 1.2}{2.09}\)
\(3\times4^\frac{7}{12}\)
\(r = 6\)
\(\frac{2}{3}\times \pi r^3\)\(5 \times (\frac{4 + 2}{\sqrt{7} \times 9})\)
13.2.2. Fixing errors#
A
The code below does not work. Can you fix it?
print('I woke up at eight 'o clock this morning')
# Your code
B
The code below does not work. Can you fix it? Try to Google the error if you get stuck.
name = "John"
age = 42
print('My name is' + name + 'and my age is ' + age)
# Your code
C
In the code cell below, it was attempted to calculate the surface area of a circle. However, because of operator precedence the outcome is wrong! It should be 12.57. Can you correct this by using parentheses in the calculation? While you are at it, also add some spaces to make the code more readable.
import math
def circle_area(diameter):
return math.pi*1/2*diameter**2
circle_area(4)
25.132741228718345
➥ Click to see solution!
return math.pi * (1/2 * diameter)**2
D
Correct the calculations below by making use of grouping parentheses. It is all about Operator precedence.
10 - 7 // 2 * 3 + 1 = -2
45 % 10 / 2 = 0
27 * 2 + 46 ** 0.5 = 10
5 * 2 // 3 = 0
6 + 4 * 2 - 10 // 2 - 4 * 2 = -3
2 ** 3 ** 2 = 64
5 + 3 * 2 ** 2 = 32
➥ Give me a hint!
10 - 7 // 2 * (3 + 1)
is the solution for the first
➥ Give me the solution
10 - 7 // 2 * (3 + 1)
45 % (10 / 2)
(27 * 2 + 46) ** 0.5
5 * (2 // 3)
6 + (4 * 2 - 10) // 2 - 4 * 2
(2 ** 3) ** 2
(5 + 3) * 2 ** 2
13.2.3. Working with variables#
Given the code chunks below, and the expected value(s) as defined by the assert y == some_value
statement, give the variable(s) in the chunk a correct initial value.
The assert y == some_value
statement checks the expression y == some_value
and gives an error if it is not true
.
If your solution is correct, the chunk will not raise and AssertionError
A
x = 0 # correct value of x is?
y = x + 2
assert y == 3
# Correct code here
B
x = 0 # correct value of x is?
y = x * 2 + 5
assert y == 9
# Correct code here
C
x = 0 # correct value of x is?
y = 0 # correct value of y is?
z = x**y + 4
assert z == 20
# Correct code here
D
x = 0 # correct value of x is?
y = 0 # correct value of y is?
z = (x + y) / (y - x)
assert z == 4
# Correct code here
13.2.4. Types and Operators#
Given these variables:
x = 4
y = 5
z = 'hallo'
generate the requested outputs.
a) 'hallohallohallohallo'
b) 55555
c) 5'54545454'
x = 4
y = 5
z = 'hallo'
# Your code here
➥ Give me the solution
print(x * z)
print(str(y) * y)
print((str(y) + str(x)) * x)
13.2.5. Assignment shortcut operators#
The code cell below is not wrong, but it can be expressed more efficiently by using dedicated assignment operators. Can you improve by using these? Although flow control was not dealt with explicitly the code should be pretty obvious (this is one of the strengths of Python).
total = 1
fraction = 1
i = 1
for n in range(2, 6):
i = i + n
total = total + i
fraction = fraction / i
print(f'i is now {i}; the cumulative sum is {total} and the cumulative "fraction" is {fraction}')
i is now 3; the cumulative sum is 4 and the cumulative "fraction" is 0.3333333333333333
i is now 6; the cumulative sum is 10 and the cumulative "fraction" is 0.05555555555555555
i is now 10; the cumulative sum is 20 and the cumulative "fraction" is 0.005555555555555555
i is now 15; the cumulative sum is 35 and the cumulative "fraction" is 0.00037037037037037035
➥ Click to see solution!
i += n
total += i
fraction /= i
13.2.6. The Floor division and Modulo operators#
The floor division and modulo operators are handy tools if you want to work with currency, weight and distance units.
The modulo operator %
gives the remainder of a division:
for n in range(1,6):
print(f'{n} modulo 3 is {n % 3}')
1 modulo 3 is 1
2 modulo 3 is 2
3 modulo 3 is 0
4 modulo 3 is 1
5 modulo 3 is 2
The floor division operator //
gives the integer part of a division:
for n in range(1,6):
print(f'{n} floor divided by 3 is {n // 3}')
1 floor divided by 3 is 0
2 floor divided by 3 is 0
3 floor divided by 3 is 1
4 floor divided by 3 is 1
5 floor divided by 3 is 1
Now, suppose you want to create a tool converting from meters to imperial length units:
a yard is 0.9144 meters
a foot is 0.3048 meters
an inch is 2.54 centimeters
Using the above explained two operators, can you solve this problem? Use the correct assignment operator to store intermediate results.
meters = 234
yards = 0
feet = 0
inches = 0
# Your code
print(f'{meters} metric meters is equivalent to {yards} yards, {feet} feet and {inches} inches')
234 metric meters is equivalent to 0 yards, 0 feet and 0 inches
➥ Give me a hint!
yards = meters // yard
will calculate the yards from meters
➥ Give me another hint!
remainder = meters % yard
will calculate what is left after getting the yards.
➥ Give me the complete solution!
meters = 234
yards = 0
feet = 0
inches = 0
# Your code
yard = 0.9144
foot = 0.3048
inch = 2.54/100
yards = meters // yard
remainder = meters % yard
feet = remainder // foot
remainder %= foot
inches = remainder / inch # no need to floor here!
print(f'{meters} metric meters is equivalent to {yards} yards, {feet} feet and {inches} inches')
13.2.7. Run and edit a script#
On your computer, create a folder that will hold the exercises of this course. In it, put a copy of the script triangle_surface.py from the scripts folder. Run the script as demonstrated in chapter 2. Try out some other command-line arguments.
Change the script at some points and investigate the effect:
- comment-out (e.g. put a hash symbol #
in front of it) the import sys
statement
- change float(side)
to int(side)
- change sys.argv[1:]
to sys.argv[2]
and to sys.argv[2:]
- use your imagination and experiment further
13.3. Data types#
13.3.1. String methods#
From the string below, use methods from str
to capitalize the words and remove all whitespaces.
So, this string "The quick brown fox jumps over the lazy dog"
should become "TheQuickBrownFoxJumpsOverTheLazyDog"
.
Have a look at the str
help documentation to find out which functions you should use.
➥ Give me the solution
sentence = "The quick brown fox jumps over the lazy dog"
sentence = sentence.title()
sentence.replace(' ', '')
# or, in one chained statement:
#sentence.title().replace(' ', '')
sentence = "The quick brown fox jumps over the lazy dog"
13.3.2. String slicing#
Given this string:
letters = 'Een Aap Die Ijs Eet!'
write a slice that
a) prints 'Aap'
➥ Give me the solution
letters = 'Een Aap Die Ijs Eet!'
print(f'{letters[4:7]}')
b) prints 'EAD'
➥ Give me the solution
print(f'{letters[:9:4]}')
c) prints '!Eje Ae'
➥ Give me the solution
print(f'{letters[::-3]}')
d) prints ' !'
➥ Give me the solution
print(f'{letters[3::4]}')
letters = 'Een Aap Die Ijs Eet!'
#YOUR CODE
13.3.3. String formatting (challenge)#
Study this short string formatting tutorial and find out
how, given the variable
name = 'Bert'
, you can printHello, Bert, bye Bert'
in four different ways using 4 different string formattng techniques
➥ Give me the solution
name = 'Bert'
print("Hello, {}, bye {}".format(name, name))
print("Hello, {name1}, bye {name2}".format(name1 = name, name2 = name))
print("Hello, {0}, bye {0}".format(name))
print(f"Hello, {name}, bye {name}")
how to center a variable within a fixed 100-character wide field of spaces
➥ Give me the solution
print("XX{:^100}XX".format(name))
how to center a variable within a fixed 100-character wide field, filled up with asterisks
➥ Give me the solution
print("XX{:*^100}XX".format(name))
how to print the variable
number = 3124855.667698
with thousand separators and rounded at 2 decimals, right aligned in a field of 20 characters.
➥ Give me the solution
number = 3124855.667698
print("XX{:>20,.2f}XX".format(number))
name = 'Bert'
#YOUR CODE
13.3.4. Working with lists#
Given the starting list below, implement the required series of single-statement steps to go to each consecutive modification.
#feeding an iterable to the list constructor will give a list of individual elements, in this case the letters
letters = list('ABCDEFGHIJK')
a) ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M']
➥ Give me the solution
letters += list('LM')
# or letters += ['L', 'M']
# or letters.extend(['L', 'M'])
b) ['A', 'B', 'C', 'H', 'I', 'J', 'K', 'L', 'M']
➥ Give me the solution
letters[3:7] = []
c) ['A', 'B', 'C', 'X', 'Y', 'Z', 'L', 'M']
➥ Give me the solution
letters[3:7] = list('XYZ')
d) ['A', 'B', 'C', 'X', 'Y', 'Z', 'L', 'M', 'A', 'B', 'C', 'X', 'Y', 'Z', 'L', 'M']
➥ Give me the solution
letters = letters * 2
# or letters *= 2
letters = list('ABCDEFGHIJK')
# YOUR CODE
13.3.5. Joining strings#
The str
class has a method, join()
, that makes it possible to join a list (or other iterable) of strings into a single string, with separators in between.
Study its doc and, from the list
words = ['Ham', 'Spam', 'Jam', 'Mam', 'Dam', 'Ram']
generate the following (you may need to use list slicing as well).
a) 'HamSpamJamMamDamRam'
➥ Give me the solution
''.join(words)
b) 'Ham+-+Spam+-+Jam+-+Mam+-+Dam+-+Ram'
➥ Give me the solution
'+-+'.join(words)
c) 'Ham Jam Dam'
➥ Give me the solution
' '.join(words[::2])
d) 'RamnDamnMamnJamnSpamnHam'
➥ Give me the solution
'n'.join(words[::-1])
words = ['Ham', 'Spam', 'Jam', 'Mam', 'Dam', 'Ram']
# YOUR CODE
13.3.6. The difference between tuples and lists#
Given these two sequences, one a list and the other a tuple,
ingredients = ['sugar', 'butter', 'egg', 'flour']
additives = ('salt', 'vanilla', 'almond')
investigate whether the given operations can be done, and how to do them.
a) Add the additives to the ingredients
➥ Give me the solution
ingredients.extend(additives)
b) Replace sugar by saccharin
➥ Give me the solution
ingredients[0] = 'saccharin'
c) Add monosodium glutamate to the additives
➥ Give me the solution
Not possible
d) Make 'vanilla' uppercase
➥ Give me the solution
Not possible
e) Create this string. `almond and vanilla and salt and flour and egg and butter and sugar` This takes three separate operations.
➥ Give me the solution
ingredients.extend(additives)
ingredients.reverse()
' and '.join(ingredients)
ingredients = ['sugar', 'butter', 'egg', 'flour']
additives = ('salt', 'vanilla', 'almond')
## YOUR CODE
13.3.7. Choosing lists or tuples#
In essence, a tuple is a list that cannot be changed after it has been created. For the following use cases, choose the most appropriate of these two and implement the described use case.
a) Create a collection representing the board of the game ‘tic tac toe’ (Dutch: boter kaas en eieren) in which each ‘cell’ can (1) be empty - a space ‘ ‘ (2) have a cross ‘X’ or (3) a circle ‘O’. Remember, collections can be nested!
➥ Give me the solution
# using module pprint to get a nice 2D printed representation
import pprint
pp = pprint.PrettyPrinter(width = 20)
# top level should be tuple, "rows" should be lists
tic_tac_toe = ([' ', 'X', 'O'],
['X', 'O', ' '],
['O', 'X', ' '])
pp.pprint(tic_tac_toe)
b) A chess board has 64 fields, indicated by letters for the 8 columns (a-h) and numbers for the rows (1-8). See below.
So ‘a1’ is the lowerleft field and ‘h8’ the upper right one. A chess move consists of a piece (e.g. pawn, knight, queen etc) going from a field of origin (e.g. b2) to a field of destination (e.g. b4). The chess pieces are shown below.
Create a collection storing moves of a chess game. Demonstrate its use by creating and storing a few moves.
➥ Give me the solution
#moves must be a list because the should be added during the game!
moves = list()
# tuple is best for a single move; it always has the same three elements.
move = ('pawn', 'e2', 'e4')
moves.append(move)
move = ('pawn', 'e7', 'e5')
moves.append(move)
move = ('pawn', 'f2', 'f4')
moves.append(move)
move = ('pawn', 'e5', 'f4')
moves.append(move)
move = ('bishop', 'f1', 'c4')
moves.append(move)
print(moves)
13.3.8. Working with complex data structures#
Tuples are supposed to be immutable. Let’s explore the extend of this rule, and also some other behaviour of tuples, lists and dictionaries.
Given this tuple that is the top level container of this data structure:
ZP11 = ({'street': 'Zernikeplein',
'number': 11},
["Life Sciences", "Building", "ICT"],
('Wing A', 'Wing B', 'Wing C', 'Wing H'))
First think and deduce, then try and/or demonstrate these:
a) Add an element to ZP11; a single number (e.g. 1500)
➥ Give me the solution
This is not possible. If ZP11 were a list, this would have been the way:ZP11 += [1500]
b) Add an element, zipcode
, to the address dict (containing street
and number
)
➥ Give me the solution
ZP11[0]['zipcode'] = "9747AS"
c) Remove ‘ICT’ from the institutes
➥ Give me the solution
ZP11[1][2:3] = []
d) Add ‘Wing D’ to the wings of the building
➥ Give me the solution
Again this is not possible because it is a tuplee) Swap the list ["Life Sciences", "Building env", "ICT", "Engineering"]
for ["Economics]
➥ Give me the solution
Again this is not possible because it is a tuple. This fails: ZP11[1] = ["Economics"]
but there is a workaround!
ZP11[1][:] = ["Economics"]
This works because you can swap the contents of the entire list!
ZP11 = ({'street': 'Zernikeplein',
'number': 11},
["Life Sciences", "Building env", "ICT", "Engineering"],
('Wing A', 'Wing B', 'Wing C', 'Wing H'))
# YOUR CODE
13.3.9. sets#
Create a set named fruits_a
that has the values ‘apple’, ‘pear’ and ‘banana’. Create a second set, fruits_b
, that holds the values ‘banana’, ‘guava’ and ‘orange’.
Find the union, intersection and difference (both ways) between these sets.
➥ Give me a hint!
help(set)
Next, study the docs and find out
how to empty a set
what the difference is between
discard()
andremove()
how to find out whether one set is present within another set.
Demonstrate all these with code examples.
13.3.10. dict#
There are (at least) three ways to create and fill a dict. Use the suggested resources of chapter 1 to find them. Demonstrate these techniques to create a variable named inventory
holding this dict:
{513: 'hammer', 322: 'screwdriver', 462: 'nailgun'}
➥ Give me a hint!
Use a literal with the format `inventory = {key1: value1, key2: value2}`.➥ Give me another hint!
Create an empty dict, `inventory = dict()` and add individual items like this `inventory[513] = 'hammer'`.➥ Give me the complete solution!
inventory = {513: 'hammer', 322: 'screwdriver', 462: 'nailgun'}
print(inventory)
inventory = dict()
inventory[513] = 'hammer'
inventory[322] = 'screwdriver'
inventory[462] = 'nailgun'
print(inventory)
inventory = dict([[513, 'hammer'], [322, 'screwdriver'], [462, 'nailgun']])
print(inventory)
This technique uses a list of 2-element-lists passed as argument to the dict()
function..
13.4. Flow control#
13.4.1. if/else (1)#
Using the input()
function, ask the user their height in meters.
If the given height is below 1.2 meters or above 2.5 meters, give the message “are you sure you are not mistaken?”
If the height is between 1.2 meters and 2.5 meters, don’t give any warning.
In both cases, give their height in inches (one inch is 2.54 centimetres).
➥ Give me a hint!
height = input('Please give your height in meters: ')
height = int(height)
13.4.2. if/else (2)#
Using the input()
function, ask the user their full name. Check the number of separate names. For instance, my full name is Michiel Andries Noback (don’t tell anybody)! So the count is 3. If the count is higher than 3, ask the user whether they are catholic. If the answer is “no”, you must conclude they are royalty. If the count is 2 or 3, conclude they are a commoner. With a count of 1, they must be a popstar!
➥ Give me a hint!
name = input('Please give your full name: ')
if len(name.split(" ")) > 3:
pass
13.4.3. if/else (3)#
Using the input()
function, ask the user whether they like fish. Convert the input to lowercase so “Yes”, “yes” and “YES” are all correct. Check the resulting answer (it should be yes or no, nothing else).
Next, ask the user whether they like meat. With these two inputs, give a pizza suggestion (e.g. vegetariana for people who said “no” to both questions). Look up a pizza menu from your favourite restaurant.
Feel free to adjust or expand to your liking!
➥ Give me a hint!
correct_answers = {'yes', 'no'}
fish = input('do you like fish on your pizza [y/n]?')
if not fish.lower() in correct_answers: #or if fish.lower() != 'yes' or fish.lower() != 'no':
fish = input('Only "yes" or "no" allowed! Do you like fish on your pizza [y/n]?')
13.4.4. Looping with for
(1)#
Using the range()
function, the for
loop and the format string (f'text and {variable}'
) to create the output listed below.
3 : 9 : 1.7320508075688772
6 : 36 : 2.449489742783178
9 : 81 : 3.0
12 : 144 : 3.4641016151377544
➥ Give me a hint
for number in range(start, stop, step):
pass
# YOUR CODE
13.4.5. Looping with for
(2)#
From this list, [3, 6, 8, 2, 7, 5, 1, 4]
, create a list holding tuples of each consecutive pair of numbers, like this: [(3, 6), (8, 2), (7, 5), (1, 4)]
➥ Give me the solution
Many solutions possible; here are a few.result = list()
for i in range(0, len(l), 2):
result.append((l[i], l[i+1]))
result
or
l = [3, 6, 8, 2, 7, 5, 1, 4]
result = list()
for i in range(len(l)):
if i % 2 == 0:
result.append((l[i], l[i+1]))
result
or
result = list()
index = 0 # enumerate() is better but not dealt with
for n in l:
if index % 2 == 0:
result.append((l[index], l[index+1]))
index += 1
result
l = [3, 6, 8, 2, 7, 5, 1, 4]
# YOUR CODE
13.4.6. Looping with for
(3)#
Given the text presented below, report
the number of sentences
the number of words in each sentence
the words that are repeated at least once
the count of each letter (challenge: only alphabet characters) There are several looping scenarios to address both word count and letter count: nested or after each other.
➥ Give me a hint!
Besides somesplit()
ting and looping, this assignment requires the use of sets and/or dictionaries.
➥ Give me another hint!
letters = dict()
sentences = text.split(".")
for sentence_count, sentence in enumerate(sentences):
pass
➥ How do I count frequencies?
letter_freq = dict()
for char in text:
letter_freq.setdefault(char, 0)
# better than
#if char not in letter_freq:
# letter_freq[char] = 0
letter_freq[char] += 1
This is out of scope for this course, but too nice to leave unmentioned. See alse
here
from collections import Counter
Counter(text)
text = """Python is a high-level, general-purpose programming language.
Its design philosophy emphasizes code readability with the use of significant indentation.
Python is dynamically-typed and garbage-collected.
It supports multiple programming paradigms, including structured (particularly procedural),
object-oriented and functional programming."""
# YOUR CODE
13.4.7. Looping with while
#
Ask the user a for username of at least 6 characters long, consisting of only alphabet characters or number. Repeat the question as long as there is no correct username provided. If the input is correct, print “You are registered as <USERNAME>!” and exit the loop. If the input is empty, print “Registration cancelled.” and exit as well.
(NB this is not nice to do in Visual Studio Code)
➥ Give me a hint
This is a typical use case for `while True:`.➥ Give me the solution
while True:
username = input("Please enter a username")
print(f'You entered "{username}"')
if len(username) == 0:
print("Registration cancelled")
break
if len(username) > 6 and username.isalnum():
print(f'You are registered as "{username}"')
break
else:
print(f'Username incorrect: "{username}"; should be at least 6 characters of only letters and digits')
13.5. Functions#
13.5.1. Speed to distance#
Write a function, travelled_distance()
that accepts a speed (km/h) and an elapsed time (sec) and reports the travelled distance in meters, rounded to 2 decimals using the round()
function, like this:
With speed 10 km/h and elapsed time 25 sec, the travelled distance is 69.44 meters
➥ Give me a hint
speed_m_sec = (speed * 1000)/3600
# YOUR CODE
13.5.2. Refactor to use functions#
Refactor the assignment “Looping with for
(2)” to use functions for each sub-assignment. You will end up with at least 4 functions, something like this:
count_sentences(text)
: counts the number of sentencescount_words(sentence)
: counts the words in a sentenceprocess_words(sentence, word_dict)
: Counts the different words in a sentence. Ifword_dict
isNone
, create a new one, else use the given dict!get_letter_frequency(sentence, letter_dict)
: Determines the letter frequencies in the sentence. Ifletter_dict
isNone
, create a new one, else use the given dict!
You can ignore the fact that this may not be the most efficient way to get all statistics in a single analysis. Feel free to be creative in your refactoring.
13.5.3. Unit conversions#
Write a function that can be used to convert units in different temperature scales between each other.
For instance, temperature in degrees Fahrenheit to Celsius is °F = (°C * 9/5) + 32
.
Conversely, °C = (°F - 32) * 5/9
. And Kelvin to Celsius: K = C + 273.15
.
Your function should take three arguments: the input temp, the origin scale and the destination scale.
It should print the converted value as well as the input and output types.
13.5.4. DNA translations#
In its most basic form, DNA can be seen as a sequence of 3-letter words (called codons), each of which encodes a single amino acid or a stop signal.
So, this short DNA sequence ATGCCGGGCTAA
can be translated into Met-Pro-Gly-Stop. Or, in single-letter encoding MPG*
. For details on this central dogma of molecular biology, see here.
The snippet below generates the DNA codon translation table that can be used to translate DNA into protein.
bases = "tcag"
codons = [a + b + c for a in bases for b in bases for c in bases]
amino_acids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
codon_table = dict(zip(codons, amino_acids))
It is your task to create a function that can translate DNA into protein. The function should receive two arguments: the DNA sequence (seq
) and the position to start translation at, start
. This last argument should default to 0 (the first nucleotide).
Here is a test sequence you can use (parts of):
CATCATGAAATCGCTTGTCGCACTACTGCTGCTTTTAGTCGCTACTTCTGCCTTTGCTGACCAGTATGTAAATGGCTACA
CTAGAAAAGACGGAACTTATGTCAACGGCTATAC
As an optional extra argument you could implement that the function exits and returns the protein sequence when a stop (*
) is encountered, defaulting on False
.
➥ Give me a hint
To get hold of a codon, you will need something like `codon = seq[i: i+3]`➥ Give me the solution
def translate(seq, start=0, exit_on_stop=False):
seq = seq.lower().replace('\n', '')
peptide = ''
for i in range(start, len(seq), 3):
codon = seq[i: i+3]
amino_acid = codon_table.get(codon, '*') # Default return value when the codon is not three letters
peptide += amino_acid
if amino_acid == '*' and exit_on_stop:
break
return peptide
dna = "CATCATGAAATCGCTTGTCGCACTACTGCTGCTTTTAGTCGCTACTTCTGCCTTTGCTGACCAGTATGTAAATGGCTACACTAGAAAAGACGGAACTTATGTCAACGGCTATAC"
translate(dna, start = 2, exit_on_stop=False)
13.6. Reading and Writing files#
13.6.1. Patient before & after treatment data#
In the data
folder of this repo there is a file named patients_before_after.csv. If this assignment is viewed within a browser window wou can click on the link to access it.
Write a function named process_patient_data()
that
Reads in the data
Calculates the difference between “Before” and “After”
Writes the result to a file named
patient_data_processed.csv
with this format:
Patient,Difference
1,23
2,40
3,57
...
Hint: writing a new line to file is done with "\n"
.
➥ Give me the solution!
input_file = "./data/patients_before_after.csv"
output_file = "./data/patients_diff.csv"
in_handle = open(input_file, "r")
out_handle = open(output_file, "w")
out_handle.write("Patient,Difference\n")
#without enumerate()
count = 0
for line in in_handle:
count += 1
if (count == 1): continue
line = line.strip()
#automatic unpacking
(patient, before, after) = line.split(",")
before = int(before)
after = int(after)
diff = before - after
out_handle.write(f'{patient},{diff}\n')
print(patient, before, after, diff)
in_handle.close()
out_handle.close()
# YOUR CODE
13.6.2. Drug trial data#
In the data
folder of this repo there is a file named placebo_drug_test.csv. If this assignment is viewed within a browser window wou can click on the link to access it.
Write a function named analyse_drug_data()
that reads in the file and returns - as a dict data structure - the following information:
The number of subjects in the data
The mean, minimum and maximum of the Placebo treatment
The mean, minimum and maximum of the Valproate treatment
# YOUR CODE
13.7. Core library functions#
13.7.1. enumerate()
#
Look at the solution for exercise 05.1 and refactor it so that the enumerate function is used instead of this construct:
count = 0
for line in in_handle:
count += 1
#more code
13.7.2. chr()
#
Below is a secret message encoded in numbers. Each number represents a letter.
Can you crack it with the chr()
function?
Note, you will also need to use int()
➥ Give me the solution
encrypted = '73|32|104|111|112|101|32|71|111|111|103|108|101|32|105|115|32|109|111|114|101|32|99|97|114|101|102|117|108|108|32|119|105|116|104|32|109|121|32|100|97|116|97|33'
decrypted = ""
for c in encrypted.split("|"):
decrypted += chr(int(c))
decrypted
# The encoded message:
encrypted = '73|32|104|111|112|101|32|71|111|111|103|108|101|32|105|115|32|109|111|114|101|32|99|97|114|101|102|117|108|108|32|119|105|116|104|32|109|121|32|100|97|116|97|33'
13.7.3. ord()
#
Do the reverse of the above exercise: create an encrypted message from the given text.
Write a function for this task that accepts as arguments the message to encryp and the separator to use. The separator should default to '|'
. Note, you will also need to use str()
.
➥ Give me the solution
def encrypt(message, separator = '|'):
result = list()
for c in message:
result.append(str(ord(c)))
#the join method is more efficient than "text += text" concatenation
return separator.join(result)
message = "Keep it secret, keep is safe!"
print(encrypt(message))
message = "Keep it secret, keep is safe!"
# YOUR CODE
13.7.4. Sorting#
Given the list of tuples below, holding first names, last names, ages, lengths and weights of persons, sort it according to
a) First name
➥ Give me the solution
# default behaviour is already correct!
sorted(persons)
b) Age (from high to low)
➥ Give me the solution
sorted(persons, key = lambda person: person[2], reverse=True)
c) Last name and first name
➥ Give me the solution
# create a key of two person elements: last and then first name
sorted(persons, key = lambda person: (person[1], person[0]))
d) The person’s BMI
➥ Give me the solution
# publish a key representing a filed that is not actually present
sorted(persons, key = lambda person: person[4]/(person[3]**2) )
persons = [('John', 'Doe', 57, 180, 80),
('Anna', 'Doe', 62, 190, 97),
('Allie', 'Zandt', 42, 176, 78),
('Roger', 'Marre', 35, 181, 72),
("Z'duru", 'Ambarda', 39, 166, 70)]
# YOUR CODE
13.7.5. The sys
module#
Using the correct output streams, print messages to get output that looks like this:
➥ Give me the solution
import sys
print("Welcome to flight 464 to Amsterdam")
print("Uhm, I think we just lost an engine", file=sys.stderr)
print("Oh my God we are all going to die!", file=sys.stderr)
# YOUR CODE
13.7.7. The csv
module#
Repeat exercise 05.2 but this time use the csv module.
13.8. Comprehensions#
13.8.1. Starting simple#
Using comprehensions and the range(8)
function call as basic iterator, generate the following lists:
a) [2, 2, 2, 2, 2, 2, 2, 2]
➥ Give me the solution
[2 for x in range(8)]
b) [0, 1, 4, 9, 16, 25, 36, 49]
➥ Give me the solution
[x**2 for x in range(8)]
c) [0, 1, 2, 3, 4]
➥ Give me the solution
[x for x in range(8) if x < 5]
d) [1, 3, 5, 7]
➥ Give me the solution
[x for x in range(10) if x % 2 == 1]
e) [10, 12, 14, 16]
➥ Give me the solution
[x+10 for x in range(8) if x%2==0]
#YOUR CODE
13.8.2. Some more basic listcomps#
a) Go from this [1, 3, "H", 4, "K"]
to this [1, 9, "hh", 16, "kk"]
➥ Give me the solution
my_data = [1, 3, "H", 4, "K"]
[e.lower()*2 if type(e) == str else e**2 for e in my_data]
my_data = [1, 3, "H", 4, "K"]
# YOUR CODE
b) Go from this xy_coords = [(3, 4), (1, 5), (6, 4), (5, 1)]
to this [1, 4, 2]
; i.e. calculate the absolute difference between x and y of each coordinate.
➥ Give me the solution
[abs(x - y) for x, y in xy_coords if x > 1 and y > 1]
# or, alternatively
[abs(t[0] - t[1]) for t in xy_coords if t[0] > 1 and t[1] > 1]
xy_coords = [(3, 4), (1, 5), (6, 4), (5, 1)]
# YOUR CODE
13.8.3. Some more juice in comprehensions#
These exercises should all be solved using comprehensions.
a) from this list, [3, 1, 6, 5, 2]
create the list [[0, 1, 2], [0], [0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4], [0, 1]]
➥ Give me the solution
l = [3, 1, 6, 5, 2]
[[x for x in range(n)] for n in l]
b) from this list, [3, 1, 6, 5, 2]
create the list [[2, 1, 0], [0], [5, 4, 3, 2, 1, 0], [4, 3, 2, 1, 0], [1, 0]]
➥ Give me the solution
l = [3, 1, 6, 5, 2]
[[x for x in reversed(range(n))] for n in l]
c) from this list, [3, 1, 6, 5, 2]
create the dict {3: 3, 1: 0, 6: 15, 5: 10, 2: 1}
. This is the sum of 0 (or 1) to the corresponding number.
➥ Give me the solution
l = [3, 1, 6, 5, 2]
{n:[x for x in reversed(range(n))] for n in l}
l = [3, 1, 6, 5, 2]
# YOUR CODE
13.9. The exception mechanism#
13.9.1. From output to implementation#
Given the output below, which is the result from running a cell, implement the code that will generate exactly this output.
13.9.2. Add error handling#
In the data
folder you will find a file, dirty_data.csv. This serves as input to the Python script add_error_handling.py that is present in the scripts
folder. This is the script:
def read_file(file):
'''Reads in the given data and returns a list of tuples,
where each tuple contains exactly 3 numbers'''
result = list()
with open(file) as f:
for line in f:
print(f)
return result
def process_numbers(numbers):
'''Receives a list of tuples of 3 numbers each.
Then calculates the first number divided by the second number,
and this times third number for each tuple.
Returns a List of results.'''
for t in numbers:
pass
def main(args):
'''Receives the command-line argument with file and
processes this with the two functions.
Then reports the average of all processed cases.'''
numbers = read_file(args[1])
processed = process_numbers(numbers)
if __name__ == "__main__":
'''main entry point'''
import sys
main(sys.argv)
It is your task to add all types of error handling that will make this a robust piece of functionality. This involves catching and dealing with exceptions/errors, but also verifying user input and dealing with erroneous input in a correct and friendly manner.
13.10. Regular Expressions#
13.10.1. Restriction enzymes#
For each of these restriction enzyme recognition sequences, write the corresponding best / most specific / most concise regex. See here for the IUPAC ambiguity codes.
challenge write a function that does this automatically.
GAATTC (ECORI)
CCWGG (EcoRII)
GGATCNNNN (AlwI)
GGYRCC (BanI)
GCCNNNNNGGC (BglI)
➥ Give me the solution
ecor_r1 = "GA{2}T{2}C
eco_r2 = "CC[AT]GG"
alw_1 = "GGATC[GATC]{4}"
ban_1 = "GG[CT]CC"
bgl_1 = :"GCC[GATC]{5}GGC"
## YOUR CODE
13.10.2. Legal DNA and Protein#
DNA consists of the letters A, C, G, and T.
RNA consists of the letters A, C, G, and U.
Protein sequences can have the characters A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, and Y.
Using the re
module, write a function establishing what type of sequence is passed and returning ‘DNA’, ‘RNA’, ‘PROTEIN’ or ‘UNKNOWN’.
Demonstrate its usage within a list comprehension.
➥ Give me a hint
Use this:re.match('^YOUR PATTERN$', sequence):
➥ Give me the solution
def sequence_type(sequence):
if re.match('^[gatcGATC]+$', sequence):
return 'DNA'
if re.match('^[gaucGAUC]+$', sequence):
return 'RNA'
if re.match('^[ACDEFGHIKLMNPQRSTVWY]+$', sequence):
return 'protein'
return 'UNKNOWN'
[sequence_type(seq) for seq in sequences]
## some sequences
import re
sequences = ['CATCATGAAATCGCTTGTCGCACTACTGCTGCTTTTAGTCGCTACTTCTGCCTTTGCTGACCAGT', #DNA
'GACUAGCCAUUACGACGCAUUACAACGAUUAGACA', #RNA
'GPEAGQTVKHVHVHILPRKAGDFHRNDSIYDALEKHDREDKDSPALWRSEE', #PROTEIN
'THISISNOTABIOLOGICALSEQUENCE'] # UNKNOWN
## YOUR CODE
13.10.3. Dates#
In the Netherlands, we write dates as ‘DD/MM/YYYY’: ‘15/10/2024’ of ‘15-10-2024’. When the day or the month is single-digit, we ofte omit the leading zero (9
instead of 09
).
a) Write a pattern for Dutch dates and apply these to the given text below using re.findall()
.
➥ Give me the solution
re.findall("\d{1,2}[-/]\d{1,2}[-/]\d{4}", text)
text = """On 1-3-1947 the first foundations were laid for the European Union, in the Treaty of Brussels.
The Treaty of Rome, establishing the European Economic Community (EEC) was signed on 1-1-1957.
The date 1/11/1993 marked the start of the European Union. The Euro currency was adopted on 01-01-2002.
"""
## YOUR CODE
b) Using re.finditer()
and the correct Match object method(s), extract only the years of the dates out of the text, into a list.
Challenge: use a list comprehension for this.
➥ Give me the solution
[m.group(1) for m in re.finditer("\d{1,2}[-/]\d{1,2}[-/](\d{4})", text)]
## YOUR CODE
13.10.4. Telephone numbers#
In the Netherlands, phone numbers are written with an area code or mobile code (06) followed by the individual number. These two can be separated by a space or a hyphen. All phone numbers are 10 digits long. Besides this, phone numbers can be preceded by the country code, indicated by a +
with the country code. For simplicity’s sake, we’ll assume that all country codes are 2 digits long.
Given the list of phone numbers below,
phone_numbers = ['06-27635908',
'050-26653422',
'020-7654321',
'06-23456789',
'06 34567890',
'010 7736961',
'+31-30-4567892',
'+31 10 74638292',
'+31 70 3665287',
'+32 6 27830919']
a) Write a regular expression that describes all telephone numbers without country codes (numbers 1-6 above). Demonstrate its use with the list of numbers using re.match()
.
➥ Give me the solution
pattern = "\d{2,3}[\s-]\d{7,8}"
for pn in phone_numbers:
matched = re.match(pattern, pn)
if matched:
print(matched.group(0))
else:
print(f"no match: {pn}")
## YOUR CODE
b) Extend the regular expression from part a) to also include international-styled telephone numbers.
➥ Give me a hint
Use the `|`, "or" symbol for this regex.➥ Give me the solution
pattern = "\d{2,3}[\s-]\d{7,8}|\+\d{2}[\s-]\d{1,3}[ -]\d{7,8}"
for pn in phone_numbers:
matched = re.match(pattern, pn)
if matched:
print(matched.group(0))
else:
print(f"no match: {pn}")
## YOUR CODE
c) Extract both area codes and phone numbers with area codes and create a dict with a list of phone numbers per area. Treat 06 as a normal area.
➥ Give me a hint
Use the `()`, "grouping" symbols for this regex to extract parts of matches.➥ Give me the solution
results = dict()
pattern = "(\d{2,3})[\s-]\d{7,8}|\+\d{2}[\s-](\d{1,3})[ -]\d{7,8}"
for pn in phone_numbers:
matched = re.match(pattern, pn)
if matched:
whole = matched.group(0)
area = matched.group(1) if matched.group(2) is None else matched.group(2)
if not area.startswith('0'):
area = '0' + area
results.setdefault(area, []).append(whole)
else:
print(f"no match: {pn}")
print(results)
## YOUR CODE
13.11. Object-oriented Programming#
13.11.1. A Zoo class#
a). Study the docs of defaultdict
from module collections
here. You should use this container in this exercise. Create a Zoo class that can be used to hold a collection of zoo animals. Initialize it with a defaultdict
to hold animals that will be retrievable by species name.
➥ Give me the solution
from collections import defaultdict
class Zoo:
def __init__(self):
self.animals = defaultdict()
# YOUR CODE
b). Implement a method within class Zoo that can be used to add animals with a species name and animal name. It should have this signature:
def add_animal(self, species, name):
To demonstrate correctness, implement a method that can be used to fetch all names of animals of a given species, defaulting to an empty list if no such animal species is in the Zoo:
def get_animals(self, species):
Demonstrate: Create and fill the Zoo with some animals and fetch some existing and non-existing animals.
➥ Give me the solution
from collections import defaultdict
class Zoo:
def __init__(self):
self.animals = defaultdict()
def add_animal(self, species, name):
self.animals.setdefault(species, []).append(name)
def get_animals(self, species):
return self.animals.get(species, [])
z = Zoo()
z.add_animal("bear", "Jonas")
z.add_animal("bear", "Boris")
z.add_animal("parrot", "Tweety")
print(z.get_animals("bear"))
print(z.get_animals("lion"))
# YOUR CODE
c). Implement a string representation function (using the correct hook) that will display the following information when an instance of the class is printed:
Zoo {bear: Jonas & Boris; lemming: Peter & Roger & Anne; parrot: Tweety}
Note the animals are sorted by species name and animal name!
➥ Give me a hint
The function to be implemented is __str__()
.
Use "str".join(list)
to combine elements.
➥ Give me the solution
from collections import defaultdict
class Zoo:
## rest of code omitted!
def __str__(self):
str_repr = []
for a in sorted(self.animals.keys()):
sp = a + ": " + " & ".join(self.animals.get(a))
str_repr.append(sp)
return "Zoo {" + "; ".join(str_repr) + "}"
# YOUR CODE
d). Make the Zoo class iterable. When used in a for loop or other iteration context, instances should sequentially serve animals as a tuple. in this tuple, the first element should be a species name and the second the animal name. Extra credits to do this with comprehensions.
➥ Give me a hint
You should use the __iter__()
hook to serve a new data representation of the Zoo animals.
This needs a nested looping to create the requested datastructure - or a nested comprehension.
➥ Give me the solution
from collections import defaultdict
class Zoo:
def __init__(self):
self.animals = defaultdict()
def add_animal(self, species, name):
self.animals.setdefault(species, []).append(name)
def get_animals(self, species):
return self.animals.get(species, [])
def __str__(self):
str_repr = []
for a in sorted(self.animals.keys()):
sp = a + ": " + " & ".join(self.animals.get(a))
str_repr.append(sp)
return "Zoo {" + "; ".join(str_repr) + "}"
def __iter__(self):
## best but hard to read
iterator = [(species, animal) for species in self.animals for animal in self.animals[species]]
## more code but easy to read
#iterator = list()
#for species in self.animals:
# for animal in self.animals[species]:
# iterator.append((species, animal))
#
## delegate!
return iterator.__iter__()
zoo = Zoo()
zoo.add_animal("lemming", "Peter")
zoo.add_animal("lemming", "Roger")
zoo.add_animal("lemming", "Anne")
zoo.add_animal("bear", "Jonas")
zoo.add_animal("bear", "Boris")
zoo.add_animal("parrot", "Tweety")
for a in zoo:
print(a)
# YOUR CODE