# Python

## Lists

### Creating List: Manual Fill

lst = [0, 1, 2, 3]
print(lst)

[0, 1, 2, 3]


### Creating List: List Comprehension

lst = [i for i in range(4)]
print(lst)

[0, 1, 2, 3]


### Joining List with Blanks

# .join() only accepts strings, so convert every element to a string first
lst_to_string = list(map(str, lst))

# Join the list of strings
lst_join = ' '.join(lst_to_string)
print(lst_join)

0 1 2 3


### Joining List with Comma

# Join the list of strings
lst_join = ', '.join(lst_to_string)
print(lst_join)

0, 1, 2, 3


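### Checking All List Elements Are Equal: Method 1

Compare the list with a copy of itself shifted by one position: the two slices match only if every element equals its neighbour. Returns True if all elements are equal, and False otherwise.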

lst_unequal = [1, 1, 2, 3, 4, 4]
lst_equal = [0, 0, 0, 0, 0, 0]

print('-'*50)
print('Unequal List')
print('-'*50)

print(lst_unequal[1:])
print(lst_unequal[:-1])
bool_equal = lst_unequal[1:] == lst_unequal[:-1]
print(bool_equal)

print('-'*50)
print('Equal List')
print('-'*50)

print(lst_equal[1:])
print(lst_equal[:-1])
bool_equal = lst_equal[1:] == lst_equal[:-1]
print(bool_equal)

--------------------------------------------------
Unequal List
--------------------------------------------------
[1, 2, 3, 4, 4]
[1, 1, 2, 3, 4]
False
--------------------------------------------------
Equal List
--------------------------------------------------
[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]
True


### Checking All List Elements Are Equal: Method 2

Compare every element with the first one. Returns True if all elements are equal, and False otherwise. Here, all() checks that there is no False in the list of comparisons.

print('-'*50)
print('Unequal List')
print('-'*50)

lst_check = [i == lst_unequal[0] for i in lst_unequal]
bool_equal = all(lst_check)
print(bool_equal)

print('-'*50)
print('Equal List')
print('-'*50)

lst_check = [i == lst_equal[0] for i in lst_equal]
bool_equal = all(lst_check)
print(bool_equal)

--------------------------------------------------
Unequal List
--------------------------------------------------
False
--------------------------------------------------
Equal List
--------------------------------------------------
True
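
Both methods above can be wrapped into small helpers. The following is a minimal sketch; the function names all_equal and all_equal_hashable are hypothetical and not part of the original notebook, and the set-based variant assumes the elements are hashable.

```python
def all_equal(values):
    """Return True if every element equals the first one (True for an empty list)."""
    # The generator stops at the first mismatch, so no intermediate list is built.
    return all(value == values[0] for value in values)


def all_equal_hashable(values):
    """Same check via a set; only works when the elements are hashable."""
    return len(set(values)) <= 1


lst_unequal = [1, 1, 2, 3, 4, 4]
lst_equal = [0, 0, 0, 0, 0, 0]

print(all_equal(lst_unequal), all_equal_hashable(lst_unequal))  # False False
print(all_equal(lst_equal), all_equal_hashable(lst_equal))      # True True
print(all_equal([]))                                            # True
```

Passing a generator expression to all() avoids the temporary list of booleans and stops early on the first difference.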


## Sets

### Removing Duplicates from a List

Sets are useful for quickly removing duplicates from a list, essentially finding its unique values. Note that a set is unordered, so the order of the resulting list is not guaranteed.

lst_one = [1, 2, 3, 5]
lst_two = [1, 1, 2, 4]
lst_both = lst_one + lst_two
lst_no_duplicate = list(set(lst_both))

print(f'Original Combined List {lst_both}')
print(f'No Duplicated Combined List {lst_no_duplicate}')

Original Combined List [1, 2, 3, 5, 1, 1, 2, 4]
No Duplicated Combined List [1, 2, 3, 4, 5]
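
If the original order of first appearance matters, one option is dict.fromkeys, since dictionaries keep insertion order in Python 3.7+. This is a small sketch alongside the set approach, not a replacement for it; the variable name lst_ordered_unique is only illustrative.

```python
lst_one = [1, 2, 3, 5]
lst_two = [1, 1, 2, 4]
lst_both = lst_one + lst_two

# Dictionary keys are unique and remember the order in which they were first inserted
lst_ordered_unique = list(dict.fromkeys(lst_both))

print(lst_ordered_unique)  # [1, 2, 3, 5, 4]
```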


## Lambda, map, filter, reduce

### Lambda

The syntax is simple: lambda your_variables: your_operation

#### Add Function

add = lambda x, y: x + y
add(2, 3)

5


#### Multiply Function

multiply = lambda x, y: x * y
multiply(2, 3)

6


### Map

#### Create List

lst = [i for i in range(11)]
print(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


#### Map Square Function to List

square_element = map(lambda x: x**2, lst)

# This gives you a map object
print(square_element)

# You need to explicitly return a list
print(list(square_element))

<map object at 0x7f08c8620438>
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


#### Create Multiple Lists

lst_1 = [1, 2, 3, 4]
lst_2 = [2, 4, 6, 8]
lst_3 = [3, 6, 9, 12]


#### Map Add Function to Multiple Lists

add_elements = map(lambda x, y, z: x + y + z, lst_1, lst_2, lst_3)

# As before, convert the map object to a list to see the values
print(list(add_elements))

[6, 12, 18, 24]


### Filter

#### Create List

lst = [i for i in range(10)]
print(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


#### Filter multiples of 3

multiples_of_three = filter(lambda x: x % 3 == 0, lst)
print(list(multiples_of_three))

[0, 3, 6, 9]
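
For comparison, the map and filter calls above can usually be written as list comprehensions, which many Python programmers find easier to read. This is a sketch of the equivalents; the variable names squares, sums and multiples_of_three_list are illustrative only.

```python
lst = list(range(10))
lst_1 = [1, 2, 3, 4]
lst_2 = [2, 4, 6, 8]
lst_3 = [3, 6, 9, 12]

# Equivalent of map(lambda x: x**2, lst)
squares = [x ** 2 for x in lst]

# Equivalent of map(lambda x, y, z: x + y + z, lst_1, lst_2, lst_3)
sums = [x + y + z for x, y, z in zip(lst_1, lst_2, lst_3)]

# Equivalent of filter(lambda x: x % 3 == 0, lst)
multiples_of_three_list = [x for x in lst if x % 3 == 0]

print(squares)                  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(sums)                     # [6, 12, 18, 24]
print(multiples_of_three_list)  # [0, 3, 6, 9]
```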


### Reduce

The syntax is reduce(function, sequence). The function is applied cumulatively to the elements, from left to right. For example, if lst = [1, 2, 3, 4] and the function adds two numbers, reduce computes ((1 + 2) + 3) + 4.

from functools import reduce
sum_all = reduce(lambda x, y: x + y, lst)
# lst is still [0, 1, ..., 9] from the Filter section, so this is 0 + 1 + 2 + ... + 9
print(sum_all)
print(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9)

45
45
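
To see the left-to-right accumulation step by step, itertools.accumulate yields every intermediate result that reduce would compute internally. The sketch below also shows the optional third argument of reduce (an initial value), which is needed for empty sequences. The variable names are illustrative.

```python
from functools import reduce
from itertools import accumulate
import operator

lst = [1, 2, 3, 4]

# Intermediate results: 1, (1+2), ((1+2)+3), (((1+2)+3)+4)
running_totals = list(accumulate(lst, operator.add))
print(running_totals)  # [1, 3, 6, 10]

# reduce only returns the final value
total = reduce(operator.add, lst)
print(total)  # 10

# With an initial value, reduce also works on an empty sequence
empty_total = reduce(operator.add, [], 0)
print(empty_total)  # 0
```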


## Generators

• Why: generators are typically more memory-efficient than building a full list first
    • Imagine summing the integers from 0 to 1 trillion: storing them all in a list and then summing them would be very memory-inefficient.
    • A generator produces one integer at a time, so only the current value has to be held in RAM.
• What: a generator is basically a function that returns an iterator which we can step through one by one
• Types: generator functions and generator expressions
• Dependencies: we need a memory profiler for the measurements, so install it via pip install memory_profiler
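
As a quick illustration that needs no extra package, sys.getsizeof can compare the size of a list object with that of a generator object producing the same values. This is only a rough sketch: getsizeof reports the size of the container itself (not the integers it refers to), and the exact numbers depend on the Python version, so none are shown here.

```python
import sys

# A list holds references to all 100,000 values at once
numbers_list = [n for n in range(100_000)]

# A generator expression only stores its current state
numbers_gen = (n for n in range(100_000))

list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)

print(f'List object size: {list_size} bytes')
print(f'Generator object size: {gen_size} bytes')

# Both produce the same total when summed
print(sum(numbers_gen) == sum(numbers_list))  # True
```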

### Simple custom generator function example: sum 1 to 1,000,000

• What: let's create a simple generator that iterates through the integers 1 to 1,000,000 (inclusive) one by one, in steps of 1, and sum them
• How: a 2-step process with a while and a yield

# Load memory profiler
%load_ext memory_profiler

# Here we count up from 1
def create_numbers(end_number):
    current_number = 1

    # Step 1: while
    while current_number <= end_number:
        # Step 2: yield
        yield current_number

        current_number += 1

# Here we sum the integers 1 to 1,000,000 (inclusive) and measure the memory used
%memit total = sum(create_numbers(int(1e6)))
print(total)

peak memory: 46.50 MiB, increment: 0.28 MiB
500000500000


#### Without generator function: sum with list

• Say we don't use a generator, and instead hold a list of the integers 0 to 1,000,000 (inclusive) in memory and then sum them.
• Notice how the peak memory is almost double that of the generator!

%memit total = sum(list(range(int(1e6) + 1)))
print(total)

peak memory: 85.14 MiB, increment: 38.38 MiB
500000500000


#### Without generator function: sum with for loop

• Say we don't use a generator and don't put all our numbers into a list
• Notice how this is much better than summing a list: the increment is 0.00 MiB, although the reported peak is still higher than in the generator run

def sum_with_loop(end_number):
    total = 0
    # range(end_number + 1) already runs from 0 to end_number (inclusive)
    for i in range(end_number + 1):
        total += i
    return total

%memit total = sum_with_loop(int(1e6))
print(total)

peak memory: 54.49 MiB, increment: 0.00 MiB
500000500000


### Generator expression

• Like list/dictionary comprehensions, we can have generator expressions too
• We can quickly create generators this way, allowing us to make computations on the fly rather than pre-computing on a whole list/array of numbers
• This is more memory-efficient

# Define the list
list_of_numbers = list(range(10))

# Square each number using a list comprehension
list_of_results = [number ** 2 for number in list_of_numbers]
print(list_of_results)

# Use a generator expression to calculate the squares lazily
generator_of_results = (number ** 2 for number in list_of_numbers)
print(generator_of_results)

# Pull the values out one at a time
for idx in range(10):
    print(next(generator_of_results))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
<generator object <genexpr> at 0x7f08c85aa4f8>
0
1
4
9
16
25
36
49
64
81
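
One consequence of computing values on the fly is that a generator can only be consumed once. The short sketch below (variable names are illustrative) shows that a second pass yields nothing, and that next() accepts a default to avoid a StopIteration error on an exhausted generator.

```python
squares = (number ** 2 for number in range(3))

# The first pass consumes every value
first_pass = list(squares)
print(first_pass)  # [0, 1, 4]

# The generator is now exhausted, so a second pass is empty
second_pass = list(squares)
print(second_pass)  # []

# next() with a default returns the default instead of raising StopIteration
leftover = next(squares, None)
print(leftover)  # None
```

If the values are needed more than once, either recreate the generator or store the results in a list.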


## Decorators

• Decorators allow us to modify our original function, or even entirely replace it, without changing the function's code.
• It sounds mind-boggling, but a simple case I would like to illustrate here is using decorators for consistent logging (formatted print statements).
• For us to understand decorators, we'll first need to understand:
    • first class objects
    • *args
    • **kwargs

### First Class Objects

def outer():
    def inner():
        print('Inside inner() function.')

    # This returns a function.
    return inner

# Here, we call outer() and assign the function it returns (inner) to call_outer.
call_outer = outer()

# Then we call call_outer()
call_outer()

Inside inner() function.
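
Being first class also means functions can be passed as arguments and stored in data structures, which is what decorators rely on. A minimal sketch, with hypothetical helper names (shout, whisper, apply) that are not part of the original notebook:

```python
def shout(text):
    return text.upper()


def whisper(text):
    return text.lower()


def apply(func, text):
    # func is just an object here; we call it like any other function
    return func(text)


# Functions can be stored in a dictionary like any other value
styles = {'loud': shout, 'quiet': whisper}

print(apply(shout, 'Hello'))     # HELLO
print(styles['quiet']('Hello'))  # hello
```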


### *args

• This is used to indicate that positional arguments should be stored in the variable args
• * is for iterables and positional parameters

# Define dummy function
def dummy_func(*args):
    print(args)

# * allows us to extract positional variables from an iterable when we are calling a function
dummy_func(*range(10))

# If we do not use *, this would happen
dummy_func(range(10))

# See how we can have varying arguments?
dummy_func(*range(2))

(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
(range(0, 10),)
(0, 1)


### **kwargs

• ** is for dictionaries & key/value pairs

# New dummy function
def dummy_func_new(**kwargs):
    print(kwargs)

# Call function with no arguments
dummy_func_new()

# Call function with 2 arguments
dummy_func_new(a=0, b=1)

# Again, there's no limit to the number of arguments.
dummy_func_new(a=0, b=1, c=2)

# Or we can just pass the whole dictionary object if we want
new_dict = {'a': 0, 'b': 1, 'c': 2, 'd': 3}
dummy_func_new(**new_dict)

{}
{'a': 0, 'b': 1}
{'a': 0, 'b': 1, 'c': 2}
{'a': 0, 'b': 1, 'c': 2, 'd': 3}
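
The two forms can be combined in one signature, and the same * and ** operators unpack arguments when calling an ordinary function. The sketch below uses hypothetical functions (describe_call, add_three) purely for illustration.

```python
def describe_call(*args, **kwargs):
    # args is a tuple of positional arguments, kwargs a dict of keyword arguments
    return f'{len(args)} positional, {len(kwargs)} keyword'


def add_three(a, b, c=0):
    return a + b + c


positional = (1, 2)
keyword = {'c': 3}

print(describe_call(*positional, **keyword))  # 2 positional, 1 keyword

# Unpacking also works for functions with a fixed signature
print(add_three(*positional, **keyword))  # 6
```

Accepting both *args and **kwargs is what lets a wrapper function forward any call unchanged to another function.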


### Decorators as Logger and Debugging

• A simple way to remember the power of decorators is that the decorator (the nested function illustrated below) can
    • (1) access the passed arguments of the decorated function and
    • (2) access the decorated function
• Therefore this allows us to modify the behavior of the decorated function without changing its code

# Create a nested function that will be our decorator
def function_inspector(func):
    def inner(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f'Function args: {args}')
        print(f'Function kwargs: {kwargs}')
        print(f'Function return result: {result}')
        return result
    return inner

# Decorate our multiply function with our logger for easy logging
# of the arguments passed to the function and the result returned
@function_inspector
def multiply_func(num_one, num_two):
    return num_one * num_two

multiply_result = multiply_func(num_one=1, num_two=2)

Function args: ()
Function kwargs: {'num_one': 1, 'num_two': 2}
Function return result: 2
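
One side effect of wrapping is that the decorated function takes on the name and docstring of the inner wrapper. The standard library's functools.wraps copies that metadata across. The following is a sketch of the same logger with functools.wraps added; the single-line log format is an illustrative choice, not the original's.

```python
import functools


def function_inspector(func):
    # functools.wraps copies __name__, __doc__ and other metadata from func to inner
    @functools.wraps(func)
    def inner(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f'{func.__name__} called with args={args}, kwargs={kwargs}, returned {result}')
        return result
    return inner


@function_inspector
def multiply_func(num_one, num_two):
    """Multiply two numbers."""
    return num_one * num_two


multiply_result = multiply_func(3, num_two=4)

# Without functools.wraps this would print 'inner'
print(multiply_func.__name__)  # multiply_func
```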


## Dates

### Get Current Date

import datetime
now = datetime.datetime.now()
print(now)

2019-08-12 14:20:45.604849


### Get Clean String Current Date

# YYYY-MM-DD (%Y is the 4-digit year, so the century does not need to be hard-coded)
now.date().strftime('%Y-%m-%d')

'2019-08-12'
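
The reverse direction, turning a string back into a datetime, uses strptime with the same format codes. The sketch below uses a fixed timestamp (taken from the printed example above) instead of now() so that the results are reproducible.

```python
import datetime

# A fixed timestamp so the output does not depend on when the code runs
fixed = datetime.datetime(2019, 8, 12, 14, 20, 45)

# datetime -> string
as_text = fixed.strftime('%Y-%m-%d')
print(as_text)  # 2019-08-12

# string -> datetime (time fields default to midnight)
parsed = datetime.datetime.strptime(as_text, '%Y-%m-%d')
print(parsed)  # 2019-08-12 00:00:00

# date.isoformat() gives the same YYYY-MM-DD string without a format code
as_iso = fixed.date().isoformat()
print(as_iso)  # 2019-08-12
```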


# Number of business days (Mon-Fri) in January 2019; the end date 2019-02 is not included
import numpy as np
days = np.busday_count('2019-01', '2019-02')
print(days)

23


## Progress Bars

### TQDM

A simple progress bar; install it via pip install tqdm

from tqdm import tqdm
import time
for i in tqdm(range(100)):
    time.sleep(0.1)

100%|██████████| 100/100 [00:10<00:00,  9.91it/s]


## Check Paths

### Check Path Exists

• Check if directory exists

import os
directory = 'new_dir'
print(os.path.exists(directory))

# Shell command (run through IPython's ! prefix) to list all folders
!ls -d */

False
ls: cannot access '*/': No such file or directory


### Check Path Exists Otherwise Create Folder

• Check if directory exists, otherwise make folder

if not os.path.exists(directory):
    os.makedirs(directory)

# Shell command (run through IPython's ! prefix) to list all folders
!ls -d */

# Remove directory
!rmdir new_dir

new_dir/
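
The same check-then-create pattern can be written with pathlib, where mkdir(exist_ok=True) removes the need for an explicit existence test. This sketch works inside a temporary directory (an assumption made here so it leaves nothing behind); the variable names are illustrative.

```python
import tempfile
from pathlib import Path

# Work inside a throwaway directory so the example cleans up after itself
with tempfile.TemporaryDirectory() as scratch:
    directory = Path(scratch) / 'new_dir'

    existed_before = directory.exists()
    print(existed_before)  # False

    # Creates the folder; exist_ok=True makes a repeated call harmless
    directory.mkdir(parents=True, exist_ok=True)
    directory.mkdir(parents=True, exist_ok=True)

    existed_after = directory.is_dir()
    print(existed_after)  # True

    # Remove the (empty) directory again
    directory.rmdir()
    existed_at_end = directory.exists()
    print(existed_at_end)  # False
```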


## Exception Handling

### Try, Except, Finally: Error

• This is very handy and often exploited to patch up (save) poorly written code
• You can use general exceptions or specific ones like ValueError, KeyboardInterrupt and MemoryError to name a few

value_one = 'a'
value_two = 2

# Try the following lines of code (dividing a string by a number raises a TypeError)
try:
    final_sum = value_one / value_two
    print('Code passed!')
# If the code above fails, code nested under except will be executed
except:
    print('Code failed!')
# This will run no matter whether the nested code in try or except is executed
finally:
    print('Ran code block regardless of error or not.')

Code failed!
Ran code block regardless of error or not.


### Try, Except, Finally: No Error

• There won't be errors because you can divide 4 by 2

value_one = 4
value_two = 2

# Try the following lines of code
try:
    final_sum = value_one / value_two
    print('Code passed!')
# If the code above fails, code nested under except will be executed
except:
    print('Code failed!')
# This will run no matter whether the nested code in try or except is executed
finally:
    print('Ran code block regardless of error or not.')

Code passed!
Ran code block regardless of error or not.
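
A bare except hides which error actually occurred. Catching specific exception types, and using the optional else branch for the success path, makes the handling more explicit. The function safe_divide below is a hypothetical helper written for this sketch.

```python
def safe_divide(value_one, value_two):
    try:
        result = value_one / value_two
    # Raised when the operands cannot be divided, e.g. a string and a number
    except TypeError as error:
        return f'Type problem: {error}'
    # Raised when the divisor is zero
    except ZeroDivisionError:
        return 'Cannot divide by zero'
    # Runs only if the try block raised nothing
    else:
        return result


print(safe_divide(4, 2))  # 2.0
print(safe_divide(4, 0))  # Cannot divide by zero
print(safe_divide('a', 2))
```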


### Assertion

• This comes in handy when you want to enforce strict requirements on a certain value, shape, value type, or others

for i in range(10):
    assert i <= 5, 'Value is more than 5, rejected'
    print(f'Passed assertion for value {i}')

Passed assertion for value 0
Passed assertion for value 1
Passed assertion for value 2
Passed assertion for value 3
Passed assertion for value 4
Passed assertion for value 5

---------------------------------------------------------------------------

AssertionError                            Traceback (most recent call last)

<ipython-input-2-d9d077e139a9> in <module>
1 for i in range(10):
----> 2     assert i <= 5, 'Value is more than 5, rejected'
3     print(f'Passed assertion for value {i}')

AssertionError: Value is more than 5, rejected
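
An AssertionError can be caught like any other exception, which lets a loop keep going after a rejected value. The sketch below uses a hypothetical check_value helper. Keep in mind that Python removes assert statements when run with the -O flag, so asserts are best suited to internal sanity checks rather than validating user input.

```python
def check_value(value, limit=5):
    # Each assert states a requirement and the message shown when it fails
    assert isinstance(value, int), 'Value must be an integer'
    assert value <= limit, f'Value is more than {limit}, rejected'
    return value


accepted, rejected = [], []
for i in range(8):
    try:
        accepted.append(check_value(i))
    except AssertionError as error:
        rejected.append((i, str(error)))

print(accepted)  # [0, 1, 2, 3, 4, 5]
print(rejected)  # [(6, 'Value is more than 5, rejected'), (7, 'Value is more than 5, rejected')]
```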


## Asynchronous

### Concurrency, Parallelism, Asynchronous

• Concurrency (single CPU core): multiple threads on a single core take turns, so only 1 thread is making progress at any point
    • Think of 1 human, packing a box then wrapping the box
• Parallelism (multiple CPU cores): multiple threads on multiple cores running in parallel, so multiple threads can be making progress at the same time
    • Think of 2 humans, one packing a box, another wrapping the box
• Asynchronous: concurrency, but with a more dynamic system that switches between tasks whenever one is waiting, rather than waiting for a task to finish before moving to the next
    • Python's asyncio allows us to code asynchronously
    • Benefits:
        • Scales better if you need to wait on a lot of processes
            • Less memory (easier in this sense) to wait on thousands of coroutines than to run thousands of threads
        • Good for IO-bound uses like reading from/saving to databases while running other computation in the meantime
        • Easier management than multi-threaded processing like in parallel programming
            • In the sense that everything operates sequentially in the same memory space

### Asynchronous Key Components

• The three main parts are (1) coroutines and subroutines, (2) event loops, and (3) futures.
• Coroutines and subroutines
    • Subroutine: the usual function
    • Coroutine: a function that can pause and later resume where it stopped, keeping its state, so we can switch between tasks
        • async declares a function as a coroutine
        • await calls a coroutine and waits for its result
• Event loops
• Futures
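
The three components can be seen together in a few lines. This is a minimal sketch for a .py script (asyncio.run starts its own event loop and cannot be called from inside a running one, such as in a notebook); the coroutine name fetch_value and the short delays are assumptions made for illustration.

```python
import asyncio


# A coroutine: declared with async, can pause at each await
async def fetch_value(name, delay):
    await asyncio.sleep(delay)
    return f'{name} done'


async def main():
    # create_task schedules the coroutine on the event loop and returns a Task,
    # which is a kind of Future (a placeholder for a result that arrives later)
    task = asyncio.create_task(fetch_value('task', 0.01))
    is_future = isinstance(task, asyncio.Future)

    # Awaiting a coroutine directly runs it to completion before continuing
    direct = await fetch_value('direct', 0.01)

    # Awaiting the task retrieves its result once it is ready
    scheduled = await task
    return is_future, direct, scheduled


# asyncio.run creates the event loop, runs main() on it, and closes the loop
results = asyncio.run(main())
print(results)  # (True, 'direct done', 'task done')
```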

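### Synchronous 2 Function Calls

import time
import timeit

# Slow addition helper; the name add_numbers is an assumption
def add_numbers(num_1, num_2):
    print('Adding')
    time.sleep(1)
    return num_1 + num_2

def display_sum(num_1, num_2):
    total_sum = add_numbers(num_1, num_2)
    print(f'Total sum {total_sum}')

def main():
    # The second call only starts after the first has finished
    display_sum(2, 2)
    display_sum(2, 2)

start = timeit.default_timer()

main()

end = timeit.default_timer()
total_time = end - start

print(f'Total time {total_time:.2f}s')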

Adding
Total sum 4
Total sum 4
Total time 2.00s


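### Parallel 2 Function Calls

from multiprocessing import Pool
from functools import partial

start = timeit.default_timer()

# The pool's worker processes run display_sum(2, num_2=2) twice in parallel;
# the with block shuts the workers down once the results are back
with Pool() as pool:
    result = pool.map(partial(display_sum, num_2=2), [2, 2])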

end = timeit.default_timer()
total_time = end - start

print(f'Total time {total_time:.2f}s')

Adding
Total sum 4
Total sum 4
Total time 1.08s


### Asynchronous 2 Function Calls

For this use case, it takes about half the time of the synchronous version and is slightly faster than the parallel version (this will not always hold against parallel code, but it does in this case).

import asyncio
import timeit

# Slow addition coroutine; the name add_numbers is an assumption
async def add_numbers(num_1, num_2):
    print('Adding')
    await asyncio.sleep(1)
    return num_1 + num_2

async def display_sum(num_1, num_2):
    total_sum = await add_numbers(num_1, num_2)
    print(f'Total sum {total_sum}')

async def main():
    # .gather allows us to group coroutines and run them concurrently
    await asyncio.gather(display_sum(2, 2),
                         display_sum(2, 2))

start = timeit.default_timer()

# For .ipynb, an event loop is already running, so we can await directly
await main()

# For .py
# asyncio.run(main())

end = timeit.default_timer()
total_time = end - start

print(f'Total time {total_time:.4f}s')

Adding