Python¶

Lists¶

Creating List: Manual Fill¶

lst = [0, 1, 2, 3]
print(lst)

[0, 1, 2, 3]

Creating List: List Comprehension¶

lst = [i for i in range(4)]
print(lst)

[0, 1, 2, 3]

Joining List with Blanks¶

# To use .join(), every element of your list needs to be a string
lst_to_string = list(map(str, lst))

# Join the list of strings
lst_join = ' '.join(lst_to_string)
print(lst_join)

0 1 2 3

Joining List with Comma¶

# Join the list of strings
lst_join = ', '.join(lst_to_string)
print(lst_join)

0, 1, 2, 3

Checking Lists Equal: Method 1¶

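Returns True if all elements in the list are equal, and False otherwise. The list is compared against itself shifted by one position, so every element must match its neighbour.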

lst_unequal = [1, 1, 2, 3, 4, 4]
lst_equal = [0, 0, 0, 0, 0, 0]

print('-'*50)
print('Unequal List')
print('-'*50)

print(lst_unequal[1:])
print(lst_unequal[:-1])
bool_equal = lst_unequal[1:] == lst_unequal[:-1]
print(bool_equal)

print('-'*50)
print('Equal List')
print('-'*50)

print(lst_equal[1:])
print(lst_equal[:-1])
bool_equal = lst_equal[1:] == lst_equal[:-1]
print(bool_equal)

--------------------------------------------------
Unequal List
--------------------------------------------------
[1, 2, 3, 4, 4]
[1, 1, 2, 3, 4]
False
--------------------------------------------------
Equal List
--------------------------------------------------
[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]
True

Checking Lists Equal: Method 2¶

Returns True if all elements in the list are equal, and False otherwise. Each element is compared with the first one, and all checks that there is no False in the resulting list.

print('-'*50)
print('Unequal List')
print('-'*50)

lst_check = [i == lst_unequal[0] for i in lst_unequal]
bool_equal = all(lst_check)
print(bool_equal)

print('-'*50)
print('Equal List')
print('-'*50)

lst_check = [i == lst_equal[0] for i in lst_equal]
bool_equal = all(lst_check)
print(bool_equal)

--------------------------------------------------
Unequal List
--------------------------------------------------
False
--------------------------------------------------
Equal List
--------------------------------------------------
True
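A third option, offered here as a sketch rather than as one of the original two methods, is to collapse the list into a set: a list whose elements are all equal yields a set with at most one member. The helper name `all_equal` is hypothetical, and this approach assumes the elements are hashable.

```python
def all_equal(items):
    """Return True if every element of `items` is equal (hypothetical helper)."""
    # A set keeps one copy of each distinct value; elements must be hashable
    return len(set(items)) <= 1

lst_unequal = [1, 1, 2, 3, 4, 4]
lst_equal = [0, 0, 0, 0, 0, 0]

print(all_equal(lst_unequal))  # False
print(all_equal(lst_equal))    # True
```

An empty list counts as "all equal" under this definition, which matches the behaviour of all() on an empty list.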

Sets¶

Removing Duplicate from List¶

Sets are very useful for quickly removing duplicates from a list, leaving only the unique values. Note that a set is unordered, so the original order of the list is not guaranteed to survive.

lst_one = [1, 2, 3, 5]
lst_two = [1, 1, 2, 4]
lst_both = lst_one + lst_two
lst_no_duplicate = list(set(lst_both))

print(f'Original Combined List {lst_both}')
print(f'No Duplicated Combined List {lst_no_duplicate}')

Original Combined List [1, 2, 3, 5, 1, 1, 2, 4]
No Duplicated Combined List [1, 2, 3, 4, 5]
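If the order of first appearance matters, one possible alternative (an assumption on my part, not part of the original example) is `dict.fromkeys`, since dictionaries keep insertion order in Python 3.7+. The variable name `lst_ordered_unique` is illustrative.

```python
lst_both = [1, 2, 3, 5, 1, 1, 2, 4]

# dict keys are unique and keep insertion order (Python 3.7+)
lst_ordered_unique = list(dict.fromkeys(lst_both))

print(lst_ordered_unique)  # [1, 2, 3, 5, 4]
```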

Lambda, map, filter, reduce, partial¶

Lambda¶

The syntax is simple: lambda your_variables: your_operation

Add Function¶

add = lambda x, y: x + y
add(2, 3)

Multiply Function¶

multiply = lambda x, y: x * y 
multiply(2, 3)
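A common place where a lambda reads well is as a short throwaway `key` function. The sketch below uses made-up sample data (`pairs`) and compares a lambda with the equivalent `def`, to show that the two forms behave the same.

```python
pairs = [(1, 'b'), (3, 'a'), (2, 'c')]

# Lambda used inline as the sort key: sort by the second item of each tuple
sorted_by_letter = sorted(pairs, key=lambda pair: pair[1])

# The same key written as a regular named function
def second_item(pair):
    return pair[1]

sorted_with_def = sorted(pairs, key=second_item)

print(sorted_by_letter)  # [(3, 'a'), (1, 'b'), (2, 'c')]
```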

Map¶

Create List¶

lst = [i for i in range(11)]
print(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Map Square Function to List¶

square_element = map(lambda x: x**2, lst)

# This gives you a map object
print(square_element)

# You need to explicitly return a list
print(list(square_element))

<map object at 0x7f08c8620438>
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Create Multiple List¶

lst_1 = [1, 2, 3, 4]
lst_2 = [2, 4, 6, 8]
lst_3 = [3, 6, 9, 12]

Map Add Function to Multiple Lists¶

add_elements = map(lambda x, y, z : x + y + z, lst_1, lst_2, lst_3)
print(list(add_elements))

[6, 12, 18, 24]

Filter¶

Create List¶

lst = [i for i in range(10)]
print(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Filter multiples of 3¶

multiples_of_three = filter(lambda x: x % 3 == 0, lst)
print(list(multiples_of_three))

[0, 3, 6, 9]
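For comparison, map and filter calls like the ones above can usually be written as list comprehensions as well. This is only a sketch of the equivalence; which form to use is a matter of taste.

```python
lst = list(range(10))

# filter + lambda versus a comprehension with a condition
via_filter = list(filter(lambda x: x % 3 == 0, lst))
via_comprehension = [x for x in lst if x % 3 == 0]

# map + lambda versus a comprehension with an expression
squares_via_map = list(map(lambda x: x ** 2, lst))
squares_via_comprehension = [x ** 2 for x in lst]

print(via_filter == via_comprehension)              # True
print(squares_via_map == squares_via_comprehension)  # True
```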

Reduce¶

The syntax is reduce(function, sequence). The function is applied cumulatively to the elements of the sequence, from left to right. For example, with lst = [1, 2, 3, 4] and an addition function, the result is ((1 + 2) + 3) + 4. The code below reuses lst from the filter example, which holds 0 to 9.

from functools import reduce
sum_all = reduce(lambda x, y: x + y, lst)
# lst holds 0 to 9, so this computes 0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9
print(sum_all)
print(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9)

45
45
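To make the left-to-right grouping visible, the sketch below records each pairwise call that reduce makes. The helper `traced_add` and the list `steps` are hypothetical names used only for illustration. It also shows the optional third argument, which seeds the accumulator.

```python
from functools import reduce

steps = []

def traced_add(x, y):
    # Record each pairwise call so the left-to-right order is visible
    steps.append((x, y))
    return x + y

total = reduce(traced_add, [1, 2, 3, 4])
print(steps)  # [(1, 2), (3, 3), (6, 4)]
print(total)  # 10

# An optional third argument is used as the starting value
total_from_100 = reduce(traced_add, [1, 2, 3, 4], 100)
print(total_from_100)  # 110
```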

Partial¶

Allows us to predefine and freeze some of a function's arguments. Combined with lambda, it gives us more flexibility beyond lambda's restriction to a single expression.

from functools import partial

def display_sum_three(a, b, c):
    sum_all = a + b + c
    print(f'Sum is {sum_all}')

fixed_args_func = partial(display_sum_three, b=3, c=4)

# Given fixed arguments b=3 and c=4
# We add the new variable against the fixed arguments
var_int = 1
fixed_args_func(var_int)

# More advanced mapping with partial
# Add a variable from 0 to 9 to the constants
print('-'*50)
_ = list(map(fixed_args_func, list(range(10))))

# How about using a lambda to override a frozen constant without
# declaring your function again?
print('-'*50)
_ = list(map(lambda x: fixed_args_func(x, b=2), list(range(10))))

Sum is 8
--------------------------------------------------
Sum is 7
Sum is 8
Sum is 9
Sum is 10
Sum is 11
Sum is 12
Sum is 13
Sum is 14
Sum is 15
Sum is 16
--------------------------------------------------
Sum is 6
Sum is 7
Sum is 8
Sum is 9
Sum is 10
Sum is 11
Sum is 12
Sum is 13
Sum is 14
Sum is 15
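The same idea can be sketched with a function that returns a value instead of printing, which makes the frozen arguments easier to inspect. The function `scale_and_shift` and the names derived from it are hypothetical; `keywords` is a standard attribute of partial objects.

```python
from functools import partial

def scale_and_shift(value, scale, shift):
    return value * scale + shift

# Freeze scale=2 and shift=1; only `value` is left open
double_plus_one = partial(scale_and_shift, scale=2, shift=1)

results = list(map(double_plus_one, range(5)))
print(results)  # [1, 3, 5, 7, 9]

# A keyword passed at call time overrides the frozen one
overridden = double_plus_one(3, shift=10)
print(overridden)  # 16

# The frozen keyword arguments are stored on the partial object
print(double_plus_one.keywords)  # {'scale': 2, 'shift': 1}
```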

Generators¶

Why: generators are typically more memory-efficient than building a full list before processing it
- Imagine wanting to sum the integers 0 to 1 trillion: holding all of those numbers in a list and then summing them would be very memory-inefficient.
- A generator produces one number at a time, so only the current value has to sit in memory at each step.
What: a generator is basically a function that returns an iterator, which we can step through one value at a time
Types: generator functions and generator expressions
Dependencies: the examples below use a memory profiler, installed via pip install memory_profiler
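As a rough illustration that needs no extra dependency, the sketch below compares the size of a list object with the size of a generator object using `sys.getsizeof`. This is an assumption-laden shortcut: `getsizeof` reports only the container itself (not the integers it references), and the exact byte counts vary across Python versions, so only the relative difference is meaningful.

```python
import sys

# A list stores references to all 100,000 values up front
numbers_list = [i for i in range(100_000)]

# A generator stores only its current state, whatever the range size
numbers_gen = (i for i in range(100_000))

list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)
print(f'List object: {list_size} bytes')
print(f'Generator object: {gen_size} bytes')

# Both produce the same total when consumed
list_total = sum(numbers_list)
gen_total = sum(numbers_gen)
print(list_total == gen_total)  # True
```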

Simple custom generator function example: sum 1 to 1,000,000¶

What: let's create a simple generator that iterates through the integers 1 to 1,000,000 (inclusive) one by one, incrementing by 1 at each step, and sum them
How: a 2-step process with a while and a yield

# Load memory profiler
%load_ext memory_profiler

# Here we take a step from 1
def create_numbers(end_number):
    current_number = 1

    # Step 1: while
    while current_number <= end_number:
        # Step 2: yield
        yield current_number

        # Add to current number
        current_number += 1

# Here we sum the integers 1 to 1,000,000 (inclusive) and measure memory usage
%memit total = sum(create_numbers(int(1e6)))
print(total)

peak memory: 46.50 MiB, increment: 0.28 MiB
500000500000

Without generator function: sum with list¶

Say we don't use a generator, and instead hold the integers 0 to 1,000,000 (inclusive) in a list in memory and then sum them.
Notice how the peak memory is almost double that of the generator!

%memit total = sum(list(range(int(1e6) + 1)))
print(total)

peak memory: 85.14 MiB, increment: 38.38 MiB
500000500000

Without generator function: sum with for loop¶

Say we don't use a generator and don't put all our numbers into a list.
Notice how this is much better than summing a list: the peak is higher than in the generator run, but the increment is 0.00 MiB, so the loop itself adds no measurable memory.

def sum_with_loop(end_number):
    total = 0
    # Sum the integers 1 to end_number (inclusive)
    for i in range(1, end_number + 1):
        total += i

    return total

%memit total = sum_with_loop(int(1e6))
print(total)

peak memory: 54.49 MiB, increment: 0.00 MiB
500000500000

Generator expression¶

Like list and dictionary comprehensions, we can have generator expressions too
We can quickly create generators this way, allowing us to compute values on the fly rather than pre-computing a whole list/array of numbers
- This is more memory efficient

# Define the list
list_of_numbers = list(range(10))

# Square each number using a list comprehension
list_of_results = [number ** 2 for number in list_of_numbers]
print(list_of_results)

# Use a generator expression to calculate the same squares lazily
generator_of_results = (number ** 2 for number in list_of_numbers)
print(generator_of_results)

for idx in range(10):
    print(next(generator_of_results))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
<generator object <genexpr> at 0x7f08c85aa4f8>
0
1
4
9
16
25
36
49
64
81
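One caveat worth sketching: a generator can be consumed only once. After it is exhausted, iterating again yields nothing, and calling next() without a default raises StopIteration. The variable names below are illustrative.

```python
squares = (n ** 2 for n in range(3))

first_pass = list(squares)   # consumes the generator
second_pass = list(squares)  # nothing left to yield
print(first_pass)   # [0, 1, 4]
print(second_pass)  # []

fresh = (n ** 2 for n in range(3))
first_value = next(fresh)      # 0
remaining_total = sum(fresh)   # 1 + 4
# A default avoids StopIteration once the generator is exhausted
after_end = next(fresh, None)
print(first_value, remaining_total, after_end)  # 0 5 None
```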

Decorators¶

This allows us to modify our original function, or even entirely replace it, without changing the function's code.
It sounds mind-boggling, but a simple case I would like to illustrate here is using decorators for consistent logging (formatted print statements).
For us to understand decorators, we'll first need to understand:
- first class objects
- *args
- **kwargs

First Class Objects¶

def outer():
    def inner():
        print('Inside inner() function.')

    # This returns a function.
    return inner

# Here, `outer()` returns the `inner` function, which we assign to the name `call_outer`.
call_outer = outer()

# Then we call `call_outer()`, which runs `inner()`
call_outer()

Inside inner() function.

*args¶

This is used to indicate that positional arguments should be stored in the variable args
* is for iterables and positional parameters

# Define dummy function
def dummy_func(*args):
    print(args)

# * allows us to extract positional variables from an iterable when we are calling a function
dummy_func(*range(10))

# If we do not use *, this would happen
dummy_func(range(10))

# See how we can have varying arguments?
dummy_func(*range(2))

(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
(range(0, 10),)
(0, 1)

**kwargs¶

** is for dictionaries & key/value pairs

# New dummy function
def dummy_func_new(**kwargs):
    print(kwargs)

# Call function with no arguments
dummy_func_new()

# Call function with 2 arguments
dummy_func_new(a=0, b=1)

# Again, there's no limit to the number of arguments.
dummy_func_new(a=0, b=1, c=2)

# Or we can just pass the whole dictionary object if we want
new_dict = {'a': 0, 'b': 1, 'c': 2, 'd': 3}
dummy_func_new(**new_dict)

{}
{'a': 0, 'b': 1}
{'a': 0, 'b': 1, 'c': 2}
{'a': 0, 'b': 1, 'c': 2, 'd': 3}
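Both forms can be combined in one signature, which is the pattern the decorator in the next section relies on. A minimal sketch, with the hypothetical function name `describe_call`:

```python
def describe_call(*args, **kwargs):
    # Positional arguments arrive as a tuple, keyword arguments as a dict
    return args, kwargs

positional, named = describe_call(1, 2, key='value')
print(positional)  # (1, 2)
print(named)       # {'key': 'value'}

# Unpacking works in the call as well: * for iterables, ** for dictionaries
unpacked_args, unpacked_kwargs = describe_call(*[0, 1], **{'a': 0, 'b': 1})
print(unpacked_args)    # (0, 1)
print(unpacked_kwargs)  # {'a': 0, 'b': 1}
```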

Decorators as Logger and Debugging¶

A simple way to remember the power of decorators is that the decorator (the nested function illustrated below) can
- (1) access the arguments passed to the decorated function and
- (2) access the decorated function itself
Therefore this allows us to modify the behaviour of the decorated function without changing its code

# Create a nested function that will be our decorator
def function_inspector(func):
    def inner(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f'Function args: {args}')
        print(f'Function kwargs: {kwargs}')
        print(f'Function return result: {result}')
        return result
    return inner

# Decorate our multiply function with our logger for easy logging
# of the arguments passed to the function and the result returned
@function_inspector
def multiply_func(num_one, num_two):
    return num_one * num_two

multiply_result = multiply_func(num_one=1, num_two=2)

Function args: ()
Function kwargs: {'num_one': 1, 'num_two': 2}
Function return result: 2
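One refinement that is not in the original example: without `functools.wraps`, the decorated function reports the wrapper's name (`inner`) and loses its docstring. The sketch below is a variant of the inspector above with `wraps` applied; treat it as one possible way to write it.

```python
import functools

def function_inspector(func):
    # wraps copies the name, docstring and other metadata from func to inner
    @functools.wraps(func)
    def inner(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f'Function args: {args}')
        print(f'Function kwargs: {kwargs}')
        print(f'Function return result: {result}')
        return result
    return inner

@function_inspector
def multiply_func(num_one, num_two):
    """Multiply two numbers."""
    return num_one * num_two

multiply_result = multiply_func(2, num_two=3)
print(multiply_func.__name__)  # multiply_func
print(multiply_func.__doc__)   # Multiply two numbers.
```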

Dates¶

Get Current Date¶

import datetime
now = datetime.datetime.now()
print(now)

2019-08-12 14:20:45.604849

Get Clean String Current Date¶

# YYYY-MM-DD (%Y is the four-digit year)
now.date().strftime('%Y-%m-%d')

'2019-08-12'
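Going the other way, `strptime` parses such a string back into a datetime, and `isoformat()` produces the same YYYY-MM-DD layout for dates without a format string. The sketch uses a fixed, made-up timestamp (`fixed_time`) so the result is reproducible.

```python
import datetime

# A fixed timestamp so the output does not depend on when this runs
fixed_time = datetime.datetime(2019, 8, 12, 14, 20, 45)

# Format to a string, then parse the string back
date_string = fixed_time.strftime('%Y-%m-%d')
parsed = datetime.datetime.strptime(date_string, '%Y-%m-%d')

print(date_string)                    # 2019-08-12
print(parsed.date())                  # 2019-08-12
print(fixed_time.date().isoformat())  # 2019-08-12
```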

Count Business Days¶

# Number of business days in January 2019 (start date inclusive, end date exclusive)
import numpy as np
days = np.busday_count('2019-01', '2019-02')
print(days)

Progress Bars¶

TQDM¶

Simple progress bar via pip install tqdm

from tqdm import tqdm
import time
for i in tqdm(range(100)):
    time.sleep(0.1)

100%|██████████| 100/100 [00:10<00:00,  9.91it/s]

Check Paths¶

Check Path Exists¶

Check if directory exists

import os
directory='new_dir'
print(os.path.exists(directory))

# Shell command (run through IPython's ! escape) to list all folders
!ls -d */

False
ls: cannot access '*/': No such file or directory

Check Path Exists Otherwise Create Folder¶

Check if directory exists, otherwise make folder

if not os.path.exists(directory):
    os.makedirs(directory)

# Shell command (run through IPython's ! escape) to list all folders
!ls -d */

# Remove directory
!rmdir new_dir

new_dir/
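The same check-then-create step can be sketched with `pathlib`, where `exist_ok=True` makes the existence check unnecessary. This is an alternative and not the original approach; a temporary directory is used here as an assumption so the example leaves nothing behind.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp) / 'new_dir'

    existed_before = directory.exists()

    # Creates the folder; does nothing if it is already there
    directory.mkdir(parents=True, exist_ok=True)
    existed_after = directory.is_dir()

    # Calling it again does not raise, thanks to exist_ok=True
    directory.mkdir(parents=True, exist_ok=True)

print(existed_before)  # False
print(existed_after)   # True
```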

Exception Handling¶

Try, Except, Finally: Error¶

This is very handy, and is often used to patch up (rescue) poorly written code.
You can catch exceptions in general or specific ones like ValueError, KeyboardInterrupt and MemoryError, to name a few.

value_one = 'a'
value_two = 2

# Try the following line of code
try:
    final_sum = value_one / value_two
    print('Code passed!')
# If the code above fails, code nested under except will be executed
except:
    print('Code failed!')
# This will run no matter whether the nested code in try or except is executed
finally:
    print('Ran code block regardless of error or not.')

Code failed!
Ran code block regardless of error or not.

Try, Except, Finally: No Error¶

There won't be errors because you can divide 4 by 2

value_one = 4
value_two = 2

# Try the following line of code
try:
    final_sum = value_one / value_two
    print('Code passed!')
# If the code above fails, code nested under except will be executed
except:
    print('Code failed!')
# This will run no matter whether the nested code in try or except is executed
finally:
    print('Ran code block regardless of error or not.')

Code passed!
Ran code block regardless of error or not.
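Catching a specific exception is usually safer than a bare except, because unrelated failures (or a KeyboardInterrupt) are not silently swallowed. A minimal sketch, using the hypothetical helper `safe_divide`:

```python
def safe_divide(value_one, value_two):
    """Return value_one / value_two, or None if the division is invalid."""
    try:
        return value_one / value_two
    except TypeError as error:
        # Raised for operands such as 'a' / 2
        print(f'Type problem: {error}')
        return None
    except ZeroDivisionError as error:
        # Raised when value_two is 0
        print(f'Division problem: {error}')
        return None

string_result = safe_divide('a', 2)
number_result = safe_divide(4, 2)
zero_result = safe_divide(1, 0)
print(string_result, number_result, zero_result)  # None 2.0 None
```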

Assertion¶

This comes in handy when you want to enforce strict requirements on a certain value, shape, value type, or others

for i in range(10):
    assert i <= 5, 'Value is more than 5, rejected'
    print(f'Passed assertion for value {i}')

Passed assertion for value 0
Passed assertion for value 1
Passed assertion for value 2
Passed assertion for value 3
Passed assertion for value 4
Passed assertion for value 5



---------------------------------------------------------------------------

AssertionError                            Traceback (most recent call last)

<ipython-input-2-d9d077e139a9> in <module>
      1 for i in range(10):
----> 2     assert i <= 5, 'Value is more than 5, rejected'
      3     print(f'Passed assertion for value {i}')


AssertionError: Value is more than 5, rejected
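An AssertionError can be caught like any other exception, as sketched below with the hypothetical helper `check_value`. Keep in mind that assert statements are skipped entirely when Python runs with the -O flag, so they suit internal sanity checks better than validation of user input.

```python
def check_value(i, limit=5):
    # Raises AssertionError with the given message when the condition fails
    assert i <= limit, f'Value {i} is more than {limit}, rejected'
    return i

passed = []
message = None
try:
    for i in range(10):
        passed.append(check_value(i))
except AssertionError as error:
    # The loop stops at the first failing value
    message = str(error)

print(passed)   # [0, 1, 2, 3, 4, 5]
print(message)  # Value 6 is more than 5, rejected
```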

Asynchronous¶

Concurrency, Parallelism, Asynchronous¶

Concurrency (single CPU core): multiple threads on a single core running in sequence, only 1 thread is making progress at any point
- Think of 1 human, packing a box then wrapping the box
Parallelism (multiple CPU cores): multiple threads on multiple cores running in parallel, so multiple threads can be making progress at the same time
- Think of 2 humans, one packing a box, another wrapping the box
Asynchronous: concurrency but with a more dynamic system that moves amongst threads more efficiently rather than waiting for a task to finish then moving to the next task
- Python's asyncio allows us to code asynchronously
- Benefits:
  - Scales better if you need to wait on a lot of processes
    - Less memory (easier in this sense) to wait on thousands of co-routines than running on thousands of threads
  - Good for IO-bound uses like reading from/saving to databases while running other computation in the meantime
  - Easier management than multi-thread processing like in parallel programming
    - In the sense that everything operates sequentially in the same memory space

Asynchronous Key Components¶

The three main parts are (1) coroutines and subroutines, (2) event loops, and (3) futures.
- Coroutines and subroutines
  - Subroutine: the usual function
  - Coroutine: this allows us to maintain state, with memory of where things stopped, so we can switch amongst coroutines
    - async declares a function as a coroutine
    - await to call a coroutine
- Event loops
- Future
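The three components can be sketched together in a few lines. This assumes a plain .py script, where `asyncio.run` starts the event loop (inside a notebook a loop is already running, so `asyncio.run` would raise an error there). The coroutine `fetch_value` is a hypothetical stand-in for real IO; `asyncio.Task` is a subclass of `asyncio.Future`.

```python
import asyncio

async def fetch_value(value, delay):
    # Coroutine: pauses here and hands control back to the event loop
    await asyncio.sleep(delay)
    return value * 2

async def main():
    # A Task is a Future that wraps a coroutine scheduled on the event loop
    task = asyncio.create_task(fetch_value(21, 0.01))
    done_at_start = task.done()

    result = await task
    return done_at_start, task.done(), result

# asyncio.run creates the event loop, runs main() and closes the loop
done_at_start, done_at_end, result = asyncio.run(main())
print(done_at_start, done_at_end, result)  # False True 42
```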

Synchronous 2 Function Calls¶

import time
import timeit
def add_numbers(num_1, num_2):
    print('Adding')
    time.sleep(1)
    return num_1 + num_2

def display_sum(num_1, num_2):
    total_sum = add_numbers(num_1, num_2)
    print(f'Total sum {total_sum}')

def main():
    display_sum(2, 2)
    display_sum(2, 2)

start = timeit.default_timer()

main()

end = timeit.default_timer()
total_time = end - start

print(f'Total time {total_time:.2f}s')

Adding
Total sum 4
Adding
Total sum 4
Total time 2.00s

Parallel 2 Function Calls¶

from multiprocessing import Pool
from functools import partial

start = timeit.default_timer()

# The context manager shuts the worker processes down once the map is done
with Pool() as pool:
    result = pool.map(partial(display_sum, num_2=2), [2, 2])

end = timeit.default_timer()
total_time = end - start

print(f'Total time {total_time:.2f}s')

Adding
Adding
Total sum 4
Total sum 4
Total time 1.08s

Asynchronous 2 Function Calls¶

For this use case, it takes half the time of the synchronous version and is slightly faster than the parallel version (the comparison with parallel holds for this case, but not in general).

import asyncio
import timeit
import time

async def add_numbers(num_1, num_2):
    print('Adding')
    await asyncio.sleep(1)
    return num_1 + num_2 

async def display_sum(num_1, num_2):
    total_sum = await add_numbers(num_1, num_2)
    print(f'Total sum {total_sum}')

async def main():
    # .gather runs the given coroutines concurrently and waits for all of them
    await asyncio.gather(display_sum(2, 2), 
                         display_sum(2, 2))

start = timeit.default_timer()

# For .ipynb, an event loop is already running, so we can await directly
await main()

# For .py
# asyncio.run(main())

end = timeit.default_timer()
total_time = end - start

print(f'Total time {total_time:.4f}s')

Adding
Adding
Total sum 4
Total sum 4
Total time 1.0021s
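When the coroutines return values instead of printing them, gather collects the results in the order the coroutines were passed in, regardless of which one finishes first. A small sketch for a .py script; the `delay` parameter is an assumption added here, and the delays are shortened to keep the example quick.

```python
import asyncio

async def add_numbers(num_1, num_2, delay):
    await asyncio.sleep(delay)
    return num_1 + num_2

async def main():
    # The first call sleeps longer, yet its result still comes first
    return await asyncio.gather(add_numbers(1, 1, 0.02),
                                add_numbers(2, 2, 0.01))

results = asyncio.run(main())
print(results)  # [2, 4]
```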
