# Introduction to Scientific Python #

ICTP Summer School on Quantum Dynamics: From Electrons to Qbits

Date: August 29, 2022

Lecturer: Chris Laumann

This short crash course draws heavily on a number of great resources from around the web. 
To really learn Python, it is best to spend a weekend and work through some of the many tutorials. 
Some good resources:

 - [The Python Tutorial](https://docs.python.org/3.9/tutorial/)
 - [Python Scientific Lecture Notes](http://scipy-lectures.github.io/index.html) 
 - [A Crash Course in Python for Scientists](http://nbviewer.ipython.org/gist/rpmuller/5920182)
 - [NumPy: the absolute basics for beginners](https://numpy.org/devdocs/user/absolute_beginners.html)
  
Google is your friend. There is a lot of documentation for Python and its many packages. 
I've included links to the main websites below, but there are many other sources of information.

# Why Python? #

### Simple, well-structured, general-purpose language
  - Readability great for quality control and collaboration
  - Code how you think: many books use Python as pseudocode
  
### High-level 
  - Rapid development
  - Do complicated things in few lines

### Interactive 
  - Rapid development and exploration
  - No need to compile, run, debug, revise, compile
  - Data collection, generation, analysis and publication plotting in one place

### Speed
  - Usually plenty fast -- will discuss more
  - Your development time is more important than CPU time
  - Not as fast as C, C++, Fortran but these can be easily woven in where necessary

### Vibrant community
  - Great online documentation / help available
  - Open source
  
### Rich scientific computing libraries
  - Don't reinvent the wheel!


# Scientific Python Key Components #

The core pieces of the scientific Python platform are:

**[Python](http://www.python.org)**, the language interpreter 
  - Many standard data types, libraries, etc
  - Python 3 is the current version; use this

**[Jupyter](http://www.jupyter.org)**: notebook based (in browser) interface
  - Interactive manipulation of plots
  - Easy to use basic parallelization
  - Lots of useful extra bells and whistles for Python
  
**[NumPy](http://www.numpy.org)**, powerful numerical array objects, and routines to manipulate them. 
  - Workhorse for scientific computing
  - Basic linear algebra (np.linalg)
  - Random numbers (np.random)
  
**[SciPy](http://www.scipy.org)**, high-level data processing routines. 
  - Signal processing (scipy.signal)
  - Optimization (scipy.optimize)
  - Special functions (scipy.special)
  - Sparse matrices and linear algebra (scipy.sparse, scipy.linalg)

**[Matplotlib](http://www.matplotlib.org)**, plotting and visualization
  - 2-D and basic 3-D interactive visualization
  - “Publication-ready” plots
  - LaTeX labels/annotations automagically

**[Pandas](https://pandas.pydata.org)**, data analysis/management
  - Data structures for relational data manipulation
  - Useful in observational/numerical analysis
  - Won't discuss here

**[Mayavi](http://code.enthought.com/projects/mayavi/)**, 3-D visualization
  - For more sophisticated 3-D needs (won't discuss)

# Jupyter Workflow #

### Two primary workflows:

1. Work in a Jupyter/IPython notebook. Write code in cells, analyze, plot, etc. Everything stored in **.ipynb** file.
2. Write code in **.py** files using a text editor and run those within the IPython notebook or from the shell.

We will stick to the first. 

While you are using a notebook, there is a **kernel** running which actually executes your commands and stores your variables, etc. If you quit/restart the kernel, all variables will be forgotten and you will need to re-execute the commands that set them up. This can be useful if you want to reset things. The input and output that is visible in the notebook is saved in the notebook file.

*Note:* .py files are called **scripts** if they consist primarily of a sequence of commands to be run and **modules** if they consist primarily of function definitions for import into other scripts/notebooks. 

### Notebook Usage

Two modes: editing and command mode.

Press escape to go to command mode.
Press return to go into editing mode on selected cell.

In command mode:
1. Press a or b to create a new cell above or below the current.
2. Press m or y to convert the current cell to markdown or code.
3. Press shift-enter to execute.
4. Press d d to delete the current cell. (Careful!)

In editing mode:
1. Press tab for autocomplete
2. Press shift-tab for help on current object
3. Shift-enter to execute current cell

Two types of cells:
1. Markdown for notes (like this)
2. Code for things to execute


### Exercise ###

Try editing this markdown block to make it more interesting.

### Exercise

Execute the next block and then create a new block, type x. and press tab and shift-tab.

In [None]:
x = 10

### Exercise

Run this stuff.

In [None]:
print('Hello, world!')

In [None]:
"Hello, world!"

In [None]:
2.5 * 3

In [None]:
3**3

In [None]:
3 + 3

In [None]:
"ab" + "cd"

In [None]:
"Hello" == 'Hello'

# Variables and Objects #

Everything in memory in Python is an object. Every object has a type such as int (for integer), str (for strings) or ndarray (for numpy arrays). Variables can reference objects of any type and that type can change.

The equals sign in programming does not mean 'is equal to' as in math. It means **'assign the object on the right to the variable on the left'**.


In [None]:
a = 3

In [None]:
a

In [None]:
type(a)

In [None]:
a+a

In [None]:
2+a

In [None]:
# Objects have properties and methods, accessible with a .
# (press tab to list 'em)
a.real

In [None]:
a = "Hello, world!"

In [None]:
a

In [None]:
type(a)

In [None]:
a+a

In [None]:
2+a

### Exercise

Create a string of 25 "x"'s and assign it to a new variable b.
(Hint: what might * do between strings and integers?)

### Overloading 

Operators and functions accept objects of any type, but what they do depends on the types they receive. + adds numbers and concatenates strings; when no meaning is defined for a combination of types, as in 2 + "a", Python raises a TypeError.
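
As a small illustration of overloading, the sketch below applies + to several built-in types and shows what happens when the types are incompatible. The variable names are made up for this example.

```python
# The same operator does different things depending on operand types.
num_sum = 2 + 3            # integer addition
mixed_sum = 1 + 2.5        # the int is promoted to float before adding
str_sum = "ab" + "cd"      # string concatenation
list_sum = [1, 2] + [3]    # list concatenation

print(num_sum, mixed_sum, str_sum, list_sum)

# Incompatible types raise a TypeError instead of guessing a meaning.
try:
    2 + "a"
except TypeError as err:
    print("TypeError:", err)
```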

### Variables as References ###

All variables are **references** to the objects they contain. Assignment does not make copies of objects.

In [None]:
a = [1,2]
a

In [None]:
b = a
b

In [None]:
b[0] = 0
b

In [None]:
a

In [None]:
b = [3,4]
b

In [None]:
a
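
If you do want an independent copy rather than a second reference, you have to ask for one explicitly. The following is a minimal sketch using the list method copy() and the is operator, which tests whether two variables refer to the same object.

```python
a = [1, 2]
b = a           # b refers to the same list object as a
c = a.copy()    # c refers to a new list with the same contents

b[0] = 0        # this change is visible through a as well
c[1] = 99       # this change does not affect a

print(a, b, c)
print(a is b)   # same object
print(a is c)   # different objects
```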

# Types of Objects #

## Basic Types ##

1. **Numeric**
  * Integer: -1, 0, 1, 2, ...
  * Float: 1.2, 1e8
  * Complex: 1j, 1. + 2.j
  * Boolean: True, False
2. Strings, "hi", Immutable
3. Tuples, (2,7, "hi"), Immutable
  - Ordered collection of other objects, represented by parentheses
  - can't change after creation
4. Lists, [0,1,2,"hi", 4], Mutable
  - Ordered collection of other objects, represented by square brackets
  - can add/remove/change elements after creation (*mutable*)
5. Dictionaries, {'hi': 3, 4: 7, 'key': 'value'}
6. Functions, def func()

## Common Scientific Types ##

7. NumPy arrays, array([1,2,3])
  - Like lists but all entries have same type
8. Sparse arrays, scipy.sparse
9. Pandas DataFrames, high level 'table' a bit like an Excel spreadsheet

# Basic Types: Numeric #

There are 4 numeric types: 
- int: positive or negative integer
- float: a 'floating point' number is a real number like 3.1415 with a finite precision
- complex: has real and imaginary part, each of which is a float
- bool: two 'Boolean' values, True or False


In [None]:
a = 4
type(a)

In [None]:
c = 4.
type(c)

In [None]:
a = 1.5 + 0.1j
type(a)

In [None]:
a.real

In [None]:
a.imag

In [None]:
flag = (3>4)
flag

In [None]:
type(flag)

In [None]:
type(True)

In [None]:
# Type conversion
float(1)
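
The finite precision of floats mentioned above has practical consequences: decimal fractions such as 0.1 are not stored exactly, so comparing floats with == can surprise you. A short sketch, using math.isclose from the standard library for a tolerance-based comparison:

```python
import math

total = 0.1 + 0.2
print(total)                      # not exactly 0.3
print(total == 0.3)               # exact comparison fails
print(math.isclose(total, 0.3))   # comparison within a small tolerance
```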

### Careful with integer division!

In Python 3, dividing two integers with / always returns a float. Use // for integer (floor) division.

In [None]:
3/2

In [None]:
3/2.

***Force integer division:***

In [None]:
3//2
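
A few related operations are worth knowing alongside //. The sketch below shows the remainder operator %, the built-in divmod, and the fact that // rounds toward negative infinity rather than toward zero.

```python
q = 7 // 2              # quotient: 3
r = 7 % 2               # remainder: 1
q2, r2 = divmod(7, 2)   # both at once, as a tuple

neg = -7 // 2           # floors toward negative infinity: -4, not -3
flt = 7.0 // 2          # with a float operand the result is a float: 3.0

print(q, r, q2, r2, neg, flt)
```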

# Basic Types: Strings #

Strings are sequences of characters. They are **immutable**, which means you can't change a character in the middle of a string after you've created it. 

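Literal strings can be written with single or double quotes, and multi-line strings with triple quotes. 'Raw' strings (prefixed with r) are useful for embedding LaTeX because they keep backslashes as literal characters instead of interpreting them as the start of an escape sequence such as a newline.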

In [None]:
'Hello' == "Hello"

In [None]:
a = """This is a multiline string.
Nifty, huh?"""

In [None]:
a

In [None]:
print("\nu")

In [None]:
# the r" makes this a raw string
print(r"\nu")
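
To see the difference concretely, compare the lengths of the two strings. The LaTeX label at the end is a made-up example of the kind of string you might pass to a plot axis.

```python
plain = "\nu"    # two characters: a newline, then u
raw = r"\nu"     # three characters: backslash, n, u

print(len(plain), len(raw))

# Hypothetical axis label: without the r prefix, \n would become a newline.
latex_label = r"$\nu = \omega / 2\pi$"
print(latex_label)
```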

In [None]:
a = 3.1415

In [None]:
# Simple formatting (type convert to string)
"Blah " + str(a)

In [None]:
# Old style string formatting (like sprintf in C)
"Blah %1.2f, %s" % (a, "hi")

In [None]:
# New style string formatting 
"Blah {:1.2f}, {}".format(a, "hi")

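Python 3.6 and later also support formatted string literals (f-strings), which place the expression directly inside the braces. This is a brief sketch reusing the value of a from above; the variable label is introduced only for the example.

```python
a = 3.1415
label = "hi"

# The format specification after the colon is the same as in str.format.
s = f"Blah {a:1.2f}, {label}"
print(s)
```
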
# Basic Types: Lists #

Python lists store **ordered** collections of arbitrary objects. They are efficient maps **from index to values**. Lists are represented by square brackets [ ]. 

Lists are **mutable**: their contents can be changed after they are created.

It takes time O(1) to:
1. Lookup an entry at given index.
2. Change an item at a given index.
3. Append or remove (pop) from the end of the list. 

It takes time O(N) to:
1. Find items by value if you don't know where they are.
2. Remove items from near the beginning of the list.

You can also grab arbitrary **slices** from a list efficiently.

Lists are 0-indexed. This means that the first item in the list is at position 0 and the
last item is at position N-1 where N is the length of the list.

In [None]:
days_of_the_week = ["Sunday","Monday","Tuesday",
                    "Wednesday","Thursday","Friday"]

In [None]:
days_of_the_week[0]

In [None]:
# The slice from 2 to 5 (inclusive bottom, exclusive top)
days_of_the_week[2:5]

In [None]:
days_of_the_week[-1]

In [None]:
# every other day
days_of_the_week[0:-1:2]

In [None]:
# every other day (shorter)
days_of_the_week[::2]

In [None]:
# Oops!
days_of_the_week.append("Saturday")

In [None]:
days_of_the_week[-1]

In [None]:
days_of_the_week[5] = "Casual Friday"

In [None]:
days_of_the_week

In [None]:
# Get the length of the list
len(days_of_the_week)

In [None]:
# Sort the list in place
days_of_the_week.sort()

In [None]:
days_of_the_week

**Remember tab completion** Everything in Python (even the number 10) is an object. Objects can have methods which can be accessed by the notation a.method(). Typing a. and pressing tab allows you to see what methods an object a supports. Try it now with days_of_the_week:


In [None]:
days_of_the_week.

**Each item is arbitrary**: You can have lists of lists or lists of different types of objects.

In [None]:
aList = ["zero", 1, "two", 3., 4.+0j]
aList

In [None]:
listOfLists = [[1,2], [3,4], [5,6,7], 'Hi']

In [None]:
listOfLists[2][1]
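
The cost estimates given at the start of this section map onto specific list operations. The sketch below, on a made-up list of primes, labels each call with its cost.

```python
primes = [2, 3, 5, 7, 11]

last = primes.pop()        # O(1): remove from the end
first = primes.pop(0)      # O(N): remove from the front, shifts the rest
primes.append(13)          # O(1): add to the end

has_five = 5 in primes     # O(N): search by value
where = primes.index(7)    # O(N): position of the first match

print(primes, last, first, has_five, where)
```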

# Basic Types: Dictionaries #

A dictionary is an efficient map **from keys to values**. They are represented by curly brackets {}. 

Dictionaries are **mutable** but all **keys must be immutable**. That is, keys can be strings, numbers, or tuples thereof but not lists or other dictionaries. Values can be anything.

Entries are accessed by key rather than by position (since Python 3.7, iteration follows insertion order). It takes time O(1) to:
1. Lookup a value from a key
2. Add a key, value pair
3. Remove a key, value pair

It takes time O(N) to find an entry with a particular value.

You can iterate through all the entries efficiently, in O(N) time.

In [None]:
tel = {'emmanuelle': 5752, 'sebastian': 5578}
tel['francis'] = 5915

In [None]:
tel

In [None]:
tel['sebastian']

In [None]:
tel.keys()

In [None]:
tel.values()

In [None]:
len(tel)

In [None]:
'francis' in tel

In [None]:
del tel['francis']

In [None]:
tel
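
The rule that keys must be immutable is easy to test. In this sketch, a hypothetical dictionary of couplings is keyed by tuples of site indices; get returns a default instead of raising an error when a key is missing.

```python
# Hypothetical data: coupling strengths keyed by pairs of site indices.
couplings = {(0, 1): 0.5, (1, 2): -0.25}
couplings[(2, 3)] = 1.0

# get() returns the default (here 0.0) if the key is absent.
missing = couplings.get((5, 6), 0.0)
print(couplings[(0, 1)], missing)

# A list is mutable, so it cannot be used as a key.
try:
    couplings[[0, 1]] = 2.0
except TypeError as err:
    print("TypeError:", err)
```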

# Basic Types: Tuples #

A tuple is an **ordered** collection of objects. They are represented by round parentheses ().

Tuples are almost like lists but they are **immutable**. This means that they cannot be changed once they are created. 

In [None]:
t = (1,2,3,'Hi')
t

In [None]:
# IMMUTABLE !
t[0] = 2

The empty tuple and length 1 tuples have special notation since parentheses can also represent grouping.

In [None]:
emptyTuple = ()
emptyTuple

In [None]:
lengthOne = ('hi',)
lengthOne

In [None]:
notLengthOne = ('hi')
notLengthOne

In [None]:
notLengthOne[0]
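
Tuples are often used to bundle a few values together and then unpack them into separate variables. A short sketch follows; the names are illustrative only.

```python
point = (1.0, 2.0)
x, y = point        # unpack the tuple into two variables
x, y = y, x         # swap by building and unpacking a tuple

# Some built-in functions return tuples, e.g. divmod.
quotient, remainder = divmod(17, 5)

print(x, y, quotient, remainder)
```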

# Control Flow #

The flow of a program is the order in which the computer executes the statements in the code. Typically, this is in order from top to bottom. However, there are many cases where we want to change the flow in some way. For example, we might want to divide two numbers but only if the divisor is not zero. Or we might want to iterate: repeat a block of code many times for each value in some list. The commands which allow these are called control flow commands.

**WARNING**: Python cares about **white space**! You must **INDENT CORRECTLY** because that's how Python knows when a block of code ends. 

Typically, people indent with 4 spaces per block. 2 spaces or tabs also work, but the indentation must be consistent within any block.

## If/elif/else

In [None]:
if 2>3:
    print("Yep")
    print("It is")

elif 3>4:
    print("Not this one either.")
    
else:
    print("Not")
    print("At all")

### Exercise

Write a cell which checks whether a is nonzero and, if so, prints 1/a. Then, create another cell to set a to different values and check that your if cell works.

## For Loops

For loops **iterate** through elements in a collection. This can be a list, tuple, dictionary, array or any other such collection. 

You should read a for loop like this:

For each element in the collection, set the index variable to refer to the element and execute the **block** of code after the colon.

In [None]:
days_of_the_week = ["Sun","Mon","Tues",
                    "Wednes","Thurs","Fri", 
                    "Satur"]

In [None]:
for day in days_of_the_week:
    print("Today is " + day)

### Exercise
Write a for loop which adds the suffix 'day' to each of the days of the week in the list above and prints it.

### Exercise

Write a loop which adds the suffix 'day' to each of the days of the week in the list above and prints it but only if the day begins with 'T'.

As you often want to iterate over blocks of consecutive integers, the **range** function provides a convenient way to specify such a block.

In [None]:
for i in range(5):
    j = i**3
    print("The cube of " + str(i) + " is " + str(j))

In [None]:
for i in range(3,7):
    j = i**3
    print("The cube of " + str(i) + " is " + str(j))    

In [None]:
range?
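
range also accepts an optional step as a third argument. Wrapping it in list shows the values it produces, as in this brief sketch:

```python
print(list(range(5)))          # start defaults to 0, stop is excluded
print(list(range(3, 7)))       # start and stop
print(list(range(0, 10, 3)))   # start, stop, step
print(list(range(5, 0, -1)))   # a negative step counts down
```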

Use **enumerate** to get index and value of iteration element.

In [None]:
words = ('your', 'face', 'is', 'beautiful')

for (i, word) in enumerate(words):
    print(i, word)

### Exercise

Write a for loop which adds the suffix 'day' to each of the days of the week in the days_of_the_week list and **replaces it in the list**. This is tricky because you need to know the position in the list that you want to replace. Use enumerate.

In [None]:
days_of_the_week

In [None]:
for key in tel:
    print(key + "'s telephone number is " + str(tel[key]))
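
Iterating over a dictionary yields its keys. To get each key together with its value, you can iterate over items() instead. The sketch below redefines tel so that it runs on its own; the list lines is introduced only to collect the output.

```python
tel = {'emmanuelle': 5752, 'sebastian': 5578}

lines = []
for name, number in tel.items():   # each item is a (key, value) tuple
    lines.append(name + "'s telephone number is " + str(number))

for line in lines:
    print(line)
```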

## While Loops

A while loop repeats a block of code as long as a condition holds true. It is more bug prone than a for loop (forgetting to update the condition gives an infinite loop), so it is usually best to use for when you can.

In [None]:
x = 5

while x > 0:
    print("Bark " + str(x))
    x -= 1
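
A while loop is the natural choice when you do not know in advance how many iterations you need. One way to guard against an infinite loop is to cap the number of steps. The sketch below applies this to Newton's iteration for the square root of 2; the tolerance and step limit are arbitrary choices for illustration.

```python
target = 2.0
guess = 1.0
steps = 0
max_steps = 50     # safety cap so the loop always terminates

# Stop when the guess is accurate enough or the cap is reached.
while abs(guess * guess - target) > 1e-12 and steps < max_steps:
    guess = 0.5 * (guess + target / guess)
    steps += 1

print(guess, steps)
```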

# Functions #

Any code that you call multiple times with different values should be wrapped up in a function. For example:

In [None]:
def square(x):
    """Return the square of x."""
    return x*x

In [None]:
square?

In [None]:
square(9)

### Exercise

Write a function squareIfEven(x) which returns x if x is odd and x squared if x is even. Test your function works by calling it on different values.

### Functions are Objects ###

Functions are objects just like any other in Python:

In [None]:
type(square)

Make another variable refer to the same function:

In [None]:
a = square

In [None]:
a(5)

A function can also be passed as an argument to another function:

In [None]:
def test():
    print("In Test!")
    return

def callIt(fun):
    print("In callIt!")
    fun()
    return

In [None]:
callIt(test)
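
Passing functions around becomes useful when the receiving function does something with the result. In this sketch, apply_to_all is a hypothetical helper written for the example, while sorted and its key argument are built in.

```python
def square(x):
    """Return the square of x."""
    return x * x

def apply_to_all(fun, values):
    """Call fun on each element of values and return the results as a list."""
    results = []
    for v in values:
        results.append(fun(v))
    return results

squares = apply_to_all(square, [1, 2, 3])
print(squares)

# Built-in functions accept functions too: sort words by their length.
by_length = sorted(["ccc", "a", "bb"], key=len)
print(by_length)
```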

## Recursion

A function is called recursive if it calls itself, typically on a smaller argument. A classic example of a recursive definition is given by the Fibonacci sequence:
\begin{align}
    F_n &= F_{n-1} + F_{n-2} \\
    F_0 &= F_1 = 1
\end{align}
The first few of the Fibonacci numbers are $1,1,2,3,5,8,13,\cdots$. For more information about the Fibonacci numbers, check out Sloane's [On-Line Encyclopedia of Integer Sequences](http://oeis.org), which is an amazing resource.
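
To show the general shape of a recursive function without giving away the Fibonacci exercise, here is a sketch for a different sequence, the factorial $n! = n \cdot (n-1)!$ with $0! = 1$. It assumes n is a non-negative integer.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""

    # base case: stops the recursion
    if n == 0:
        return 1

    # recursive case: call the function itself on a smaller argument
    return n * factorial(n - 1)

print(factorial(5))
```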


### Exercise

Complete the implementation of a recursive function to compute the $n$'th Fibonacci number and compute the first few.

In [None]:
def fibonacci(n):
    """Return the n'th fibonacci number."""
    
    # base case -- always implement this first
    if n == 0 or n == 1:
        return # PUT SOMETHING HERE

    # recursive case
    fibn = # PUT SOMETHING HERE
    
    return fibn

### Exercise

The naive recursive implementation of fibonacci scales very badly. That is, it uses a lot of operations (how many calls to fibonacci?) to evaluate the $n$'th number. 

A better algorithm should be able to compute the $n$'th number using $O(n)$ operations, just as we could with a piece of paper and a pencil by writing out the sequence in order:

    1 + 1 = 2
    1 + 2 = 3
    2 + 3 = 5
    ...

Can you write a function to compute the $n$'th Fibonacci number using only $O(n)$ operations?

In [None]:
def fibonacci_fast(n):
    """Return the n'th fibonacci number."""
    
    # DO SOMETHING
    
    return fibn