We will assume you know how to launch Python and how to use a text editor (like vim or emacs), or even an integrated development environment (IDE), to save code into a file. While some of the details vary based on which operating system (OS) you employ, most of what we have to say is OS-agnostic.
You can run Python code interactively: once you’ve launched a Python shell you get the Python prompt >>> (also known as a chevron). Since this tutorial is in the form of a Jupyter Notebook, the Python prompt will not explicitly appear below.
Here are a few examples of things you could type in. You should press Enter after each line you input at the Python interpreter (or Shift-Enter if you're working on the Notebook).
2+3
2*3
x=42
print(x)
This is similar to other interactive environments that you may have seen before, like Mathematica. It is sometimes called Python’s read-evaluate-print loop (REPL). Note that the first two lines show us we could (if we wanted to) use Python as a calculator. The next line provided us with our first example of using a variable: in this case we created a variable x and assigned the value 42 to it. (Numbers like 3 or 42 are known as literals). We then used the print() function to print out the variable’s value to the screen. This is our first use of a function: as in mathematics, we use parentheses to show that we are passing in an argument. We won’t go into the different versions of Python at this point, assuming for now that you are using Python 3, where the above example is legitimate. (We discuss Python 2 vs Python 3 in a section near the end of this tutorial).
You don’t need to use Python interactively. Like other programming languages, the most common way of writing and running programs is to store the code in a file. You can do this for all 4 lines in the example above. In our case, we get:
%%writefile example.py
2+3
2*3
x=42
print(x)
Note that we didn’t include the Python prompt, as that only shows up when running Python interactively. When you run this Python program, the output printed on the screen will be 42. To see this, on a Unix-like system you would type python example.py at a terminal. It’s worth observing that the first 2 lines in example.py are highly uncommon for scripts/programs stored in files: they carry out a calculation but don’t assign the result to a variable or print it, so the result is immediately lost. Things are different when using Python interactively, as the answers are printed out to the screen in that case, even if you don’t explicitly use the print() function.
There also exists a useful combination between interactive and script modes. Assume you’re at the Python prompt, as usual. If you make sure you’re in the same directory as the example.py file and type:
import example
(note that this is example without the .py at the end) then you get to access all the functionality contained in your Python program file, while still trying things out interactively. Our example is, of course, near trivial, since the functionality introduced by the example module is limited – we will come back to importing later on.
For the sake of completeness, we note that a powerful way of fusing interactive sessions and scripts consists of using Jupyter notebooks. These basically allow you to use Python inside a web browser. One of their main advantages is that they allow you to save an interactive session. Another advantage is the ability to combine code, graphics, and notes, all in one place. Even so, in what follows we will restrict ourselves to plain Python; it’s up to you whether you wish to use a text editor, an IDE, or a Jupyter notebook.
There are some rules governing variables like x in the example above:
Variable names are made up of letters or numbers or the character _ (an underscore). Thus, how_do_you_do is an allowed variable name, but not.bad isn’t.
Variable names cannot start with a number, so se7en is an allowed variable name, but 11ven isn’t.
Variable names are case sensitive, meaning that x and X are different variables.
You cannot use reserved words (also known as keywords) like for, if, etc as variable names. You will soon see more examples of such reserved words.
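If you are ever unsure whether a name obeys these rules, you can ask Python itself. The sketch below is an optional aside: it uses the string method isidentifier() and the standard keyword module, neither of which is discussed in this tutorial, as well as a for loop and a dictionary, which are introduced later. The names candidates and results are ours.

```python
import keyword

# Candidate names taken from the rules above, plus one reserved word
candidates = ["how_do_you_do", "not.bad", "se7en", "11ven", "for"]

results = {}
for name in candidates:
    # isidentifier() checks the character rules; iskeyword() checks reserved words
    usable = name.isidentifier() and not keyword.iskeyword(name)
    results[name] = usable
    print(name, "->", usable)
```

Note that "for" passes the character rules on its own, which is why the reserved-word check is needed as a separate step.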
We will not provide a comprehensive set of coding guidelines, but when naming variables it’s good to keep in mind that “shorter is better”. For example, therm is better than thermodynamic_property. On the other hand, also keep in mind that “meaningful is good”: for example, therm is typically a better name than t. That being said, if you are dealing with the time parameter, it’s probably wiser to name it t rather than time.
Variables come in different types. The x in example.py above was an integer variable. Different types serve distinct purposes and are associated with different costs (e.g., in terms of storage). You should use the appropriate type of variable in each case. Here are some examples of built-in types:
The types integer, float, and complex are known as numeric types, for obvious reasons.
Python is a dynamically typed language, meaning that runtime objects (like the variable x above) get their type from their value. As a result, you can use the same variable to hold different types in the same interactive session (or in the same program). For example:
x=1
type(x)
x=3.5
type(x)
x="Hi"
type(x)
Here we used the built-in type(), which also marks our second use of a function (the first one was print()). The term built-in refers to the fact that this is functionality that we get “out-of-the-box”, meaning without having to import it from somewhere else.
Another feature of this dynamic typing is the fact that (unlike other languages you may be familiar with, like C or Fortran – these are statically typed languages), you do not first declare a variable’s type and later give it a value. In Python, the assignment is when the type of the variable gets determined.
You can convert among types, for example:
x=1
type(x)
y = float(x)
type(y)
where we used the Python built-in function float(). As you can imagine, there also exist Python built-in functions called int(), complex(), bool(), str() (with meanings that are, mostly, straightforward). For example:
spi = "3.14"
type(spi)
fpi = float(spi)
type(fpi)
This spi is clearly not a number, whereas fpi is. It’s worth noting, however, that you wouldn’t know the difference if all you were doing was printing out their values:
spi = "3.14"
print(spi)
fpi = float(spi)
print(fpi)
Note that Python is smart enough to know how to translate a string like "3.14" to a float. We observe that operations like 2*spi and 2*fpi give very different results. (Try this!) Similarly, 2.*spi and 2.*fpi are even more different from each other (the former doesn’t even work).
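For reference, here is a minimal sketch of what these operations do. It uses try/except to catch the error, a construct that has not been introduced in this tutorial; the names twice_string, twice_float, and failed are ours.

```python
spi = "3.14"
fpi = float(spi)

twice_string = 2 * spi   # repetition: the string is joined to a copy of itself
twice_float = 2 * fpi    # ordinary floating-point arithmetic

print(twice_string)      # 3.143.14
print(twice_float)       # 6.28

failed = False
try:
    2.0 * spi            # a string cannot be repeated a non-integer number of times
except TypeError as err:
    failed = True
    print("TypeError:", err)
```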
Note that horizontal spaces don’t really matter, i.e.:
x = 1
is equivalent to:
x    =      1
though the latter is, obviously, aesthetically displeasing. For future reference, keep in mind that leading spaces are syntactically significant (i.e., we couldn’t have put a space before x). As you may have already heard, in Python indentation matters.
Note, finally, that blank lines between statements in a program file are OK. For example, this file:
%%writefile example2.py
2+3

2*3


x=42

print(x)
is fully equivalent to the file example.py given above. Extra blank lines often improve readability (although, as the example of example2.py shows, this isn’t always the case).
We’ve already discussed how to print out a variable, thereby seeing its value. This is easy to generalize to the case of more variables:
x=1
y=2
print(x, y)
We simply passed in the two variables, separating them with a comma. We can also intersperse numeric literals:
x=1
y=2
print(x, 5, y)
or even string literals:
x=1
y=2
print(x, "is not", y)
Note that in all these cases we are comma-separating the entities we’re passing in, whereas when they are printed to screen they are space-separated (Try this!). There exist much fancier ways of formatting output, most notably via the use of format(). We will provide a basic introduction in a later section.
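As a small preview, and only as a sketch, the separator that print() places between its arguments can be changed with the sep keyword argument, and the string method format() can build the same message explicitly. The variable name message is ours.

```python
x = 1
y = 2

print(x, "is not", y)             # 1 is not 2
print(x, "is not", y, sep="-")    # 1-is not-2

# format() replaces each pair of braces with the corresponding argument
message = "{} is not {}".format(x, y)
print(message)                    # 1 is not 2
```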
Getting input from the user is very easy. We simply use the input function called... input() and can freely manipulate the result after that:
x = input("Enter an integer: ")
print("Twice that is:", 2*int(x))
Note that, in Python 3, no matter what you input, x is saved as a string (i.e., even if the user typed 75). As you were asked to check earlier on, 2*x behaves in a possibly unexpected manner if x is a string. This is why we are printing out 2*int(x): we are first converting the string to an integer using the built-in function int().
Of course, this wouldn’t have been quite right if the user had typed in, say, a float. In that case, we would have had to change the code line printing out to:
print("Twice that is:", 2*float(x))
Clearly, this can get cumbersome. Another approach would be to use the built-in function eval() which evaluates its argument as a Python expression (and therefore does the conversion to different numeric types on its own). However, using eval() can be dangerous if you do not trust the source of the input.
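One safer alternative, which is not discussed in this tutorial and is offered here only as a sketch, is literal_eval() from the standard ast module: it accepts Python literals (such as 75 or 7.5) but refuses to evaluate arbitrary expressions like function calls. The helper parse_number() is a hypothetical name of ours, and it relies on function definitions, which are introduced later.

```python
from ast import literal_eval

def parse_number(text):
    """Convert user-typed text to an int or a float (hypothetical helper)."""
    value = literal_eval(text.strip())
    # literal_eval also accepts strings, lists, booleans, etc., so filter those out
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("expected an integer or a float")
    return value

print(2 * parse_number("75"))     # 150
print(2 * parse_number("7.5"))    # 15.0
```

In a real program, parse_number(input("Enter a number: ")) would take the place of the hard-coded strings.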
We note that, in production code, prompting the user for input is rather rare: the input parameters are either set in the code itself or read from an input file. As we’ll see near the end of this tutorial, there exist ways of doing file output and input in Python. For now, let us become a little more comfortable with using Python.
We’ve already seen trivial examples of addition and multiplication, in Python, above. These can get even more interesting: we can assign the result of an operation to another variable, for example:
x=1
y=2
w = x + y
print(w)
and can even tell print() directly to evaluate and print out the result, without the use of an intermediate variable:
print(x+y)
You have to exercise your judgment to decide when to use a new variable (typically to store the result of a complicated calculation) and when not to (when you can simply do the calculation on the spot, as above).
More generally, for two variables x and y we can have the following operations (among others, which we won’t use):
When mixing different types in a numerical operation, we get promotion, e.g.
x=1
y=2.0
z=x+y
type(z)
As a mnemonic, you can think that floats are “more general” than integers, so the result becomes a float. Here’s a related question for you: what do you get when you multiply/add/etc an integer and a complex number? (Try this!)
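If you would like to check your answer, here is a minimal sketch of promotion across the three numeric types. The variable names are ours.

```python
i = 2          # integer
f = 2.0        # float
c = 1 + 2j     # complex

int_plus_float = i + f
int_times_complex = i * c
float_plus_complex = f + c

print(type(int_plus_float))      # <class 'float'>
print(type(int_times_complex))   # <class 'complex'>
print(type(float_plus_complex))  # <class 'complex'>
```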
The operations mentioned above follow well-defined rules of precedence. For example, the following line, which helps us convert temperatures from Celsius to Fahrenheit:
Tc = 27
Tf = Tc*9/5 + 32
is interpreted by taking multiplication and division as “more important” than addition or subtraction, and is therefore equivalent to:
Tf = (Tc*9/5) + 32
Similarly, powers are more important than multiplications/divisions or additions/subtractions. You should generally use parentheses when you want to be clear or when you want to force a different result than Python would assume by default. For example, the following:
Tc = (Tf-32)*5/9
carries out the conversion in the opposite direction. Note that without the parentheses the multiplication and division would have been carried out before the subtraction (so this would have been an erroneous implementation of the conversion from degrees Fahrenheit to degrees Celsius).
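The following sketch makes the effect of the parentheses concrete by converting 27 degrees Celsius to Fahrenheit and back, with and without them. The names back, wrong, and power are ours.

```python
Tc = 27
Tf = Tc*9/5 + 32          # multiplication and division first, then addition

back = (Tf - 32)*5/9      # parentheses force the subtraction to happen first
wrong = Tf - 32*5/9       # without them, 32*5/9 is computed before subtracting

power = 2*3**2            # the power is evaluated first: 2*9, not 6**2

print(Tf)                 # roughly 80.6
print(back)               # roughly 27
print(wrong)              # roughly 62.8, not the temperature we started from
print(power)              # 18
```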
Note that you always need a single variable on the left-hand side of an assignment (disclaimer: keep reading). For example:
y=2
3*x = y
is illegal. Depending on your intentions, you could say either
x = y/3
or
y = 3*x
Always remember that Python (like other programming languages) knows only how to plug in known values on the right-hand-side of an assignment, thereby producing a new value. This value is then labelled by the single variable name which appears on the left-hand side. In general, you may find it useful to think of Python variables as labels/tags/names. This is quite different from the mnemonic that is helpful for statically typed languages: for those languages, it’s sometimes helpful to think of a variable as being a “box”, since a variable can still exist even if no value has been assigned to it (an empty box). In contradistinction to this, it might help to think of Python values as being produced first and labels being attached to them after that.
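A short sketch may help make the "label" picture concrete. A name that has never been attached to a value simply does not exist (there is no empty box), and two names can be attached to the same value. The name not_yet_assigned is hypothetical, and try/except and the is operator have not been introduced in this tutorial.

```python
caught = False
try:
    print(not_yet_assigned)    # this name was never attached to any value
except NameError as err:
    caught = True
    print("NameError:", err)

x = 42
y = x             # a second label attached to the same value
print(y is x)     # True: both names refer to the same object
```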
Let’s see some (elementary) examples of assignments. The following:
x = 7
x = x + 1
is perfectly reasonable, though possibly disturbing if you’ve never programmed before. The first line gives x the value 7. The next line plugs that value (i.e., 7) on the right-hand side, increments it by one, and then assigns the result to the variable x. (That is, x will have the value 8 after these two lines).
This idiom, of having a variable both on the left-hand side (LHS) and the right-hand side (RHS), incremented by something, is so common that Python also provides an augmented assignment:
x += 1
which is fully equivalent to x = x+1. Note that we don’t always have to increment by one. For example, x += 4 increments by 4, meaning it’s equivalent to x = x + 4. This type of augmented assignment also exists for subtraction, multiplication, and division: x -= 4, x *= 4, and x /= 4, respectively.
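Here is a brief sketch that chains the four augmented assignments, with the value of x after each step given in a comment. One detail worth noticing is that, in Python 3, x /= 4 produces a float even when x holds an integer.

```python
x = 7
x += 1    # x is now 8
x -= 4    # x is now 4
x *= 4    # x is now 16
x /= 4    # x is now 4.0 (division always gives a float)

print(x, type(x))   # 4.0 <class 'float'>
```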
Now that you know that having a variable appear on both the LHS and the RHS is allowed (its value being understood on the RHS), you will be able to grasp that:
x = x**2 + 4*x - 7
is not a quadratic equation to be solved, but rather a simple assignment.
We noted above, for pedagogical reasons, that you always need a single variable on the left-hand side of an assignment. This is to be interpreted in the sense that you can’t put operations like 2*x on the LHS. However, Python provides multiple assignment, which allows you to put comma-separated variable names on the LHS, e.g.:
x, y, z = 1, 3.4, "Hello"
As you just saw, the variables can be of different types. This is merely shorthand for the more verbose:
x = 1
y = 3.4
z = "Hello"
We can combine this feature with the afore-mentioned ability to use the same variable on both the LHS and the RHS to write:
x, y = 2*x + 1, 3*y - 5
which is merely shorthand for:
x = 2*x + 1
y = 3*y - 5
Note that having two assignments on the same line could be done in another way, if we simply expand our notion of “line”: we can use semicolons to separate two simple statements and thereby still put them both on the same line:
x = 2*x + 1; y = 3*y - 5
which can be convenient if you’re pressed for space, but may also hide a bug if you’re not careful.
Python’s multiple assignment provides us with a nifty way to swap two variables (i.e., assign each one’s value to the other variable):
x, y = y, x
In many other languages, accomplishing the same task would require the use of a temporary (throwaway) variable:
z = x
x = y
y = z
In multiple assignment, Python first evaluates the right-hand side, obviously using existing values, and then assigns to the variables listed on the left-hand side. If you understand this, then you will also understand what the output of the following is:
x = 3; y = 7
x, y = 2*y, 5*x
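To check your understanding, the sketch below contrasts the multiple assignment with two separate assignments carried out one after the other. The two are only interchangeable when the right-hand sides do not depend on a variable that was reassigned a moment earlier. The names u and v are ours.

```python
# Multiple assignment: both right-hand sides use the old values 3 and 7
x = 3; y = 7
x, y = 2*y, 5*x
print(x, y)     # 14 15

# Sequential assignments: the second line already sees the new value of u
u = 3; v = 7
u = 2*v         # u is now 14
v = 5*u         # uses the new u, so v is now 70
print(u, v)     # 14 70
```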
Comments are an important feature of programming languages: they are text that is ignored by the computer but can be very helpful to humans reading the code. That human may be yourself in a few months, at which point you may have forgotten the purpose or details of the code you’re inspecting.
It’s generally a good idea to put comments at the start of a code block, for example:
# Initializing variables
a = 17
b = 9
c = 32
where we pressed Enter after variables and got the ... in response, which was Python’s way of saying that it’s ready for a “real” statement (since the comment is ignored). Something similar holds for the case where this code is in a file:
%%writefile initialize.py
# Initializing variables
a = 17
b = 9
c = 32
Typically, comments are encountered in scripts (i.e., in program files) rather than when using Python interactively.
You can also put comments next to specific statements, if you’re documenting that specific behavior:
m = 1.0 # set initial mass; changed below
It’s generally bad practice to put comments that don’t add any value, e.g.:
i = 0
i += 1 # increment i by 1
This simply adds to our cognitive load without providing any further insights.
Note, finally, that in Python there exists a more general category of comments, known as docstrings (short for “documentation strings”). These are typically the first statement in a code entity (function, module, etc). They use triple quotation marks, as in:
%%writefile initialize2.py
"""Initialize variables"""
a = 17
b = 9
c = 32
and are convenient in that they can span multiple lines:
%%writefile initialize3.py
"""
Initialize variables
Nothing too exciting here
"""
a = 17
b = 9
c = 32
As a general rule, docstrings give a big-picture overview on how to use the code, whereas regular comments provide details on why the code does certain things, so are mainly helpful in order to maintain the code. When you update the code, you should always make sure you check to see if the comments still remain true. Mismatched code and comments can lead to wasting time, since it’s not clear if the code is wrong, the comments are wrong, or both are wrong.
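One practical difference between the two is that a docstring, unlike a regular comment, is kept by Python and can be read back at runtime via the __doc__ attribute. The following is a minimal sketch; the function double() is a hypothetical example of ours, and function definitions are only introduced later.

```python
def double(value):
    """Return twice the given value."""
    # A regular comment: ignored by Python and invisible at runtime
    return 2 * value

print(double.__doc__)   # Return twice the given value.
print(double(21))       # 42
```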
You should generally aim to write good code: you will then need few comments. Instead of documenting bad code you should replace it with good code. Of course, it takes some experience to know what constitutes “good code”.
Note that, despite our admonitions above, we generally don’t include explanatory comments in our code examples: this is because the text itself serves that role. In other words, since our code comments and textual explanations would repeat the same thing, we try to avoid duplication. That being said, in your own codes (which are not embedded in a book or in a tutorial which is discussing them) you should always include comments.
We have already seen that Python provides several built-in functions, like type(), input(), float(), and so on. There is only a small number of these. Another very useful built-in function is abs(), which gives the absolute value of an integer or float (and the modulus of a complex number). For example:
abs(-5)
Here’s a slightly more involved example:
abs(2 + 3j)
You might decide to check this explicitly. A reasonable guess would be that sqrt() would give the square root, so you try:
sqrt(2**2 + 3**2)
This raises a NameError (the name sqrt is not defined), so our guess was wrong. (Of course, in this case you could have simply said (2**2 + 3**2)**0.5). It turns out that Python contains added functionality in the form of modules, which you can think of as extra Python program files which expose their functions and (global) variables.
For the square root specifically, our example above works well if we first import the sqrt() function that is to be found in the math module:
from math import sqrt
sqrt(2**2 + 3**2)
which we are pleased to see agrees with the result of abs() above. In addition to other mathematical functions (like cos(), log(), exp(), etc), the math module also contains a few constants, most notably:
from math import pi
pi
We can import more than one entity at the same time, simply by comma separating:
from math import sin, exp
exp(2)
As you can easily check, exp(2) calculates $e^2$. More generally, you can, if you wish, import everything from the math module by using * (with which you may already be familiar from its use as a wildcard in Unix):
from math import *
though you should probably avoid doing that for long programs. This is because you’re polluting the namespace with many new variables/functions: you can easily imagine importing this way from several different modules, in which case it would be difficult to keep track of which variable/function comes from where.
A distinct way of importing module content is to simply say:
import math
In this case, you need to use a dot after math to refer to the entity you’re trying to use, e.g., math.log10().
Remember: if you say from math import pi then you later refer to it as pi, whereas if you say import math then you later refer to the same constant as math.pi. This is the first instance of our using a dot: it’s used to denote membership, i.e., the constant pi is part of the math module, which you’ve brought in in its entirety with import math.
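The two styles can be placed side by side in a short sketch, which also confirms that both names refer to the same constant. The variable names are ours.

```python
import math              # brings in the module; members are reached via math.<name>
from math import pi      # brings in one name directly

same_constant = (math.pi == pi)
root = math.sqrt(16)
log_value = math.log10(1000)

print(same_constant)     # True
print(root)              # 4.0
print(log_value)         # the base-10 logarithm of 1000, i.e., 3 up to rounding
```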
The Python standard library contains many modules, with names like random, sys, collections, multiprocessing, and so on. You should consult the official documentation for more information.
It is now time to go back to our import example from earlier in the tutorial: as you may recall, this was typed in in the same directory as the example.py file. You can now see the similarity with importing from math or any other standard Python module. If you are in the same directory as initialize.py you can say:
import initialize
print(2*initialize.a)
Given the simplicity of this example, the only things available for further use are the three variables that were defined inside our older program file. There exists yet another way (which we didn’t see above) of importing the functionality contained in a given module:
import initialize as init
print(2*init.a)
This provides a new name for the module, which can be helpful if you are already using the original name for another purpose (i.e., to avoid a name clash) or if the name of the module you’re importing is very long and you’d rather avoid typing it in over and over again. As should be clear by now, we could have, instead, said:
from initialize import a
print(2*a)
or even:
from initialize import *
print(2*a)
Obviously, this type of importing applies to any Python program file we’ve created (even ones containing our own functions – we will learn how to write those shortly).
It goes without saying (but we’re still going to say it) that you can import functionality from one file/module when you are programming inside another file/module (i.e., not only interactively, as in the examples above). Here’s an example:
%%writefile doubling.py
from initialize import a, b, c
print(a,b,c)
print(2*a,2*b,2*c)
The output of running this code is:
%run doubling.py
We also took the opportunity to show that one can import more than one variable on one line (but without having to use *). This simple example clearly shows that we can access the variables a, b, and c from within another module, as long as our file doubling.py is located in the same directory as our file initialize.py.
Finally, you might want to discover the following Easter Egg:
import this
The dir() built-in function can be helpful when interactively inspecting the contents of different modules. More generally, dir() returns a list of attributes of a specified object. For example, for the program we stored in the file initialize.py above, we have:
import initialize
dir(initialize)
This gives us a list of all the names available in the given module. We then know what to look for, e.g.:
initialize.a
As you will have undoubtedly noticed, in addition to the a, b, and c variables which are explicitly mentioned in the module initialize, running dir(initialize) also returns some variables which contain leading and trailing double underscores. These are variously called special variables, magic variables, or dunder variables (“dunder” being short for “double under”). Now that you know they exist, you can just as easily inspect them, by saying initialize.__file__.
This is enough to point you in the direction of a common idiom involving __name__, which checks to see if a specific module is being run as the main program or is merely being imported as a module. We explicitly employ this idiom in the book.
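A minimal sketch of this idiom follows. The variable __name__ is set to "__main__" when a file is run directly and to the module's own name when it is imported. The function main() is a conventional but hypothetical name, and function definitions are only introduced later.

```python
a = 17

def main():
    """Print and return twice the value of a."""
    result = 2 * a
    print(result)
    return result

# True only when this file is run directly (e.g., python thisfile.py),
# not when it is brought in from elsewhere with import
if __name__ == "__main__":
    main()
```

With this structure, importing the file gives access to a and main() without printing anything as a side effect.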
As you can imagine, you can use dir() to explore the functionality in more complicated modules, including standard ones, e.g.:
import math
dir(math)
Note that some of these are constants and some are functions.
There are times when we want the code to make a decision according to the value of a specific variable (which we don’t really know until we’ve reached that point in the program). This is accomplished via conditional execution, most famously using the if statement:
x = input("Enter an integer: ")
x = int(x)
if x>0:
    print("x is positive")
else:
    print("x is negative or 0")
(Observe how we made sure to convert the string produced by input(), while choosing to save the result in a variable of the same name, essentially overwriting the string with an integer). Note that we check the condition x>0 and then take a different action according to the value of x. (Note also the, syntactically important, colons at the end of each decision point). This is the first time we are seeing a significant feature of Python: indentation is important! The line after if and the line after else are indented, reflecting the fact that they belong to the corresponding scenario. This also means that in Python it is trivial to have more than one statement carried out for each possibility (by taking advantage of the indentation):
x = input("Enter an integer: ")
x = int(x)
if x>0:
    print("x is positive")
    print("and as a matter of fact")
    print("its value is", x)
else:
    print("x is negative or 0")
    print("so it's not positive")
In other programming languages we have to use braces or something else to group statements together. In Python that is accomplished simply via the indentation.
As you may have already noticed, here and in what follows we carry out a minimum of input validation: this means that we don’t check inputs to see if they are malicious (or simply wrong). In other words, we don’t check for the possibility that the user entered a float, a string, and so on.
Python offers yet another possibility: we can check multiple conditions at the same time, using elif:
x = input("Enter an integer: ")
x = int(x)
if x>10:
    print("x is positive and large")
elif x>0:
    print("x is positive and small")
else:
    print("x is negative or 0")
Note that if we had (foolishly) said something like elif x>100 then that branch would have never been executed.
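To see why, recall that the branches are tested from top to bottom and only the first one whose condition holds is executed. In the sketch below we hard-code x instead of calling input(), and store a message in a variable of our own (label) instead of printing inside each branch.

```python
x = 150

if x > 10:
    label = "x is positive and large"
elif x > 100:
    # Never reached: any x larger than 100 already satisfied x > 10 above
    label = "x is positive and huge"
else:
    label = "x is 10 or less"

print(label)    # x is positive and large
```

Testing the most restrictive condition first (x > 100 before x > 10) would make every branch reachable.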
There are several other important checks we can carry out in a conditional expression. For example:
if x==1:
    print("message")
checks for equality. Note that we use two equal signs to check for equality: if we had said if x=1 then that wouldn’t have been an equality check but an assignment! Python (unlike many other languages) helpfully gives out a SyntaxError in this case.
We can also check for non-equality:
if x!=1:
    print("message")
or to see if we are less than or equal to:
if x<=1:
    print("message")
(Obviously, if x>=1 is the corresponding test to see if x is greater than or equal to 1). We can also combine two tests, if we’re interested in the intersection of two possibilities:
if x<10 and x>1:
    print("message")
Similarly, the union of two possibilities can be expressed using or:
if x>10 or x<1:
    print("message")
There’s even the possibility of negating a truth-value using not:
if not (x<10 and x>1):
    print("message")
(For future reference, we point out that the not keyword is most often used in conjunction with the in keyword to check for non-membership of an item in a sequence.) You should note, however, that combining several such conditions can get difficult to parse mentally.
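Two things can make such conditions easier to read, shown in the sketch below. First, Python allows comparisons to be chained, so that 1 < x < 10 means the same as x>1 and x<10. Second, a negated and can be rewritten as an or of the opposite tests. Chained comparisons are not covered elsewhere in this tutorial, and the variable names are ours.

```python
x = 5

inside = x < 10 and x > 1
chained = 1 < x < 10               # same test, written as in mathematics

outside = not (x < 10 and x > 1)
rewritten = x >= 10 or x <= 1      # the same condition without the negation

missing = 3 not in [10, 20, 30]    # non-membership check

print(inside, chained)      # True True
print(outside, rewritten)   # False False
print(missing)              # True
```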
Note, finally, that when checking a boolean variable flag it is a common idiom to write:
flag = False
if flag:
    print("message")
instead of
if flag==True:
    print("message")
The former is more succinct and is considered more elegant.
Conditional expressions were our first example of control flow: not every line of code gets executed in order. We will now see another example of control flow, involving the important concept of a loop, namely the repetition of a code block.
We start from a while loop, which checks to see if a condition is met (similarly to what the if statement did above). If the condition is true, then the following code is executed. However, unlike what happened in the case of the if statement, in the case of a while loop, at the end of executing the code block, control goes back to the line containing the condition, which is checked again, and the body is executed again, and so on, until the condition is no longer true, in which case the body of the block is jumped over and execution resumes from the following (non-indented) line. For example:
i=17
print(i)
while i<50:
    i += 5
    print(i)
This loop prints out the numbers 22, 27, 32 and so on, up until 52 (since 52 is larger than 50, that’s when the while check fails and control goes to the body of code outside/after the loop, so 57 is never printed).
We sometimes would like to be able to break out of a loop: that means we would like to make sure that if a condition in the middle of the loop body is met, then we will proceed to the first statement after the loop (i.e., even the loop condition won’t be tested one more time). In real-world applications, this scenario may appear when we are going through a directory, opening one file at a time, reading some numbers, manipulating them, and then printing out a result; if a specific file does not exist, it may be reasonable to simply stop carrying out these actions (i.e., we exit the loop completely).
Here’s a straightforward example:
i = 17
print(i)
while i<50:
    i += 5
    if i%2==0:
        print("Even number",i)
        break
    print(i)
print("Now outside while loop")
This happened because when we came upon the first even number the break was executed and therefore the remaining print(i) (or anything else having to do with the loop) was skipped over.
A variation of this scenario is when we want to skip not the entire loop, but the rest of the loop body for the present iteration. This is accomplished via continue. Turning to the same real-world application as above (reading one file at a time, manipulating some numbers, and then printing out a result): if a specific file does not exist, it may make more sense, instead, to skip the reading-in, manipulation, and output steps for the file that doesn’t exist, but still move on to the next file (assuming that one exists) and carry on the entire sequence of actions.
Here’s an example:
i = 17
print(i)
while i<50:
    i += 5
    if i%2==0:
        print("Even number",i)
        continue
    print(i)
print("Now outside while loop")
In this case, when we are dealing with an even number the continue makes us skip over the print(i) but then we continue running the loop. When we have an odd number the continue is not encountered, so the print(i) is run.
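To make the difference between break and continue concrete, the sketch below reruns both loops but collects the values that reach the final statement of the loop body in a list, instead of printing them. Lists and their append() method are introduced later in this tutorial, and the list names are ours.

```python
# Version with continue: even values skip the rest of the body, the loop goes on
reached_with_continue = []
i = 17
while i < 50:
    i += 5
    if i % 2 == 0:
        continue
    reached_with_continue.append(i)

# Version with break: the first even value ends the loop altogether
reached_with_break = []
i = 17
while i < 50:
    i += 5
    if i % 2 == 0:
        break
    reached_with_break.append(i)

print(reached_with_continue)   # [27, 37, 47]
print(reached_with_break)      # []
```

Note that the increment i += 5 comes before the continue. If it came after, an even value of i would never be updated and the loop would not terminate.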
Finally, we note that an idiomatic way of writing an infinite loop (i.e., one that never ends) is as follows:
i = 5
while True:
    i -= 1
    print(i)
    if i==2:
        break
The break is necessary here, since otherwise we would never exit the loop.
Python provides support for a number of container entities, called data structures. In the book, we will mainly be using lists, but here we also go over some features of other data structures, like tuples and dictionaries.
A list is a container of different elements (which can have different types), a container which can grow when you need it to. Here’s an example of creating and assigning a list:
r = [5, 1+2j, -2.0]
A list element can also be another list. Python lists are typically used to group together a number of other variables, e.g.:
x, y, z = 10, 20, 30
rp = [x, y, z]
Note that in both the examples above lists are created using comma-separated values within square brackets. We also use square brackets when we wish to access an individual element (this is called indexing):
print(rp[0])
This prints out the first element in the list rp (which has the value 10). Note that, like the C programming language, Python uses 0-indexing, meaning that the indices go as 0, 1, 2 in this case. In general, indices start at 0 and end at the total number of elements minus 1 (you can trivially check that that leads to the correct total number of elements).
We can use this list indexing to produce other quantities that depend on the list elements, for example:
from math import sqrt
r = [10, 20, 30]
length = sqrt(r[0]**2 + r[1]**2 + r[2]**2)
Note, finally, that we can also access lists starting at their end, using negative indices, for example:
r = [10, 20, 30]
print(r[-1])
Using an index of -1 is the idiomatic way of accessing the last element in a Python list. Similarly, r[-2] is the second-to-last element, and so on.
Lists are mutable sequences, meaning that we can change the value of individual elements, for example:
r = [10, 20, 30]
r[1] = 5
print(r)
An immutable sequence type would have raised an error here, instead of the clean result we got using lists.
Python supports a feature called slicing, which allows us to take a slice out of an existing list. Slicing, like indexing, uses square brackets: the difference is that slicing uses two integers, with a colon in between.
Specifically, if we have a list r, then the slice r[m:n] is a new list containing the elements from r[m] up to (but not including) the element r[n]. Here’s an example:
r = [10, 20, 30, 40, 50]
s = r[2:4]
print(s)
Note how r[4] (which has the value 50) is not included in the new list.
Slicing obeys convenient defaults, in that we can omit one of the integers in r[m:n] without adverse consequences. Omitting the first index is interpreted as using a first index of 0:
r = [10, 20, 30, 40, 50]
print(r[:3])
that is, it starts at the start of the list. We can combine this property of slicing with the aforementioned use of negative indices as follows:
r = [10, 20, 30, 40, 50]
print(r[:-1])
Clearly, this gives us all the elements of the list except for the last one. Similarly, omitting the second index is interpreted as using a second index equal to the number of elements:
r = [10, 20, 30, 40, 50]
print(r[2:])
that is, it ends at the end of the list.
Note that, when taking a slice, we can include a 3rd index: r[m:n:i]. This is to be interpreted as the stride. We start at r[m] and go up to (but not including) r[n] in steps of i. For example:
r = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(r[1:7:2])
When the third index is omitted (as above in r[m:n]), it is implied to be a stride of 1, namely every element is taken in turn, without skipping over any.
This is as good a time as any to observe that sometimes we need to type in lines that are overly long. We may want to split those into two lines (or more), using Python’s line continuation character, which is a backslash. Thus,
r = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180]
is equivalent to:
r = [10, 20, 30, 40, 50, 60, 70, 80, 90, \
100, 110, 120, 130, 140, 150, 160, 170, 180]
due to the presence of the backslash. Obviously, writing just the second line without preceding it with a backslash:
100, 110, 120, 130, 140, 150, 160, 170, 180]
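leads to a syntax error, as we’re closing a square bracket that never opened. Another way of getting line continuation (commonly preferred) is to wrap the relevant expression in parentheses: Python lets any expression inside unclosed parentheses, square brackets, or curly braces continue onto the next line (so, strictly speaking, the list’s own square brackets would already suffice here). Thus: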
r = ([10, 20, 30, 40, 50, 60, 70, 80, 90,
100, 110, 120, 130, 140, 150, 160, 170, 180])
is fully equivalent to using a backslash. Turning back to slicing, we note that it is one way that a Python list can grow:
r = [11, 7, 19, 22]
a = [1, 2, 3, 4, 5, 6, 7, 8]
r[1:3] = a
r
There were not enough slots from r[1] to r[2] to accommodate all the elements in a, so the list r grew. Note that the original elements r[1] and r[2] themselves (7 and 19) were overwritten. On the other hand, note that when we say r[1:3] = a the right-hand side is also a list, meaning that we are providing an iterable entity (i.e., something which can be stepped through) to replace selected elements in r. In other words, this wouldn’t work if we simply used one number on the right-hand side (Python raises a TypeError):
r = [11, 7, 19, 22]
r[1:3] = 55
As we will see later, numpy arrays behave differently.
There are several built-in functions (applicable to lists) that often come in handy. They are most easily understood in action:
r = [11, 7, 19, 22]
len(r)
sum(r)
max(r)
min(r)
In particular, sum() and len() will show up repeatedly in what follows.
Another handy built-in is the map() function, which applies a function the user provides to each element of a given list. For example, map(log,r) would create an iterable entity (we’ll see later how you could step through it). The first result of such a process would be log(r[0]), the second result log(r[1]), and so on. Note that you need to use the map() function to accomplish the task at hand: log(r) leads to an error, since log() takes in floats, not lists.
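As a minimal sketch of the map() usage just described (assuming that log is imported from the math module, and using the built-in list() merely as one convenient way of collecting the results):

```python
from math import log

r = [11, 7, 19, 22]

# map() returns an iterable entity; the logs are computed as we step through it
logs = map(log, r)

# list() steps through the iterable and stores each result in a new list
loglist = list(logs)
print(loglist)
```

The name loglist is ours, chosen only for illustration.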
We already saw above that slicing is one way to grow a list. It is important to note that simply indexing beyond the end of the list and assigning doesn't work:
r = [11, 7, 19, 22]
r[4] = 8
Of course, attentive readers will have noticed that we can grow via slicing here:
r[3:] = r[3], 8
works, but isn’t really worth the trouble.
Instead, the way to add a new element at the end of the list is via the append() function, which is a member function/method of any list object and is accessed as follows:
r = [11, 7, 19, 22]
r.append(8)
print(r)
Similarly to our examples of math.pi or initialize.a above, we are here using the dot . to access functionality that is available for a given object: in math.pi we are accessing the constant pi that is part of the module math, whereas in r.append(8) we are accessing the function append() that is a part of the list object r.
We won’t be using this functionality, but for the sake of completeness we note that one can insert at another location (i.e., not at the end of a list), by using the insert(pos,val) method.
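For readers who are curious, here is a short sketch of insert(pos,val) in action (the position and value are arbitrary choices made for illustration):

```python
r = [11, 7, 19, 22]

# place the value 8 at index 1; the elements that follow shift to the right
r.insert(1, 8)
print(r)
```

The list grows by one element, just as with append(), the only difference being where the new element ends up.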
A way of growing a list that we will be using consists of creating a list from scratch (i.e., starting with an empty list) and then appending one element at a time:
r = []
r.append(50)
r.append(3)
print(r)
Since we talked so much about growing a list, we close by noting that one can also remove an element at a time as follows:
x = r.pop()
print(r)
print(x)
We see that pop() shortens the list by one element and also returns the element that is being removed (which we then assigned to the variable x). Finally, just like we can append either at the end of a list (via append()) or anywhere else (via insert(pos,val)), we also are not limited to returning elements from the end of the list (via pop()) but can delete any element via del r[pos]. Actually, the del statement also works with slices, so you can remove more than one element at a time.
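A brief sketch of the del statement mentioned above, first with a single index and then with a slice (the list values are arbitrary):

```python
r = [10, 20, 30, 40, 50]

# remove the single element at index 1 (the value 20)
del r[1]
print(r)

# remove the slice r[1:3], i.e., the elements now at indices 1 and 2
del r[1:3]
print(r)
```

Unlike pop(), the del statement does not hand back the removed elements.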
We can concatenate two lists using +:
a = [1, 2, 3]
b = [4, 5, 6]
c = b + a
print(c)
Similarly, we can use * to create several repetitions of a list (and concatenate them) as follows:
b = [4, 5, 6]
d = 3*b
print(d)
This leads to a relatively common idiom, whereby a list can be populated with several (identical) elements using a one-liner:
g = 10*[1]
print(g)
While we routinely use the term “variable” to describe Python entities, one should keep in mind that this is a different concept than in many other languages (say, in C). In Python a new variable that is assigned to be equal to an old one is simply the old variable by another name. For example:
a = [1,2,3]
b = a
b[0] = 7
a
This is possibly undesired behavior. In other words, as already mentioned above, in Python we’re not really dealing with variables, but with labels attached to values (since a and b are just different names for the same entity). This can be seen using the following tests:
a == b
a is b
which show that a and b both have the same value and are identical to each other. Note that here we have modified the value that both a and b label, via b[0] = 7, but both variable names still keep labelling the same object. The central entity here is the value, which the two variable names are merely attached to (like “sticky notes”). When we type b[0] = 7 we are not creating a new value, simply modifying the underlying entity that both the a and b labels are attached to.
Incidentally, things are different for simpler variables, e.g. x=1; y=x; y=7; print(x) prints 1 since 7 is a new value, not a modification of the value x is attached to. While initially both variable names were labelling the same value, when we type y=7 we create a new value (since the number 7 is a new entity, not a modification of the number 1) and then attach the y label to it.
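The one-liner above can be spelled out as follows, using the is test we just encountered (the name both_same is introduced here only for illustration):

```python
x = 1
y = x

# at this point both labels are attached to the same value
both_same = x is y

# attach the label y to a new value; the label x is not affected
y = 7
print(x, y, both_same)
```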
Turning back to lists: one important fact we haven’t mentioned so far is that when you slice you get a new list, meaning that if you give a new name to a slice of a list and then modify that, then the original list is unaffected:
r = [11, 7, 19, 22]
sli = r[1:3]
sli
sli[0] = 55
sli
r
Likewise, if you modify the original list r, then the slice-by-a-new-name sli is unaffected. As we will see later, numpy arrays behave differently.
This fact (namely, that slices don’t provide views on the original list but can be manipulated separately) can be combined with another nice feature (namely, that when slicing one can actually omit both indices) to create a copy of the entire list:
a = [1, 2, 3]
c = a[:]
c[0] = 33
c
a
Even without changing an element, we can see that slicing the entire list produces a copy by using the following tests:
a = [1, 2, 3]
c = a[:]
a == c
a is c
which show that a and c have the same value but are not identical to each other.
For the sake of completeness, note that the way of copying via slicing creates what is known as a shallow copy. If you need a deep copy, you should use the function deepcopy() from the standard module copy. In the case we’re studying here, there’s no difference between a deep and a shallow copy, but this may matter when you’re dealing with lists of lists.
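To see where the distinction starts to matter, here is a minimal sketch involving a list of lists (the names nested, shallow, and deep are ours):

```python
from copy import deepcopy

nested = [[1, 2], [3, 4]]
shallow = nested[:]        # new outer list, but the inner lists are shared
deep = deepcopy(nested)    # new outer list and new inner lists

# modify an inner list through the original name
nested[0][0] = 99

print(shallow[0][0])   # the shallow copy sees the change
print(deep[0][0])      # the deep copy does not
```

In other words, slicing copies only the outer container: its elements are still labels attached to the same inner lists.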
Tuples are commonly described as immutable lists. (This is somewhat unfair to them, but will do for now.) They are defined similarly to how one defines a list, the difference being that one uses parentheses instead of square brackets:
a = (1, 2, 3)
print(a)
Actually, you don’t even need to use the parentheses, as they are implied when absent:
b = 1, 2, 3
print(b)
Incidentally, we now see that our earlier example on swapping two values:
x, y = y, x
implicitly made use of tuples.
Tuple elements are accessed the same way that list elements are, namely with square brackets:
print(b[0])
As already mentioned, tuples are immutable, so they can neither change nor grow. For example:
b[0] = 7
The fact that they cannot grow means they have no append() method.
We can, however, concatenate two tuples using +:
a = 1,2,3
b = 4,5,6
b+a
where we produced a third tuple.
Strings can also be viewed as sequences. For example, we can access individual characters using square-bracket indexing:
phrase = "Hello, world!"
phrase
print(phrase)
phrase[0]
phrase[7]
where we incidentally also showed that print() strips the quotation marks. Like tuples, strings are immutable:
phrase[7] = "b"
and they are also not growable, i.e. they have no append() method.
As with tuples, we can use + to concatenate two strings, creating a new variable:
a = "My name is"
b = "Alex"
a + b
This shows that we have to be careful about spaces. Here’s an example that works:
a = "My name is"
b = "Alex"
a + " " + b
where we explicitly introduced a space in the middle.
Note that in the previous example we were dealing with two strings, "My name is" and "Alex". We were then forced to introduce a third string (a blank space), in order to produce an appropriate sentence-like space-separated concatenation. A more idiomatic way to do this goes as follows: first, create a list containing your strings. Then, use the join() string method:
a = "My name is"
b = "Alex"
vals = [a, b]
" ".join(vals)
Note that when using the join() method you first put the string separator (in the present case simply a space), then a dot, and then you pass in as an argument to join() a list of strings. We could have used a different separator, for example:
strL = ["one", "two", "three"]
"-".join(strL)
Keep in mind that join() takes in a list of strings (and a separator) and returns a string.
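Since join() expects strings, trying to join numbers directly raises a TypeError. One possible workaround (a sketch that reuses the map() and str() built-ins; the values are arbitrary) is to convert each element to a string first:

```python
vals = [3.1, -2.5, 42]

# map(str, vals) converts each number to a string; join() then glues them together
line = ", ".join(map(str, vals))
print(line)
```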
You will often need to also carry out the reverse process: starting from a string, split it into a list of strings using a specific separator. Most commonly, this task is necessary when we’re dealing with a sentence and would like to split it into its constituent words. Our goal can be accomplished by using the split() method of any string. For example:
phrase = "there are many words in here"
phrase.split()
Remember, we started with a string and ended up with a list of strings (i.e., the reverse of what join() did above). Implicit here is the fact that the default behavior of split() is to split according to spaces (actually also tabs, newlines, etc). This is because no argument is passed in to split(), as you can tell from the two consecutive parentheses, ().
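The following sketch illustrates the default whitespace handling, and contrasts it with explicitly passing in a single space as the separator (the strings are made up for illustration):

```python
messy = "there  are\tmany\nwords"

# no argument: any run of spaces, tabs, or newlines counts as one separator
words = messy.split()
print(words)

# explicit single space: two consecutive spaces give rise to an empty string
pieces = "a  b".split(" ")
print(pieces)
```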
If we are dealing with strings that need to be separated out using a different separator, then we simply pass in that separator as an argument to split():
numd = "one-two-three"
numd.split("-")
numc = "four,five,six"
numc.split(",")
The comma-separated case is pretty common in real-world situations.
We promised earlier that we would introduce “fancier” ways to format strings. We go over this now, while noting that for most of our purposes we don’t need much more than what we’ve seen up to this point. In addition to the simple string concatenation we saw above, Python has “old-style string formatting” (which looks like, e.g., %s as you may have seen in C-based languages) and “new-style string formatting”, which uses the format() string method that we focus on below.
Here’s an example of how format() works:
"{0} is my {1}".format("This", "sentence")
This uses two positional arguments, numbered starting from 0, within curly braces. Pay attention to the overall format: it is string-dot-format-arguments (similarly to the cases of string-dot-join-argument and string-dot-split-argument that we encountered above). We can straightforwardly use this approach to introduce the space in our example above:
a = "My name is"
b = "Alex"
"{0} {1}".format(a, b)
Note that the overall approach of format() is different to what we were doing before: we here provide a string with some “empty slots” that are filled in later when we provide arguments to format(). On the other hand, what we were doing before was: we were building up individual strings which were concatenated using + or join and possibly a separator as well.
We can play this game with numbers, too:
x, y = 3.1, -2.5
"{0} {1}".format(x, y)
Notice how easy it is in this approach to change the order of what we’re printing out:
x, y = 3.1, -2.5
"{1} {0}".format(x, y)
For the sake of completeness, we note that producing the string(s) using our earlier +-based approach would look something like this:
x, y = 3.1, -2.5
str(x) + " " + str(y)
str(y) + " " + str(x)
where we explicitly converted to string using the built-in function str() since we were trying to combine the two floats into a big string.
If that was all we were trying to accomplish, there would be no real benefit to using format() (not to mention that the curly braces and the explicit positional indices are confusing to many beginners). However, format() is a much more powerful way to format strings, providing us with extensive freedom in adding spaces, including more digits, and so on. We go over only two, very basic, examples.
First, we see how to introduce padding:
"{0:10} is my {1}".format("This", "sentence")
All we did was to put a colon and a number after the first (0-th) positional index. (We could have, obviously, done something similar for the other argument as well). Second, we see how to format floats in more detail:
"{0:1.15f} {1}".format(x, y)
Here we also introduced a colon, this time followed by 1.15f. This is to be parsed as follows: 1 gives the minimum total width of the output in characters (a number that needs more room is never truncated, it simply takes up more space), 15 gives the number of digits after the decimal point, and f is a type specifier (that leads to fixed-point notation for floats). We could have used the alternative type specifier e to get the result in scientific notation:
"{0:1.15e} {1}".format(x, y)
There are many other such options, but we won’t really be needing them.
Starting with Python 3.6, a new type of string literals is also possible, namely f-strings or formatted string literals. They are quite convenient, so you may want to look into them.
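As a small taste (a sketch, assuming Python 3.6 or later), here is how the earlier examples might look with f-strings. The variable names go directly inside the curly braces, and what follows the colon works as it does in format():

```python
x, y = 3.1, -2.5

# the simplest case: the values are dropped into the string as-is
plain = f"{x} {y}"
print(plain)

# 8 is a minimum width and 3 the number of decimals; e gives scientific notation
padded = f"{x:8.3f}|{y:.2e}"
print(padded)
```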
Python also provides support for dictionaries (often shortened to dicts), which are called associative arrays in some other languages (they’re called maps in C++). You can think of dictionaries as being similar to lists or tuples, but instead of being limited to integer indices, with a dictionary you can use strings or floats as keys. In other words, dictionaries contain key and value pairs. The syntax for creating them involves curly braces (compare with square brackets for lists and parentheses for tuples), with the key-value pair being separated by a colon. For example, here’s a dictionary associating heights to weights:
htow = {1.41: 31.3, 1.45: 36.7, 1.48: 42.4}
In this case both the keys and the values are floats. We access a dictionary value (for a specific key) by using the name of the dict, square brackets, and the key we’re interested in: this returns the value associated with that key. (In other words, indexing uses square brackets for lists, tuples, and dicts.) If the specific key is not present, then we get an error. For example:
htow[1.45]
htow[1.43]
Note, however, that accessing a key that is not present and then assigning actually works: this is a standard way key:value pairs are introduced into a dictionary. For example:
htow[1.43] = 32.9
print(htow)
Note that, in older versions of Python, printing out the dictionary did not necessarily give the key:value pairs in the order in which we input them: dictionaries were unordered (since we’re using the key to access the value, we don’t really care what order these key:value pairs are stored in). CPython 3.6 started preserving the insertion order as an implementation detail, and from Python 3.7 onward this is guaranteed by the language, so on a recent version the newly added pair is printed last. If your code may need to run on an older version, you should not rely on this ordering.
Note, also, that this behavior of accessing a key and assigning is very different from how lists grow. As you may recall, for lists r[4] = 8 was an error (and one needed to use the append() list method to add an element to the list).
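If you would like to find out whether a key is present without triggering an error, one option (not used in the examples above, so take this as a side note) is the in operator:

```python
htow = {1.41: 31.3, 1.45: 36.7, 1.48: 42.4}

print(1.45 in htow)   # this key is present
print(1.43 in htow)   # this key is absent

# add the key:value pair only if the key is not already there
if 1.43 not in htow:
    htow[1.43] = 32.9
print(len(htow))
```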
The ability to add a key:value pair by indexing and assigning leads to the common idiom whereby one starts from an empty dictionary and then proceeds to populate it. For example:
nametoage = {}
nametoage["Jack"] = 42
nametoage["Sam"] = 33
nametoage["Mary"] = 38
print(nametoage)
nametoage["Jack"] += 1
nametoage["Jack"]
where we took the opportunity to also show that we can use strings as keys, this actually being a very common use case. In addition to this, we explicitly show that dictionaries are mutable: you can change a value after creation.
We encountered if statements and while loops above. We now turn to related topics, which also impact the flow of execution of a program.
Up to this point, we’ve spent quite a bit of time discussing how to create, grow, and concatenate lists. Very often, we need to carry out some sort of operation on the elements of a list: this is sometimes in order to create a new list, other times to change the values of the list elements “in place”, and yet other times just for the sake of a one-off calculation that's printed out on the spot. All of these tasks can be accomplished via the use of for loops. Here’s an example:
ns = [7, 4, 12]
for n in ns:
    print(n, 2*n)
In this example, ns is a list of integers, whereas n is a given list element (i.e., an integer) each time. We could have employed a different name instead of n on the line containing the for loop, e.g., elem. Note that for loops also use indentation, which we used above (similarly to what we saw for if statements and while loops). We could have, obviously, included more lines of code inside the for block.
You may be used to other programming languages, where for loops typically repeat an action a given number of times: our example above is different, in that our for loop is iterating through the list elements themselves directly. (In other words, Python’s for is similar to the foreach that some other languages have.) There are situations, however, when you do need to repeat a certain action a fixed number of times. In that case, we use range():
for i in range(5):
    print("Hello", i)
As you can see, this produces all the integers from 0 to 4, which we can then use for our purposes (range() isn’t really a built-in function, but for most intents and purposes you can treat it as if it was one). In Python 3, range() produces a range object, which we can even store in a new variable and then use:
thingy = range(4)
for i in thingy:
    print(i**2)
This isn’t something that you will encounter very often: it makes more sense to include range() directly on the line containing the for. By the way, you may have wondered if we could be less wasteful regarding the number of output lines. There’s a simple way of placing all the output on the same line:
for i in range(4):
    print(i**2, end=" ")
What we’ve done here is to say end=" " after passing in the argument we wish to print. This ensures that after we print each number we don’t include a newline but simply a space (and in the next iteration of the loop the next number with another space, and so on). You should keep in mind that there’s no newline added even after the last number (so if you need that, you have to add it yourself after the loop ends).
It’s worth noting that (in Python 3) range() produces numbers on demand, i.e., it doesn’t produce them all at once. If you do need them all at once, you can simply use the list() built-in function to create a list with all the numbers:
list(range(6))
Obviously, if you’re interested in iterating up to a very large integer, this can be wasteful, which is why range() gives you the numbers “as-you-go”.
The general form of how we invoke range() is similar to the list slicing in r[m:n:i] that we saw above: range(n) gives the integers from 0 to n-1, range(m, n) gives the integers from m to n-1, and range(m, n, i) gives the integers starting at m and going up to (but not including) n in steps of i. For example:
list(range(4, 17, 3))
Note that list slicing uses colons, whereas the arguments of range() are comma separated. Except for that, the pattern of start, end, stride is the same.
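One way to convince yourself of this correspondence (a quick sketch, with names chosen for illustration) is to slice a list whose elements are equal to their own indices and compare with the analogous range() call:

```python
r = list(range(20))     # the integers 0, 1, ..., 19

from_slice = r[4:17:3]
from_range = list(range(4, 17, 3))

print(from_slice)
print(from_range)
```

Both lines print the same five integers, starting at 4 and stopping before 17.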
The combination of for loops and range() provides us with a powerful way to populate a list. For example:
powers = []
for i in range(1, 11):
    powers.append(2**i)
print(powers)
Note that there exists an even simpler/more idiomatic way of accomplishing this task (using a list comprehension, as we’ll see below).
We cannot sufficiently stress the importance of the Python for loop. It is most certainly not limited to iterating through integers. Instead, it can be used to step through any sequence, such as a tuple:
a = (12., 2, "hello")
for elem in a:
    print(elem)
where you should note that (just as in the case of the list ns above) we did not need to say a[i] anywhere. Similarly, we can iterate through a string (taking one character at a time):
word = "Hello"
for c in word:
    print(c)
or even through a dictionary:
htow = {1.41: 31.3, 1.48: 42.4, 1.43: 32.9, 1.45: 36.7}
for h in htow:
    print(h, htow[h])
As we will see in a later section, the Python for loop is so powerful that it can even be used to directly iterate through the lines of a given file.
If all our programs did was to carry out operations in sequence, inside several loops, their logic would soon become unwieldy. Instead, we are able to group together logically related operations and create what are called user-defined functions. In other words, while the math module contains a function called exp, we could create our own function called, say, newexp, which e.g. uses a different algorithm to get to the answer (or formats the answer differently).
The way we introduce our own functions is via the def keyword, along with a function name and a colon at the end of the line, as well as (the by now expected) indentation. There are many different kinds of functions you could create, so we will attempt to systematize their features a bit in what follows. We start with interactive Python, but when our examples get slightly longer we switch to program files (as is customary in real applications).
We can define a function that doesn’t do anything other than print out a message:
def justprint():
    print("Hello, world!")
From your background in basic math (or from calling built-in functions or from other programming languages) you already know that the distinguishing feature of functions is the pair of parentheses (). In this case, our function (which is called justprint()) receives no parameters, so there’s nothing inside the parentheses. It is trivial to call this function:
justprint()
where we used no arguments in our call. We note here a distinction which is sometimes useful: a parameter appears in the definition of a function inside parentheses; an argument appears in a function call inside parentheses. Our justprint() uses no parameter in its definition and therefore received no argument when being called.
Note that this function does not return a value to the outside world, since all it does is print out a message. Even so, there’s nothing stopping you from assigning its output to a variable and then inspecting that variable:
x = justprint()
print(x)
Notice how after the function was called the message was printed onto the screen (as above). Then, when we printed out the value that the function returned we got None. This isn’t too surprising, given that our function did not return a value. This None is a constant that is part of core Python and is often used to represent the absence of a value (as in our case).
We now proceed to other functions, of increasing sophistication. The first thing to note is that (unlike our justprint() above), most of the time functions need to return a value, thereby communicating a calculation result to the external world. Here’s a function that prints out a message and returns a number to the external world:
def alsoreturn():
    print("Hello, world!")
    return True
This uses the keyword return. The value returned is a boolean. Note that this function always returns the same value, True, so it’s not very versatile. You call alsoreturn() as follows:
x = alsoreturn()
print(x)
We could have called this function without assigning its output to a variable, like so:
alsoreturn()
We observe that the return value was printed onto the screen: this is a peculiarity of using Python interactively. If we had had the same code in a file:
%%writefile alsoreturn.py
def alsoreturn():
    print("Hello, world!")
    return True
x = alsoreturn()
print(x)
alsoreturn()
then the second call (which isn’t assigned to anything) doesn’t print out True.
In any case, calling alsoreturn(), whether interactively or not, without assigning its return value to a variable means that that return value is now lost to the rest of the program: it cannot be further manipulated in what follows. (There’s always a disclaimer: in the case of an interactive session, the special variable _ holds the result of the last expression that was evaluated, so you could access the value True that way.)
For the sake of completeness, we note that our previous example of justprint() could have ended with a statement saying return None and would have been fully equivalent to its version without any return statement.
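Spelled out, the variant with an explicit return statement might look like this (the name justprint_explicit is ours, chosen to avoid clashing with the earlier definition):

```python
def justprint_explicit():
    print("Hello, world!")
    return None

x = justprint_explicit()

# is is the customary way of testing whether something is None
print(x is None)
```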
In most realistic cases we use functions that take in a value, do something with it, and then return another value. Here’s an example of a function that doubles its argument:
def double(x):
    return 2*x
We call this function as follows:
y = double(7)
print(y)
We used a different name for the output variable only so that we don’t confuse the uninitiated. In reality, the following code is perfectly acceptable (and common):
x = double(7)
print(x)
Note that the definition of our function above was perhaps too concise if you’re an absolute beginner. We could have just as well introduced a new (intermediate) variable and then returned that:
def doublenew(x):
    val = 2*x
    return val
This new function is called similarly to how we called the previous one:
x = doublenew(7)
print(x)
and obviously gives the same value. The only difference was in internal organization.
Now that we can accept input parameters and return output values, there’s nothing keeping our functions from becoming more complicated (and more useful). For example, here’s a function that carries out the sum from 1 up to some integer:
%%writefile sumofints.py
def sumofints(nmax):
    val = sum(range(1,nmax+1))
    return val
for nmax in (100, 42):
    x = sumofints(nmax)
    print(nmax, x)
Note that we are taking in the integer up to which we’re summing as a parameter. We then ensure that range() goes up to (but not including) nmax+1 (i.e., it includes nmax). We also took an extra step in that our driver code (i.e., the part that calls the function sumofints) employs a loop that iterates through a tuple of possible integer values. Here’s the output of running this code:
%run sumofints.py
If you are still not fully comfortable with built-in functions like sum() and range(), you might appreciate seeing a hand-rolled version of the above functionality:
%%writefile sumofints2.py
def sumofints(nmax):
    val = 0
    for i in range(1,nmax+1):
        val += i
    return val
for nmax in (100, 42):
    x = sumofints(nmax)
    print(nmax, x)
Notice that here we have a loop inside our function (doing the summing) and also a loop outside, which calls our function repeatedly.
We can keep playing this game: functions can also have two (or more) parameters, combining them in a specified way to produce a return value. For example, here’s a function that evaluates the magnitude of a (2-component) vector:
%%writefile mag.py
from math import sqrt
def mag(x, y):
    r = sqrt(x**2 + y**2)
    return r
r = mag(1.2, 2.3)
print(r)
Since we can input two (or more) values as parameters, it stands to reason that we’d also be able to return two (or more) values. This is typically done by returning a tuple (or a list) that bundles together the two values.
%%writefile cartesian.py
from math import cos, sin
def cartesian(r, theta):
    x = r*cos(theta)
    y = r*sin(theta)
    return (x, y)
a = cartesian(1.2, 0.1)
print(a)
x, y = cartesian(1.2, 0.1)
print(x, y)
With output:
%run cartesian.py
In the first call, we assign the returned tuple into the variable a. In the second call, we make use of multiple assignment as seen earlier to assign each element of the return tuple to a regular float variable. It’s easy to see that this approach can be generalized to return as many numbers as we want (bundled together).
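As an illustration of that generalization, here is a sketch of a function returning three values. The function name cartesian3d is hypothetical, and we assume the convention in which theta is the polar angle (measured from the z axis) and phi is the azimuthal angle:

```python
from math import cos, sin

def cartesian3d(r, theta, phi):
    # assumed convention: theta is the polar angle, phi the azimuthal angle
    x = r*sin(theta)*cos(phi)
    y = r*sin(theta)*sin(phi)
    z = r*cos(theta)
    return (x, y, z)

# multiple assignment unpacks the returned tuple into three floats
x, y, z = cartesian3d(1.2, 0.1, 0.3)
print(x, y, z)
```

As a sanity check, x**2 + y**2 + z**2 should be equal to r**2, up to roundoff error.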
We say that a variable that’s either a parameter of a function or is defined inside the function is local to that function. This means that when you change your variables’ values inside the function, you do not impact the value of the variables outside. For example:
%%writefile local.py
def f(a):
    a += 1
    b = 42
    print('inside', a, b)
a = 1
b = 2
print('outside', a, b)
f(7)
print('outside', a, b)
prints:
%run local.py
If the original b = 2 assignment were missing, then the outside attempts to print b would fail (but the inside one is legal). If you’re still a beginner in programming, you have to keep in mind that even though the function f() was defined first, that does not mean that its body was executed first. The def block merely creates the function: the body runs only when f() is called. The remaining lines of code are executed in sequence, starting from the first line after the function definition, namely a = 1.
The example above applies only to the case of immutable objects (like numbers) being passed in as arguments. If you pass in a mutable object, say a list, then you will be able to impact the external world. For example:
%%writefile passlist.py
def f(xs, ys):
    xs[1] += 1
    ys = ["hello", "bye"]
    print("inside", xs, ys)
us = [7, 14]
vs = [-1, -3]
print("outside", us, vs)
f(us, vs)
print("outside", us, vs)
prints:
%run passlist.py
Note that the first list was changed after being manipulated (i.e., the same value was being referred to both inside and outside the function), but the second list was not changed, due to the fact that it was re-assigned inside the function (so a new value was created inside the function, therefore the one that was being referred to outside was not modified). Hopefully our discussion of Python variable names as "labels" helped you to see this behavior coming.
If you’re familiar with the terminology other languages use (pass-by-value or pass-by-reference), then note that in Python we pass by assignment, which for immutable objects behaves like pass-by-value (you can’t change what’s outside) and for mutable objects behaves like pass-by-reference (you can change what’s outside), if you’re not re-assigning.
It’s often a bad idea to change the external world from inside a function: it’s best simply to return a value that contains what you need to communicate to the external world. In some applications (especially in linear algebra) this can become wasteful/inefficient. Given the small scale of the problems we are solving, we will opt for conceptual clarity, always returning values without changing the external world. This is a style inspired by functional programming, which aims at avoiding side effects, i.e. changes that are not visible in the return value. (Unless you’re a purist, input/output is fine).
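As a minimal sketch of this style, the function below leaves its argument untouched and communicates the change through its return value. The name incremented() and its parameters are hypothetical, introduced only for this illustration.

```python
def incremented(xs, index, amount=1):
    """Return a new list equal to xs, except that xs[index] is increased by amount."""
    new_xs = list(xs)          # shallow copy, so the caller's list is left alone
    new_xs[index] += amount
    return new_xs

us = [7, 14]
ws = incremented(us, 1)
print(us)   # the original list is unchanged: [7, 14]
print(ws)   # the returned list holds the update: [7, 15]
```

The copy is what makes this approach potentially wasteful for very large lists, which is the trade-off mentioned above.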
Python also supports nested functions and closures: though we won’t use these, it’s good to know they exist. On a related note, Python contains the keywords global and nonlocal.
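For the curious, here is a small sketch of a nested function that forms a closure and uses nonlocal. The names make_counter() and increment() are made up for this illustration; the keyword global plays the analogous role for variables defined at the top level of a file.

```python
def make_counter():
    count = 0                  # local to make_counter()

    def increment():
        nonlocal count         # re-assign the variable of the enclosing function
        count += 1
        return count

    return increment           # the returned function "remembers" count

counter = make_counter()
first = counter()
second = counter()
print(first, second)           # 1 2

other = make_counter()         # a separate counter with its own count
print(other())                 # 1
```

Without the nonlocal line, count += 1 would be treated as an assignment to a new local variable of increment() and would fail.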
We encountered the term positional argument above, when discussing fancy string formatting with format(). This is a more general term that applies to all cases where an argument is passed in and interpreted based on its order in the argument list. For example, for the case of cartesian() which we encountered above:
from math import cos, sin, pi
def cartesian(r, theta):
    x = r*cos(theta)
    y = r*sin(theta)
    return (x, y)
a standard way to call this function using positional arguments is:
cartesian(1., pi/3)
which is similar to what we were doing above. Here Python knows that the first argument (1.) will correspond to the first parameter (r) and the second argument (pi/3) will correspond to the second parameter (theta).
There exists another way of calling a function, via what are known as keyword arguments:
cartesian(r=1., theta=pi/3)
where we explicitly mentioned the name of each variable (along with its value) when calling the function. Obviously, it would be an error to try to use a keyword argument with the wrong name:
cartesian(r=1., phi=pi/3)
since the definition of cartesian does not contain a parameter called phi. Keyword arguments allow us to change the order in which we place the arguments when calling:
cartesian(theta=pi/3, r=1.)
and still have them be interpreted correctly. We can even mix positional and keyword arguments:
cartesian(1., theta=pi/3)
though we should always ensure that positional arguments are specified before keyword arguments, because doing things the opposite way leads to an error:
cartesian(r=1., pi/3)
A related feature of Python is the ability to provide default parameter values when defining a function. Modifying our sumofints() function from above, we can write:
def defpar(nmax=100):
    val = sum(range(1,nmax+1))
    return val
This can be called either with or without an argument:
defpar()
defpar(100)
defpar(42)
In the first case, we are providing no argument, so the default parameter value is used, as is borne out by the result of the second case. Effectively, the argument here is optional. In the third case, we can override the default parameter value and pass in a desired argument. Similarly, we can combine the use of a default parameter value (upon definition) with the use of a keyword argument (when calling):
defpar(nmax=42)
just like we did above.
More generally, we can give default parameter values to all or only some of our parameters, for example:
def cosder(x, h=0.01):
    return (cos(x+h) - cos(x))/h
This is typically used (in the trenches) to give one of the parameters the value that is most often used (though if that value is always used, it should be a constant, not a parameter). For example:
cosder(0.)
cosder(0.,0.01)
cosder(0.,0.05)
As should be expected, we can have a mixture of positional arguments, keyword arguments, and default parameter values. For example:
cosder(h=0.05, x=0.)
cosder(0., h=0.05)
cosder(x=0.)
As a matter of good practice, you should make sure to always use immutable default parameter values. (A default value is created only once, when the function is defined, so a mutable default such as a list is shared across calls: any change made to it in one call persists into the next.) Finally, note that in Python one has the ability to define a function that deals with an indefinite number of positional or keyword arguments. The syntax for this is *args and **kwargs, but a detailed discussion would take us too far afield.
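The following sketch illustrates why a list makes a confusing default parameter value, and also gives a glimpse of *args and **kwargs. All three function names (append_bad(), append_good(), report()) are hypothetical and serve only as illustrations.

```python
def append_bad(x, xs=[]):
    # The default list is created once, when the function is defined,
    # and is then shared by every call that omits xs.
    xs.append(x)
    return xs

def append_good(x, xs=None):
    # None is immutable; a fresh list is built on each call that omits xs.
    if xs is None:
        xs = []
    xs.append(x)
    return xs

bad1 = list(append_bad(1))    # copy each result so we can compare them later
bad2 = list(append_bad(2))
good1 = append_good(1)
good2 = append_good(2)
print(bad1, bad2)     # [1] [1, 2]
print(good1, good2)   # [1] [2]

def report(*args, **kwargs):
    # args is a tuple holding the positional arguments,
    # kwargs is a dictionary holding the keyword arguments.
    return len(args), sorted(kwargs)

print(report(1, 2, 3, h=0.01, nmax=100))   # (3, ['h', 'nmax'])
```

The second call to append_bad() still sees the 1 that was appended during the first call, which is rarely what one intends.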
One feature that makes the language especially pleasant to work with is that in Python functions are first-class objects. This means that we can pass them in as arguments to other functions. A very simple example that shows this is a modification of the function cosder() we encountered above:
def der(f, x, h=0.01):
    return (f(x+h) - f(x))/h
Note how f is a regular parameter, but is used inside the function the same way we use functions (by passing arguments to them inside parentheses). This new function der() is called by passing in as the first argument the function of your choice, e.g.:
from math import sin, cos
der(cos, 0., 0.01)
or
der(sin, 0., 0.01)
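Observe that we passed in the name of the function, sin or cos, as the first argument and the x as the second argument. This means that we did not pass in cos() or cos(x): the former raises an error (since cos() needs an argument), while the latter would pass in a number rather than a function.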
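The same mechanism works for functions you define yourself. In the sketch below, cube() is a hypothetical example function and the step sizes are arbitrary choices; der() is repeated so that the snippet runs on its own.

```python
from math import sin, cos

def der(f, x, h=0.01):
    return (f(x+h) - f(x))/h

def cube(x):
    return x**3

# The function object itself is passed in, without parentheses
approx = der(cube, 1.0, 1e-6)     # the exact derivative 3*x**2 equals 3 at x = 1
print(abs(approx - 3.0) < 1e-4)   # True

# Forward-difference estimate of the derivative of sin at 0, where cos(0) = 1
print(abs(der(sin, 0.) - cos(0.)) < 1e-3)   # True
```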
Before concluding this subsection, let us make some general comments about the conventions we will employ in the book which this tutorial is accompanying. We mentioned earlier that a function will be interacting with the external world only via its parameters (input) and its return value (output). Thus, we won’t be, e.g., modifying a list that was passed in to our function as an argument. A related question is whether or not we should access variables from inside a function that were defined outside it (and were not passed in as arguments). This is sometimes tempting: especially if you know you are not going to change that variable’s value inside the function, what’s the harm in simply accessing it? The motivation for this is a wish to keep the number of parameters small: instead of explicitly naming as parameters all the variables you’re going to need inside the function, you simply access them when the need arises. Unfortunately, this approach can lead to headaches down the line: it’s much easier to reason about what a function does when you can clearly see what input it takes in, what it does to the input, and what it communicates to the external world (through the return statement).
This is the approach we’ll employ in what follows, with the partial exception that we allow the calling of other functions from within our function, without feeling the need to pass these functions in as arguments. This will depend on the specific situation we’re faced with each time. For example, our function der(), which takes in as an argument the function to be differentiated, is more flexible (and therefore better) than cosder(), which is hard-wired to handle cosines exclusively. On the other hand, when we defined our function mag() we didn’t think twice about having it call the function sqrt(), without passing the latter in as an argument (which would help you if, e.g., you wanted to pass in your very own cruder-but-faster implementation of the square root). As a rule of thumb, you should pass a function in as an argument if you foresee that you might be passing in another function in its place in the future (as in the case of der()). If you basically expect to always keep carrying out the same task (as in the case of mag()), there’s no need to add yet another parameter to your function definition.
In the following example we see how to combine passing a function in as an argument with our earlier examples on default parameter values:
%%writefile funcdef.py
from math import exp
def f(x):
    return exp(-x)
def g(x):
    return exp(-x**2)
def transf(x,fin=f):
    return fin(3*x)/x
x = 0.5
print("g")
print(g(3*x)/x)
print(transf(x,g))
print("f")
print(f(3*x)/x)
print(transf(x,f))
print(transf(x))
Most of this code should be straightforward to read: we define a couple of example functions, f() and g(), as well as a new function transf() that is designed to carry out a mathematical transformation. The need to carry out such a transformation appears commonly when integrating analytically (and changing variables). The only new feature of this code is that it uses f as the default value for the parameter fin. This means that you can call transf() with either two arguments (a number and a function) or only one argument (a number). This is borne out by the output of running this code:
%run funcdef.py
As this clearly shows, a manipulation such as g(3*x)/x in the main code is fully equivalent to calling our new function, transf(x,g). You can imagine that as our transformations get more involved it becomes wiser to define a new function like we did here (for a trivial case). What’s even more exciting is that the fin=f in the definition of transf() allows us to call transf() with only one argument (a number). This is incredibly useful if you want to use your transformed function as part of another piece of code, that expects your interface to be “I give you a number and you return a number”.
At this point, it’s worth bringing your attention to another idiom which you may encounter in the wild: there was no need to use a new name (i.e., fin) for the parameter of transf() that is a function. In other words, that could have been called simply... f, leading to a definition that looks like this:
def transf(x,f=f):
    return f(3*x)/x
This is perfectly legal code: the first f in f=f (and inside the body of transf()) is a function-specific variable name, whereas the second f in f=f is the default parameter value, which in this case refers to the function f() that is defined elsewhere in the same file.
We saw earlier that the map() function can apply a function to each element of a given list. At the time, we saw this in the context of a built-in function, specifically log(). Now that we’ve seen how to create functions of our own, it stands to reason that we should attempt to use map() to apply our own functions to the elements of a list. This works as follows:
def double(x):
    return 2*x
vals = [2.1, 3.4, 6.5]
mapvals = map(double,vals)
list(mapvals)
Note how we used the built-in list() to produce a list, starting from the output of map(). Obviously, this becomes even more interesting when you wish to apply more complicated transformations to the elements of the list.
Note that the code above is a replacement for the following hand-rolled code:
vals = [2.1, 3.4, 6.5]
newvals = []
for x in vals:
    newvals.append(2*x)
print(newvals)
which is more cumbersome.
There’s another task that crops up fairly often: that of “pruning” a given list according to a specific criterion. For example, we might wish to start from a given list and produce a list that contains only those elements that are greater than 2. We would do this using the filter() built-in function:
def checkgt2(x):
    if x>2:
        return True
    else:
        return False
vals = [-1, 3.14, -2.7, -22, 7.8, 9, 14.6]
filvals = filter(checkgt2,vals)
list(filvals)
As the result shows, filter() applies the checkgt2() function to all elements in vals and keeps only those that pass the check. Again, we observe that the code above is a replacement for the following hand-rolled code:
vals = [-1, 3.14, -2.7, -22, 7.8, 9, 14.6]
filvals = []
for x in vals:
    if x>2:
        filvals.append(x)
print(filvals)
which is slightly more cumbersome and certainly not as re-usable in other scenarios. The above two cases (of map() and filter()) both appear to necessitate throwaway functions (that may not be necessary in the rest of the code). This is one instance where function one-liners that use the lambda syntax turn out to be helpful, but we won’t really be pursuing that avenue, since we’ll be introducing an alternative below.
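For completeness, here is a brief sketch of what the lambda versions of the two examples above might look like. The variable names doubled and kept are arbitrary.

```python
vals = [-1, 3.14, -2.7, -22, 7.8, 9, 14.6]

# lambda creates a small anonymous function in a single expression,
# so there is no need for a separate def
doubled = list(map(lambda x: 2*x, vals))
kept = list(filter(lambda x: x > 2, vals))

print(doubled)
print(kept)   # [3.14, 7.8, 9, 14.6]
```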
List comprehensions (often shortened to listcomps) provide us with a Pythonic way of setting up lists very easily. Here’s a simple example:
xs = [0.01*i for i in range(10)]
print(xs)
This replaces the following hand-rolled loop, populating a list one element at a time:
xs = []
for i in range(10):
    xs.append(0.01*i)
print(xs)
It’s easy to see that the code with the list comprehension is much more compact (1 line vs 3 lines). Note that when using a list comprehension the loop that steps through the elements of some other sequence (in this case, the values produced by range()) is placed inside the list we are creating! This syntax is a bit unusual, but well worth studying.
List comprehensions can function as replacements of map() and filter() functionality. As a result, they can replace the hand-rolled versions of the code, as well, without the need of introducing a new function (or even a lambda). They are very convenient and strongly recommended. For example, here’s how we would rewrite our map(double,vals) example from above (and the corresponding hand-rolled version):
vals = [2.1, 3.4, 6.5]
newvals = [2*x for x in vals]
print(newvals)
Notice that we didn’t need to create a function to do this. Again, the loop is placed inside the list brackets. Similarly, here’s how we would rewrite our filter(checkgt2,vals) example from above (and the corresponding hand-rolled version):
vals = [-1, 3.14, -2.7, -22, 7.8, 9, 14.6]
newvals = [x for x in vals if x>2]
print(newvals)
where, once again, we did not need to introduce a function. The new feature here is that, in addition to the loop inside the list we are creating, there’s also an if clause inside, which acts as a filter. It’s easy to see that list comprehensions are powerful, and could even be used to accomplish both tasks at the same time (keep only the elements that are greater than 2, and double them):
vals = [-1, 3.14, -2.7, -22, 7.8, 9, 14.6]
newvals = [2*x for x in vals if x>2]
print(newvals)
where, as above, there was no need for a new function.
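It may help to contrast the filtering if used above with a different construct that looks similar. In the sketch below (the names filtered and selective are arbitrary), an if placed after the for clause drops elements, whereas an if ... else placed before the for clause keeps every element and merely chooses what to store.

```python
vals = [-1, 3.14, -2.7, -22, 7.8, 9, 14.6]

# "if" after the for clause filters: elements failing the test are dropped
filtered = [2*x for x in vals if x > 2]

# "... if ... else ..." before the for clause is a conditional expression:
# every element is kept, and only those greater than 2 are doubled
selective = [2*x if x > 2 else x for x in vals]

print(len(filtered), len(selective))   # 4 7
```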
Early in this tutorial we saw how to implement input from and output to the screen, using input() and print(). We now turn to the more general problem of dealing not with the screen but with files. The one thing to remember is that when reading from or writing to files, everything is interpreted as a string. We provide more details below, in each case starting with a manual version (which is simpler) and then turning to an automatic version (containing a loop).
The standard pattern we will follow here is: first we open the file for reading, then we read, then we close the file.
Before carrying out any of the steps, we inspect the file that we will be reading in Python:
%%writefile compl.in
x y
0.6 -6.2
1.6 9.3
1.8 16.
Note that the first line contains characters, while the lines after that contain numbers. This is a very common situation in the real world (where the first line often explains what the columns of numbers stand for).
We will first read in this file interactively, line by line, so we get a feel for things:
f = open("compl.in","r")
f.readline()
f.readline()
line = f.readline()
line.split()
f.readline()
f.readline()
f.close()
The first line shows us how we open a file in Python: we use the open() built-in function, passing in two strings separated by a comma (the first string is the name of the file we wish to open and the second string is "r" for reading). Intriguingly, we can assign the result of open() to a regular variable (which we here call f): this is consistent with the Python philosophy, according to which lists, functions, and files can all be treated as regular objects.
We then use f.readline() to read one line at a time, noticing that each line string ends with a newline character \n (which makes sense, since each line starts on a new... line). The result of each f.readline() is a string. For one of the lines we explicitly save the result of f.readline() to a variable and then use split() on that string variable, seeing that the output is a list of strings (the x and y values for that line, but in string format, with the newline character \n being discarded, just like the spaces were). When we’ve reached the end of the file, f.readline() returns an empty string. We then proceed to close the file using the close() method of the file object.
This was fine, but in real applications we don’t read files “by hand”, i.e., interactively. On the other hand, we often don’t really know beforehand how many lines are in a given file. Python has an idiomatic way of reading a file one line at a time (without needing to know how many lines there are in total in the file). This is the for line in f: idiom. Notice how smooth this is: files can be iterated with the standard for syntax applied to a file object. This underlines the versatility of Python loops (which apply to lists, tuples, dictionaries, and files among many other things).
What we are really interested in doing is reading in the values in the file and then printing them out to the screen with a catch: while we will be leaving the first column of numbers untouched, we will be applying a function to the numbers in the second column. We can accomplish this with the following code:
%%writefile fileread.py
def myf(x):
    return 3*abs(x)**1.7
f = open("compl.in","r")
line = f.readline()
for line in f:
    linelist = line.split()
    xstr = linelist[0]
    ystr = linelist[1]
    x = float(xstr)
    y = float(ystr)
    print(x, myf(y))
f.close()
First, we define the function we’ll apply to our y values. Then we open the file and read one line, not doing anything with it: we know that in this specific case the first line contains the x y labels, so we discard those. We then use a Python loop that gives us one line string at a time (we call this line but we could have called it something else). Inside the loop, we use split() to split the line into a list containing the numbers in string format. As we noted above, everything having to do with Python file reading and writing will be in string format and it is our responsibility to convert to other types if we need to (in our case we need to apply the function myf() to some floats, not to strings). Thus, we index into that list: the 0th element each time holds the x value while the next (and last) element holds the y value. We then convert those to floats, apply the function we want, and print out (to the screen) a new table of results. At the end, we close the file.
It’s interesting to note that in this example we used several intermediate variables (like linelist, xstr, and x) to clarify what’s going on. Some programmers might like to shorten the whole loop down to:
def myf(x):
    return 3*abs(x)**1.7
f = open("compl.in","r")
line = f.readline()
for line in f:
    print(float(line.split()[0]), myf(float(line.split()[1])))
f.close()
but this is certainly more difficult to read (for humans). An intermediate solution (neither too many throwaway variables nor too much on one line) is probably optimal here.
Note that in a scenario where you have only numbers in the input file (no characters, say, on the first line) you can use for line in f: to read and process every single line in the file (i.e., you wouldn’t need the first f.readline()).
Note, finally, that there also exists a less elegant way of reading in (and saving to a variable) the entire file at one go, using f.readlines() (notice the plural):
f = open("compl.in","r")
content = f.readlines()
print(content)
f.close()
This places the entire contents of the file into a list of strings. You should generally avoid this approach: the file may be very large, in which case reading it all in first is a wasteful thing to do. It’s much better to process one line at a time, as needed.
The standard pattern we will follow here is: first we open the file for writing, then we write, then we close the file. We carry out each one of these steps in turn:
f = open("dummy.out","w")
f.write("This is my first sentence.\n")
f.write("And this is my second sentence.\n")
line = "Yes, I started the previous sentence with ‘And’.\n"
f.write(line)
f.close()
Note, first, that since we want to write to this file, we are opening it with "w". We then used the write() method of our file object f to write one string at a time. Notice that we needed to include newline characters explicitly at the end of each string. Had we not done this, then the different strings would have all been written on the same line. In our example we also took the opportunity to store one of the strings in a variable called line, only to highlight that f.write() takes in a string as an argument. It is important to appreciate this fact: something like f.write(x) will give an error, unless x is a string.
A pattern that you will encounter repeatedly is to write a table of x and y values into a file (with a space between the two values in each line). Assume your x values are already stored in a list xs. Then you apply some complicated function to them to produce your y values. You then need to produce a string representing the line, which will consist of x converted to a string, then a space, then y converted to a string, and then a newline character. By looping through all the elements in xs you will write out all the needed lines to the file. All that’s left is to close the file.
%%writefile filewrite.py
def complicated(x):
    return 4*x**3 - 7
xs = [0.2*i for i in range(10)]
f = open("compl.out","w")
for x in xs:
    y = complicated(x)
    line = str(x) + " " + str(y) + "\n"
    f.write(line)
f.close()
Note how we used +’s to concatenate the strings, building up one long string that makes up a line consisting of two space-separated values and a newline character. This produces the following file:
%run filewrite.py
%load compl.out
0.0 -7.0
0.2 -6.968
0.4 -6.744
0.6000000000000001 -6.135999999999999
0.8 -4.952
1.0 -3.0
1.2000000000000002 -0.08799999999999653
1.4000000000000001 3.9760000000000026
1.6 9.384000000000004
1.8 16.328000000000003
The attentive reader will have realized that the string assignment:
line = str(x) + " " + str(y) + "\n"
could have been replaced by the alternative syntax:
line = "{0} {1}\n".format(x, y)
employing the format() method. Remember: there’s no way around needing to explicitly include the newline character in the output string.
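One possible benefit of format() here is control over the number of digits, which would tidy up entries like 0.6000000000000001 in the file shown above. The following is a sketch under the assumption that one and three digits after the decimal point are what we want; it builds the line strings in a list (called lines, an arbitrary name) rather than writing them to a file, but each string could be passed to f.write() as before.

```python
def complicated(x):
    return 4*x**3 - 7

xs = [0.2*i for i in range(10)]

# .1f and .3f request 1 and 3 digits after the decimal point, respectively
lines = ["{0:.1f} {1:.3f}\n".format(x, complicated(x)) for x in xs]

print(lines[3], end="")   # 0.6 -6.136
```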
In summary, we observe that whether doing file input or output, we had to call a built-in function: when reading a file we used float() to go from string to float, whereas when writing to a file we used str() to go from float to string.
You should keep in mind that there are other options beyond "r" and "w" when opening files, but we won’t be using them. Furthermore, we observe that here (as elsewhere in Python) we could have used single quotes instead, 'r' and 'w'.
It’s worth noting that a (more fool-proof) way of opening/closing a file for reading or writing involves the Python with statement. This has the advantage of closing the file properly even if something goes wrong while processing the file (in technical jargon, even if “an exception is raised”). For pedagogical clarity (i.e., in order to avoid a further level of indentation) we avoided the use of with above, but it’s recommended you employ it in your own work.
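As a rough sketch of what this looks like in practice, the code below writes a small table and reads it back using with. The file name table.dat and the use of a temporary directory are assumptions made only so that the example is self-contained and does not overwrite any of your files.

```python
import os
import tempfile

# Work in a temporary directory so that no existing file is overwritten.
with tempfile.TemporaryDirectory() as tmpdir:
    fname = os.path.join(tmpdir, "table.dat")   # hypothetical file name

    # The file is closed automatically when the with block ends,
    # even if an exception is raised inside it.
    with open(fname, "w") as f:
        f.write("x y\n")
        f.write("0.6 -6.2\n")
        f.write("1.6 9.3\n")

    xs, ys = [], []
    with open(fname, "r") as f:
        f.readline()                 # skip the line holding the labels
        for line in f:
            xstr, ystr = line.split()
            xs.append(float(xstr))
            ys.append(float(ystr))

    print(f.closed)   # True
print(xs, ys)         # [0.6, 1.6] [-6.2, 9.3]
```

Note that there is no explicit f.close() anywhere: leaving the indented with block is what closes the file.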
In this tutorial and in the accompanying book, we’ve been using Python 3: this is the latest version, to which new features are still being added. There are several differences between the two versions, but for our purposes the main points are that in Python 2:
Of these, the most important two differences for us are: 1) print() being a function, and 2) division of integers giving a float. You can explicitly bring those Python 3 features into Python 2 code with the appropriate imports, as we now explain.
The next few lines are the only ones where we assume that you are working in Python 2. In short, placing the following two lines at the start of your program allows your Python 2 code to be Python 3-compatible.
from __future__ import print_function
from __future__ import division
In this tutorial and in the book which it accompanies, all our Python codes and results correspond to Python 3. Python 2 is actually no longer supported as of January 1st, 2020. If, for some reason, you need to use Python 2 and would still like to follow along, then the above two import statements should be all you need for the core Python material.
Modify the following program:
f1,f2 = 1,1
while f2<1000:
    print(f2)
    f1,f2 = f2,f1+f2
so that it now prints out not the numbers up to 1000, but the first 100 such numbers.
You should use a Python dictionary wordtocount to accomplish the following tasks. First type in (or paste in) the following assignment:
sentence = "this is a rambling sentence that simply goes on and on and on and just simply will not stop that is just the way things are"
Now:
Create and print out a table of word populations. That means that you should show in the first column a word from this sentence and in the second column the number of times that specific word appears in the sentence. In your printout each word should appear only once. Hint: you might find it helpful to employ the get() method of Python dictionaries: wordtocount.get(mykey,0) returns wordtocount[mykey] if mykey is found and 0 otherwise.
Now print out lines saying how many words in sentence appear once, how many twice, and how many thrice. (You should have three lines of output, one for each value of the population). Hint: we’re not asking you to check which words appear, e.g., thrice, only to print out how many words appear thrice.
Check that you haven’t left out any words (or overcounted), namely, make sure that the total number of words in sentence (whether distinct or not) is equal to the number of words that appear once added to two times the number of words that appear twice, and three times the number of words that appear thrice.
Write a Python program to sum up the integers 1, 3, 5, ..., 999.
Write a Python program to sum up the integers 1, 3, 5, ... up to a large odd number that the user provides.
Write a Python program to sum up the integers 1, 3, 5, ... up to each large odd number that the user provides. You should define and call a function in your code. The program should repeat over and over (with the user inputting a large odd number each time) and should terminate when the user inputs an even number.
Rewrite the previous program so that it reads several odd numbers (i.e., at least 4-5 of them) from a file odd_in.dat (one number per line) and then writes out to a file odd_out.dat as follows: each line contains a counter (1, 2, and so on), next to it the odd number that was input, and next to that the sum 1, 3, 5, ... up to the odd number that was input.
In addition to your code, make sure to also show the specific odd_in.dat you read from and the specific odd_out.dat you produced.
Write a Python program that reads in from a file called dummy1.dat which has the following content:
hello world 21.2 12.1
how do 1.7 123.5
you obfuscate 0.004 19.5
(but your code should also work if it contained many more similar lines). Now write out to a file dummy2.dat as follows: for each line in the input file you should print out a corresponding line containing a counter (1, 2, and so on), next to it the second number from the input line, and next to that the exponential of that number. After that, the program should write out to dummy2.dat the sum of all the numbers contained in the input file dummy1.dat (regardless of whether they were first or second on a given line).
Reading a table of space-separated x & y values is a sufficiently common task that it makes sense to create a function to carry it out in a consistent way each time.
Create a function called readtable() that takes in only a string containing the file name and returns two lists of numbers, one for the xs and one for the ys. Feel free to re-use portions of fileread.py when writing your new function.
Now write a new function that carries out the same task but uses a list comprehension to read the file (once). Just like the previous one, this function should return two lists, one for the xs and one for the ys.
This problem deals with the function $f(x) = e^{-x^4}$. Print out a table of x and f(x) values, where the x goes from -1 to +1 in steps of 0.1. In your output, the x values should have only one digit after the decimal point, whereas the f(x) values should only have 5 digits after the decimal point.
Note: Your implementation should employ a user-defined function, list comprehensions, zip(), as well as fancy string formatting.
Alex Gezerlis -- http://www.numphyspy.org