Chapter 1: Introduction to Python
Python is one of the few programming languages that has truly reshaped how people think about code. Originally created in the late 1980s by Guido van Rossum as a simple scripting language, it has since evolved into one of the world’s most influential and versatile tools, powering everything from web servers and data analysis pipelines to artificial intelligence systems and automation scripts. Its enduring success comes from a rare combination of clarity, consistency, and sheer practicality.
At its heart, Python was designed to let programmers focus on the problem being solved, not on the syntax of the language itself. Its clean, readable structure makes code feel more like prose than machinery, and its rich standard library provides an enormous range of functionality right out of the box. Whether you’re writing a ten-line automation task or a production-scale application, Python’s philosophy of simplicity and explicitness stays constant.
This book assumes you already know the basics of programming: how variables work, what loops do, and why functions matter. We’ll start with the practical groundwork, explaining how to set up your environment and run simple scripts, then gradually move through the language’s core ideas and idioms. The goal is not to overwhelm, but to help you see how Python’s design encourages clear, efficient thought.
As we progress, you’ll learn not only the rules of the language but also the reasoning behind them. Python is often described as a “batteries included” language, meaning that most of what you need for everyday programming is already built in. Understanding how those parts fit together will make you a more confident and expressive developer in any field you apply it to.
By the end of this first chapter you’ll be comfortable running Python interactively, executing scripts, creating and using virtual environments, and following the most important stylistic conventions described in PEP 8, the style guide that shapes how Python code looks and reads. Each of these topics will be introduced in the sections that follow.
Who this book is for
This book is written for anyone who already understands the fundamentals of programming but wants to develop a fluent, practical command of Python. If you’ve written code in another language, whether that’s JavaScript, C, C++, Java, PHP, or anything else, you’ll find the syntax familiar, yet the philosophy refreshingly different. Python rewards clear thinking and concise expression, and once you become comfortable with its approach, it often becomes the language you reach for first.
It’s also intended for readers who may have already experimented with Python but now want a deeper and more complete reference, one that explains not just how things work, but why they work that way. You’ll see not only examples of syntax, but also consistent, real-world demonstrations that make each feature memorable and immediately useful.
Unlike beginner texts that start from absolute zero, this book won’t stop to define what a variable or an array is, or explain the concept of a loop in detail. Instead, it focuses on Python’s distinctive behaviour, its dynamic typing, its indentation-based structure, its built-in data types, and its readable, expressive syntax. Each topic is shown in context, with examples that can be adapted to real projects rather than artificial exercises.
If you already use Python occasionally and need a single, well-organised guide that can serve as both tutorial and reference, you’ll find that purpose here as well. The chapters are arranged progressively but can also be used independently, making the book useful both for straight-through reading and for quick look-up when you’re solving a specific problem.
Above all, this book is for working programmers who value clarity, portability, and speed of learning: those who want to master Python as a professional tool for building reliable, elegant code across domains ranging from scripting and automation to data processing and web development.
What Python is and why it matters
Python is a high-level, general-purpose programming language created with one overriding goal: to make code easy to read, write, and reason about. Where many languages prioritise performance or syntax compactness, Python places human understanding at the centre of its design. The result is code that often looks like structured English, direct, uncluttered, and immediately intelligible to anyone who knows how to think computationally.
Originally developed by Guido van Rossum in the late 1980s, Python began as a simple scripting tool to automate repetitive system tasks. Over the decades it has grown into one of the most influential programming languages in the world, used in web development, data science, machine learning, scientific computing, network programming, and even education. Its success lies in its balance between power and approachability: complex ideas can be expressed cleanly, while beginners can start writing useful programs in minutes.
Python’s guiding philosophy is captured in the famous “Zen of Python”, a set of aphorisms that express the language’s values. Among them are lines such as “Readability counts”, “Simple is better than complex”, and “There should be one (and preferably only one) obvious way to do it.” These principles have shaped Python’s evolution and help explain why so many developers, even those fluent in other languages, find themselves preferring its straightforward, unambiguous style.
Another reason Python matters is its enormous ecosystem. The standard library that ships with every installation provides modules for tasks ranging from file handling to web access and data compression. Beyond that, the open-source community contributes hundreds of thousands of third-party packages through the Python Package Index (PyPI). Whether you need to build a website, analyse data, automate a workflow, or experiment with artificial intelligence, chances are the building blocks already exist.
Perhaps most importantly, Python’s accessibility has opened programming to millions of new learners and professionals from other fields (scientists, artists, teachers, analysts) who now use code as a creative instrument. Its emphasis on readability, community, and versatility continues to make it one of the most valuable skills a modern developer or problem solver can possess.
Versions and environment setup (3.10 - 3.13)
At the time of writing, the main supported versions of Python begin with 3.10 and go through 3.13. A new version, 3.14, has just been released and is fully stable.
We begin at Python 3.10 because earlier versions contain significant syntax or language-feature differences that make them less suitable for a modern-style reference. For example, structural pattern matching was introduced in 3.10 and becomes a core idiom in later versions. By targeting 3.10 and above we ensure the examples and idioms you learn are current and forward-compatible.
Setting up your environment is straightforward. First you install a recent Python release (3.10, 3.11, 3.12 or 3.13) from the official source (python.org/downloads). Then you create a virtual environment dedicated to your project so you keep dependencies isolated, avoid version conflicts and make your code easier to manage.
Creating and using virtual environments
A virtual environment is a self-contained workspace that holds its own copy of the Python interpreter and any packages you install. This separation prevents different projects from interfering with one another. Without it, installing or upgrading a library for one project could accidentally break another. Virtual environments also make your setup reproducible, since the exact dependencies for each project are stored together.
To create a new virtual environment, open a terminal or command prompt and run:
python -m venv myenv
This command creates a folder named myenv containing everything needed for an isolated Python installation. The next step is to activate it so your shell uses the environment’s interpreter and package paths instead of the global ones.
On Windows, activate it with:
myenv\Scripts\activate
On macOS or Linux, use:
source myenv/bin/activate
Once activated, your command line prompt will usually show the environment’s name in parentheses, indicating that you are working inside it. From that point onward, any packages you install with pip will go into this environment alone:
pip install requests
When you finish working, you can return to the system’s default Python by deactivating the environment:
deactivate
Each project can have its own environment with its own dependencies, ensuring consistency between development, testing, and deployment. Many editors, including VS Code and PyCharm, automatically detect and use these environments once created, so managing them quickly becomes part of a smooth, everyday workflow.
Once you have a virtual environment active you install any required packages via pip, configure your IDE or editor as needed, and verify your setup by running a simple script or using the interactive REPL. This ensures that you are ready to work with the language and tools before diving into the chapters ahead.
Testing your setup
After activating your virtual environment and installing any packages you need, it’s good practice to verify that everything works as expected. The quickest way is to open the interactive REPL (Read–Eval–Print Loop) by typing python at the command prompt. You’ll see a short banner with version information and a prompt symbol >>>. From there, you can enter simple expressions to confirm that Python is running correctly:
>>> print("Hello, Python!")
Hello, Python!
If you see this output, your installation and environment are working properly. To leave the REPL, type exit() or press Ctrl + Z followed by Enter (Windows) or Ctrl + D (macOS/Linux).
You can also test your setup by saving a short script in a file named test.py inside your project folder:
print("Your Python environment is ready!")
Run it with:
python test.py
If the message appears, your environment, interpreter, and editor are all correctly configured and ready for the examples in the chapters that follow.
Running scripts and the REPL
Python can be used in two main ways: interactively through the REPL (Read–Eval–Print Loop), or by running complete scripts saved in files. The REPL is ideal for quick experimentation and testing ideas, while scripts let you build full programs that can be reused, shared, and executed on demand.
To start the REPL, open a terminal or command prompt and type python. You’ll see a short banner showing the Python version, followed by the prompt >>>. At this prompt you can type expressions or statements and see their results immediately. For example:
>>> 3 * 7
21
>>> print("Hello, Python!")
Hello, Python!
>>> for i in range(3):
...     print(i)
0
1
2
The REPL is especially useful for trying out syntax, checking small code fragments, or exploring built-in functions and modules. You can use the help() function for instant documentation, or import a module and experiment with its contents interactively:
>>> import math
>>> help(math)
>>> math.sqrt(25)
5.0
When you want to create a complete program, write your code in a text file with a .py extension, for example hello.py:
print("Hello from a Python script!")
Save the file, then run it from the command line using:
python hello.py
The program runs and exits when finished, printing its output to the terminal. You can use any text editor or IDE that supports plain text to create scripts — Visual Studio Code, PyCharm, Sublime Text, or even a simple editor like Notepad. The REPL and script execution use the same interpreter and syntax, so anything that works interactively will also work in a saved file.
Editing and formatting conventions (PEP 8 essentials)
Python’s style guide, known as PEP 8 (Python Enhancement Proposal 8), defines the conventions that make Python code clean, consistent, and readable. These are not strict rules enforced by the interpreter, but they are widely followed across the entire Python community. Writing code that adheres to PEP 8 helps ensure that other developers can understand and maintain your work easily, and it makes your own code clearer when you return to it later.
PEP 8 recommends using four spaces per indentation level rather than tabs. Mixing tabs and spaces can cause indentation errors, so most editors are configured to insert spaces automatically when you press the Tab key. Indentation is not just a matter of style in Python — it defines the structure of your code, so consistency is essential.
Line length should generally stay at or below 79 characters, which keeps code readable on narrow displays and avoids horizontal scrolling. Blank lines should be used sparingly to separate logical sections, such as between function definitions or class declarations. You should also leave a single space on either side of assignment and comparison operators to improve legibility:
# Good style
total = price + tax
# Poor style
total=price+tax
When naming things, follow Python’s clear naming patterns. Use snake_case for functions and variables (calculate_area), PascalCase (also known as CamelCase) for classes (DataProcessor), and UPPER_CASE for constants (MAX_SIZE). Avoid using single letters except for simple loops or short-lived variables. Good names act as self-documentation, reducing the need for comments that merely repeat what the code already says.
Comments, when used, should explain the why rather than the what. PEP 8 suggests full sentences with proper punctuation, starting with a capital letter. For longer explanations, use docstrings (triple-quoted strings placed immediately after a function, class, or module definition). These can be accessed later through the built-in help() function.
def greet(name):
    """Return a friendly greeting for the given name."""
    return f"Hello, {name}!"
Most editors and IDEs can automatically check your code for PEP 8 compliance. Tools such as flake8 or black integrate into your workflow to highlight or even fix formatting issues automatically, keeping your codebase consistent with minimal effort.
Chapter 2: Foundations of the Python Language
Every programming language has its own rhythm—its way of arranging thought into structure, logic into form. Python’s rhythm is unusually clean. It invites clarity not by constraint, but by design. Its rules are simple enough to learn in a day, yet subtle enough to reward a lifetime of practice.
At its heart, Python treats readability as a discipline. Code that looks simple is often simpler to reason about, and Python’s syntax enforces that discipline from the start. Instead of symbols and ceremony, it prefers whitespace and meaning. The structure you see is the structure that runs.
Understanding this foundation is about more than just learning rules; it’s about learning how Python thinks. Each line reflects an underlying philosophy that explicit is better than implicit, names carry weight, and everything lives within a clear and logical space. When you see Python code, you are reading a small story written in order and intention.
This chapter begins to shape that awareness. It explores how Python arranges ideas, how names and values interact, and how its scoping rules give coherence to what might otherwise seem fluid. These ideas form the language’s backbone, defining how Python code behaves at every level.
By mastering these foundations, you begin to see why Python feels both rigorous and humane, why its syntax disappears when your thoughts align with it, and why those who learn it rarely return to languages that speak in harsher tones.
Statements vs. expressions
Every line of Python does something, but not all lines do it in the same way. Some lines perform an action; others produce a value. The difference lies between statements and expressions.
An expression is something that can be evaluated to yield a result. It has a value that Python can use within a larger operation. A statement is a complete instruction—something Python executes rather than evaluates. You can think of an expression as a phrase, and a statement as a full sentence.
When you type 3 + 4, Python computes and returns the value 7. That is an expression. When you write print(3 + 4) on a line of its own, Python performs an action: it displays a result. The call is used as a statement (strictly, an expression statement whose result, None, is discarded). The first produces data; the second produces an effect.
Some expressions can appear inside statements, and many statements depend on expressions to do their work. The line between them is functional rather than absolute. For example, an if statement uses an expression to decide which block of code to run, but the if itself is not an expression as it doesn’t yield a value. Understanding this distinction helps when you read or write code that mixes both ideas. Expressions build the flow of data, and statements shape the flow of control. Together, they form the pulse of Python’s execution model.
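The distinction is easiest to see in code. In this short sketch, the conditional *expression* yields a value that can be assigned, while the if *statement* only directs which block runs:

```python
# Expressions evaluate to a value that can be used elsewhere.
total = 3 + 4  # "3 + 4" is an expression; the whole line is a statement
parity = "even" if total % 2 == 0 else "odd"  # conditional expression yields a value

# Statements perform actions; an if statement directs control flow
# but does not itself produce a value.
if total > 5:
    message = f"{total} is {parity} and greater than 5"
else:
    message = f"{total} is {parity}"

print(message)  # 7 is odd and greater than 5
```

Notice that the conditional expression can sit on the right-hand side of an assignment, while the if statement cannot: statements are executed, not evaluated.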
Comments and docstrings
Python code is meant to be read by people as well as machines. Comments and docstrings are how you speak directly to those people, your future self included. They do not change what the program does, but they shape how it is understood.
A comment begins with a hash mark (#) and continues to the end of the line. Python ignores it completely. Comments are best used to explain why something is done, not what the code is doing. Clear code often needs little commentary, but when reasoning or design choices are not obvious, a short note can make all the difference.
A docstring, short for documentation string, lives inside triple quotes ("""like this""") and is attached to a module, class, or function. Unlike comments, docstrings are stored at runtime and can be accessed through the help() system or the .__doc__ attribute. They serve as part of the program’s living documentation:
def greet(name):
    """Return a friendly greeting.

    Args:
        name: The person’s name as a string.

    Returns:
        A message that greets the person.
    """
    return f"Hello, {name}!"
Here the triple-quoted string describes what the function does, what its parameter means, and what value it returns. You can view it directly in an interactive session:
>>> help(greet)
Help on function greet in module __main__:

greet(name)
    Return a friendly greeting.

    Args:
        name: The person’s name as a string.

    Returns:
        A message that greets the person.
Because docstrings are visible to tools, they are often written in a consistent format. The first line gives a short summary; following lines may describe arguments, return values, or examples. This convention lets others understand a function without reading its full implementation.
Thoughtful commenting and concise docstrings create a dialogue between code and reader. They remind us that programs are not just instructions for a computer—they are also explanations for other humans who will one day need to know what we meant.
Variables and assignment
In Python, variables are not boxes that contain values. They are names that point to objects held elsewhere in memory. When you assign a value to a variable, you are binding a name to an object, not copying the object itself. This distinction shapes how data behaves throughout a program.
Consider the line x = [1, 2, 3]. The name x now refers to a list object stored in memory. If you then write y = x, both x and y point to the same list. Changing the contents through one name affects what the other sees. Assignment links names; it does not duplicate data.
x = [1, 2, 3]
y = x
y.append(4)
print(x) # [1, 2, 3, 4]
Immutable objects such as numbers, strings, and tuples behave differently because they cannot be changed once created. When you “modify” them, Python actually creates a new object and binds the name to that new object. This is why a = 5 followed by a = a + 1 does not alter the original number 5. It simply makes a point to 6 instead.
Understanding this binding model helps avoid subtle bugs, especially when working with mutable objects like lists or dictionaries. Variables are not containers; they are labels that direct Python to the current object associated with a given name.
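You can observe this binding model directly with the built-in id() function, which returns an object’s identity. A small sketch: rebinding an immutable value changes the identity behind the name, while mutating a list in place does not.

```python
a = 5
before = id(a)  # identity of the int object 5
a = a + 1       # rebinds the name to a new int object, 6
after = id(a)
print(before == after)  # False: the name now points at a different object

# Mutable objects behave differently: mutation keeps the same identity.
nums = [1, 2, 3]
same = id(nums)
nums.append(4)           # modifies the list in place
print(id(nums) == same)  # True: still the very same list object
```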
Dynamic typing and name binding
Python is a dynamically typed language, which means that names are not bound to a fixed type. A variable’s type is determined by the object it refers to at a given moment, and that binding can change freely as the program runs. This flexibility makes Python expressive and concise, though it also demands care in reasoning about what each name refers to.
When you write x = 10, the name x is bound to an integer object. If you later write x = "ten", the same name now points to a string. The original integer still exists somewhere in memory until Python’s garbage collector reclaims it, but x no longer refers to it. The binding between names and objects is fluid.
x = 10
print(type(x)) # <class 'int'>
x = "ten"
print(type(x)) # <class 'str'>
This design makes Python adaptable. You can reuse variable names in different contexts, and functions can operate on any object that behaves as expected, regardless of its type. This approach, known as “duck typing,” emphasizes behavior over declaration. If an object acts like a list or a file, Python treats it as one.
Dynamic typing encourages flexibility but also calls for discipline. Because names can change meaning, clear naming and consistent conventions become vital. The freedom to bind any object to any name is powerful, but with that freedom comes the responsibility to write code that remains predictable and clear to others.
count = 5
count = count + 1 # count is now 6
count = "six" # name now points to a string
total = count + 4 # TypeError: can only concatenate str (not "int") to str
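Duck typing means a function can accept any object that supports the operations it actually uses. As a sketch, the describe() function below (a hypothetical helper, not part of the standard library) works on any object that supports len(), regardless of its declared type:

```python
def describe(container):
    """Work on any object that supports len(), whatever its type."""
    return f"{type(container).__name__} with {len(container)} items"

print(describe([1, 2, 3]))          # list with 3 items
print(describe("hello"))            # str with 5 items
print(describe({"a": 1, "b": 2}))   # dict with 2 items
```

No type declarations are needed: if the object “quacks” by answering len(), the function accepts it.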
Namespaces and scope
Every name in Python lives inside a namespace, which is a mapping between names and the objects they reference. You can think of a namespace as a labeled environment, keeping track of what each name currently means. Python manages several such environments at once, which together determine how and where a name can be found.
The concept of scope describes the visibility of a name within a program. When Python encounters a name, it looks for it in a specific order known as the LEGB rule: Local, Enclosing, Global, and Built-in. This sequence defines how Python resolves names from the most specific context to the most general.
When you use a name inside a function, Python first checks the Local scope (the function’s own variables). If it’s not found there, Python looks one level out, in any Enclosing scopes (functions that contain this one). If still not found, it checks the Global scope (module-level names). Finally, it searches the Built-in scope, which holds standard names like len and print.
x = "global"

def outer():
    x = "enclosing"
    def inner():
        x = "local"
        print(x)
    inner()

outer()  # prints "local"
In this example, each assignment of x creates a new binding within a different scope. When print(x) runs, Python finds x in the local scope of inner() and stops searching. If that line were removed, Python would look outward through the enclosing, global, and built-in namespaces until it found a match.
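When you want a function to rebind a name in an outer scope rather than create its own local binding, Python provides the nonlocal and global statements. A short sketch, following the same layered structure as the example above:

```python
x = "global"

def outer():
    x = "enclosing"
    def inner():
        nonlocal x              # rebind the name in outer()'s scope
        x = "changed by inner"
    inner()
    return x

result = outer()
print(result)  # changed by inner
print(x)       # global  (the module-level name is untouched)
```

Without the nonlocal statement, the assignment inside inner() would simply create a new local name and leave the enclosing binding unchanged.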
Chapter 3: Data Types and Structures
Everything in Python is an object, and every object belongs to a type. Understanding those types (and how they interact) forms the basis of writing reliable, elegant code. Data in Python is not just stored; it carries meaning, behaviour, and relationships that shape how programs work and communicate.
This chapter explores how Python represents information. From simple numeric values to collections that can hold other objects, each type contributes to a flexible and expressive model of data. The same principles that make Python readable also make its data structures intuitive: they feel natural to use because they reflect how people already think about lists, mappings, and sequences.
Unlike many languages that require you to declare variable types in advance, Python determines them at runtime. This dynamic model lets you work quickly and focus on what your code expresses rather than how it must be declared. But it also rewards understanding as knowing what kind of object you are dealing with helps you write cleaner, faster, and more predictable code.
Built into Python are a small number of powerful data types that cover almost every use case. They can be combined, nested, or transformed to model complex structures with minimal effort. Mastering them is essential, because these types form the vocabulary of all Python programs.
Core types
Python’s built-in types cover numbers, text, binary data, sequences, mappings, sets, and the special “no value” object. These are the types you will use every day, and they are available without any imports. Knowing their literal forms and basic behaviour makes code shorter, clearer, and faster to read.
- Numbers: integers (int), floating-point numbers (float), and complex numbers (complex). Literals are direct and readable: 42, 3.14, 1+2j. Arithmetic mixes types as needed, with results chosen sensibly for the operation.
- Booleans: True and False. These behave like integers in arithmetic contexts (True equals 1, False equals 0), but are intended for logic, conditions, and control flow.
- Text: str represents Unicode text. Literals use single or double quotes, with escapes for special characters: "hello", 'π', "line\nbreak". Strings are immutable, so operations create new strings rather than altering existing ones.
- Binary: bytes and bytearray. bytes is immutable, bytearray is mutable. These types hold raw 8-bit data and are essential for files, networking, and protocols.
- Sequences: list (mutable), tuple (immutable), and range (an efficient sequence of integers). Lists use square brackets, tuples use parentheses (or commas), ranges come from range(start, stop, step).
- Mappings: dict stores key–value pairs with fast lookup. Literals use braces with colons between keys and values: {"a": 1, "b": 2}. Keys must be hashable, values can be any object.
- Sets: set (mutable) and frozenset (immutable) represent unordered collections of unique elements, ideal for membership tests and deduplication.
- Absence: None is a singleton that represents the lack of a value. It has its own type (NoneType) and is commonly used as a default or sentinel.
# Literals and basic forms
n = 42 # int
x = 3.5 # float
z = 1+2j # complex
flag = True # bool
text = "hello" # str
data = b"\xDE\xAD" # bytes
buf = bytearray(b"hi") # bytearray
nums = [1, 2, 3] # list
point = (10, 20) # tuple
r = range(5) # 0..4
caps = {"A": 65, "B": 66} # dict
letters = {"a", "b", "c"} # set
frozen = frozenset({1, 2})
missing = None # NoneType
Prefer literal forms where they exist ([] instead of list(), {} for dicts with items, () or comma-separated tuples). Use constructors when converting types or building from iterables.
Numbers, strings, tuples, frozensets, bytes, and None are immutable, so operations return new objects instead of altering originals (see the following section).
None and truthiness
None is Python’s way of expressing “nothing here.” It is a special singleton object used to represent the absence of a value, the result of a function that returns nothing, or a placeholder meaning “not yet assigned.” Only one instance of None exists in any running program, and comparisons should always use is rather than ==:
x = None
if x is None:
    print("x has no value")
Always use is None and is not None when testing for the absence of a value. Comparing with == None can lead to subtle bugs if an object defines its own equality method.
Unlike 0, an empty string, or an empty list, None is not a value that evaluates to false by content because it simply represents the absence of content altogether. Its type, NoneType, exists only for this single object. Python uses a clear and consistent rule for deciding what counts as “true” or “false” in a Boolean context. Any object can be tested for truth value using an if or while statement. By default, the following evaluate as false:
- None
- False
- Zero of any numeric type (0, 0.0, 0j)
- Empty sequences or collections ("", [], (), {}, set())
Everything else is considered true. This behaviour, known as truthiness, makes it easy to write natural, readable conditions:
name = ""
items = [1, 2, 3]
if not name:
    print("Name is missing")
if items:
    print("There are", len(items), "items")
Here, the empty string counts as false, while the non-empty list counts as true. You can make your own classes participate in this system by defining the __bool__() or __len__() method to control how they evaluate in Boolean contexts.
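As a sketch, a class named Basket (a hypothetical example) can define __len__ so that empty instances test as false, just like the built-in collections:

```python
class Basket:
    """A tiny container whose truthiness follows its item count."""
    def __init__(self):
        self.items = []

    def __len__(self):
        return len(self.items)  # used by bool() when __bool__ is absent

b = Basket()
print(bool(b))      # False: len(b) == 0
b.items.append("apple")
print(bool(b))      # True: len(b) == 1
```

When both __bool__() and __len__() are defined, __bool__() takes precedence; defining neither makes every instance truthy.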
An empty collection and None both evaluate as false, but they do not mean the same thing. Use explicit checks when the distinction matters.
Python sequences
Sequences are ordered collections of items that can be accessed by position. In Python, the main built-in sequence types are lists, tuples, and ranges. They share a common interface: you can index, slice, iterate, and measure them with len(). This consistency makes Python easy to learn and its data structures predictable.
Lists
A list is a mutable sequence that can hold any combination of objects. Lists are created with square brackets and can grow or shrink dynamically. They are ideal for collections that will change during execution.
fruits = ["apple", "banana", "cherry"]
print(fruits[0]) # apple
fruits.append("date") # add to the end
fruits[1] = "blueberry" # modify in place
print(fruits) # ['apple', 'blueberry', 'cherry', 'date']
Lists support slicing, concatenation, repetition, and membership tests:
numbers = [1, 2, 3, 4, 5]
print(numbers[1:4]) # [2, 3, 4]
print(numbers + [6]) # [1, 2, 3, 4, 5, 6]
print(3 in numbers) # True
List methods such as append(), extend(), and remove() modify the list in place and return None. This avoids confusion between mutating and creating new sequences.
Tuples
A tuple is an immutable sequence, written with parentheses or commas. Once created, its contents cannot change. Tuples are often used to group related data or return multiple values from a function.
point = (3, 4)
print(point[0]) # 3
# unpacking
x, y = point
print(x, y) # 3 4
Because tuples are immutable, they can serve as dictionary keys or elements of sets, unlike lists. They also offer a small performance advantage when creating fixed collections.
Note that (3) is just the number 3, but (3,) is a one-element tuple; the comma, not the parentheses, creates the tuple.
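Because tuples are hashable (when their elements are), they can serve as dictionary keys, something lists cannot do. A minimal sketch mapping coordinate pairs to labels:

```python
# Tuples can key a dictionary; lists cannot.
grid = {(0, 0): "origin", (3, 4): "point A"}
print(grid[(3, 4)])  # point A

# Attempting the same with a list fails because lists are unhashable.
try:
    grid[[0, 0]] = "oops"
except TypeError as err:
    print("unhashable:", err)
```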
Ranges
A range represents a sequence of integers, often used for looping. It doesn’t store all numbers in memory but generates them on demand, making it efficient even for large ranges.
for i in range(3):
    print(i)  # 0, 1, 2
r = range(2, 10, 2)
print(list(r)) # [2, 4, 6, 8]
Ranges can be sliced, compared, and tested for membership, but they are immutable and support only integers. They provide a predictable, memory-friendly way to represent arithmetic progressions.
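A range object’s memory footprint stays tiny no matter how many integers it represents, because values are computed on demand. A quick sketch using sys.getsizeof (exact byte counts vary by interpreter, but the two objects are the same size):

```python
import sys

small = range(10)
huge = range(10_000_000)

# Both range objects store only start, stop, and step,
# so their sizes are identical regardless of length.
print(sys.getsizeof(small) == sys.getsizeof(huge))  # True

# Membership testing for integers is computed arithmetically,
# without iterating through the sequence.
print(9_999_998 in huge)  # True
print(len(huge))          # 10000000
```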
Use range() instead of creating lists when you just need to iterate through numbers. It’s faster and uses almost no memory, since values are generated as needed.
Together, lists, tuples, and ranges form the backbone of Python’s sequence model. They share common features like indexing and iteration but differ in mutability and purpose. Lists are flexible, tuples are fixed, and ranges are efficient numeric sequences.
Sets and dictionaries
While sequences maintain order and position, sets and dictionaries organize data by content and association. They are both built on Python’s fast hash table implementation, which gives them efficient lookups and membership tests. Sets store unique elements, while dictionaries store key–value pairs that map one piece of data to another.
Sets
A set is an unordered collection of unique items. Duplicate elements are automatically removed, and membership checks are extremely fast. Sets are created with braces or the set() constructor.
colors = {"red", "green", "blue"}
colors.add("yellow")
colors.add("red") # ignored (already present)
print(colors) # e.g. {'red', 'green', 'blue', 'yellow'} (order may vary)
print("green" in colors) # True
Sets support mathematical operations such as union, intersection, and difference, making them powerful tools for comparing or combining collections:
a = {1, 2, 3, 4}
b = {3, 4, 5}
print(a | b) # union: {1, 2, 3, 4, 5}
print(a & b) # intersection: {3, 4}
print(a - b) # difference: {1, 2}
Because sets are unordered, their contents may appear in different orders when printed. If you need a fixed, immutable version, use frozenset, which behaves like a regular set but cannot be changed.
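A brief sketch of frozenset in use; because it is hashable, it can serve where a plain set cannot:

```python
fs = frozenset({"red", "green", "blue"})
print("red" in fs)      # True

# Mutation is not allowed:
# fs.add("yellow")      # AttributeError

# A frozenset can be a dictionary key, unlike a regular set
palettes = {frozenset({"red", "green"}): "warm-cool mix"}
print(palettes[frozenset({"green", "red"})])  # warm-cool mix
```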
Dictionaries
A dict (dictionary) maps keys to values. Each key must be unique and hashable, while values can be of any type. Dictionaries are created with braces containing key–value pairs separated by colons.
person = {"name": "Ada", "age": 36, "job": "engineer"}
print(person["name"]) # Ada
person["age"] = 37 # update
person["city"] = "London" # add new key
print(person)
Accessing a missing key raises a KeyError, but the get() method lets you supply a default value instead:
print(person.get("country", "unknown")) # unknown
Dictionaries are iterable, returning keys by default. You can also iterate over values or key–value pairs with .values() and .items():
for key, value in person.items():
    print(key, "→", value)
As of Python 3.7, dictionaries preserve insertion order, so items appear in the same sequence they were added. This makes them suitable for structured data, configuration objects, and lightweight records.
A dictionary comprehension builds a dict in a single expression: {x: x**2 for x in range(5)} creates {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}.
Together, sets and dictionaries complete Python’s core collection types. Sets handle uniqueness and membership, while dictionaries express relationships between pieces of information. Both are fundamental tools for organising and reasoning about data efficiently.
Type conversion and coercion
Python makes it easy to convert between types when it makes sense to do so. This process, known as type conversion, allows data to move smoothly between contexts. Conversion can be explicit, where you call a function like int() or str(), or implicit, where Python automatically adjusts types during an operation, otherwise known as coercion.
Explicit conversion
Explicit conversion (or casting) uses built-in constructor functions to create a new object of the desired type. This is the safest and clearest approach, because the transformation is intentional and visible to readers of the code.
x = "42"
y = int(x) # string → int
z = float(y) # int → float
s = str(z) # float → string
print(x, y, z, s) # 42 42 42.0 42.0 (str, int, float, str)
Note that int(3.9) truncates toward zero; it does not round to the nearest whole number. Use round() when rounding is required.
You can also convert sequences and collections using constructors like list(), tuple(), set(), or dict() where appropriate:
nums = (1, 2, 3)
lst = list(nums)
print(lst) # [1, 2, 3]
letters = "abc"
print(set(letters)) # {'a', 'b', 'c'}
Implicit conversion (coercion)
In certain arithmetic operations, Python automatically converts operands to a compatible type so that the calculation can proceed. For example, combining an integer and a float produces a float result, ensuring that precision is not lost.
a = 5
b = 2.0
c = a + b
print(c) # 7.0
print(type(c)) # <class 'float'>
This kind of conversion is limited to compatible numeric types. Python does not automatically mix unrelated types, such as adding a string and an integer. Attempting to do so raises a TypeError rather than producing an ambiguous result.
age = 30
msg = "Age: " + str(age) # correct
# "Age: " + age # TypeError
Type conversion and coercion reflect Python’s balance between flexibility and safety: the language automates simple cases but leaves ambiguous ones to the programmer, ensuring that conversions remain deliberate wherever meaning could be lost.
Copy vs. reference
When you assign a variable in Python, you are not copying a value. Instead you are creating a new reference to an existing object. This distinction is fundamental to understanding how mutability works. Mutable objects (like lists, dictionaries, and sets) can be changed through any reference that points to them, while immutable objects (like strings, tuples, and numbers) cannot be altered once created.
a = [1, 2, 3]
b = a
b.append(4)
print(a) # [1, 2, 3, 4]
In this example, a and b both point to the same list. The append() call changes the underlying object, so both names reflect the update. This is a shared reference, not a duplication. To create an independent copy of a mutable object, you must make an explicit copy. There are two kinds of copying: shallow and deep. A shallow copy creates a new outer object but keeps references to the same inner objects. A deep copy creates new copies of everything recursively.
import copy
original = [[1, 2], [3, 4]]
shallow = original.copy()
deep = copy.deepcopy(original)
original[0][0] = 99
print(original) # [[99, 2], [3, 4]]
print(shallow) # [[99, 2], [3, 4]] (inner lists shared)
print(deep) # [[1, 2], [3, 4]] (fully independent)
Most of the time, a shallow copy is enough—use methods like list.copy() or slicing (a[:]) to duplicate lists, and dict.copy() for dictionaries. When your data structures contain nested mutable objects, use copy.deepcopy() to avoid unexpected coupling between layers.
Understanding the difference between copies and references makes debugging easier and helps prevent subtle state-related bugs. Python’s object model is simple once grasped: names bind to objects, and those objects may or may not be mutable. However, names themselves are always just references.
Chapter 4: Operators and Expressions
Operators are the symbols that let Python combine values, compare objects, and control logic. Together with expressions, they form the language’s working grammar—the way ideas are turned into action. Understanding how operators behave is key to writing code that is both clear and efficient.
Every operation in Python produces a value, even those that seem to exist only for their effect. Expressions combine values and operators to compute results, while statements often use those results to drive control flow or perform actions. This blend of simplicity and precision is what makes Python both expressive and readable.
Python’s operators are grouped by purpose: arithmetic, comparison, logical, bitwise, membership, and identity. Each follows predictable rules and precedences, meaning that once you learn the pattern for one category, the rest fall naturally into place. Parentheses can always be used to make the intended order explicit.
Unlike many languages, Python’s operators work with a wide range of object types. The same symbol can mean numeric addition for numbers, concatenation for strings, or merging for lists. This flexibility comes from special methods like __add__() and __eq__(), which define how objects respond to operators.
This chapter explores how expressions are built, how precedence and associativity control evaluation order, and how Python’s operators provide a compact yet expressive vocabulary for computation, comparison, and logic. Mastering them will help you write code that communicates clearly—both to the interpreter and to other readers.
Arithmetic, comparison, and logical operators
Operators are at the heart of expression syntax in Python. They define how values combine, compare, and interact. The most common groups are arithmetic operators (for mathematical computation), comparison operators (for testing relationships), and logical operators (for combining conditions). Each group has clear, predictable behaviour that applies across many object types.
Arithmetic operators
Arithmetic operators perform mathematical operations on numeric types and, where appropriate, on sequences like strings or lists. Python uses familiar symbols for these operations:
| Operator | Description |
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | True division (always returns a float) |
// | Floor division (rounds down to the nearest integer) |
% | Modulus (remainder) |
** | Exponentiation |
a = 10
b = 3
print(a + b) # 13
print(a - b) # 7
print(a * b) # 30
print(a / b) # 3.333...
print(a // b) # 3
print(a % b) # 1
print(a ** b) # 1000
Arithmetic operators can also act on compatible sequence types. For instance, "a" * 3 repeats a string, and [1, 2] + [3] concatenates lists. These operations follow the same symbolic pattern as numeric arithmetic but apply meaningfully to collections.
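For instance:

```python
print("ab" * 3)         # ababab  (string repetition)
print([1, 2] + [3])     # [1, 2, 3]  (list concatenation)
print((1,) * 2 + (2,))  # (1, 1, 2)  (tuples support both)
```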
The / operator always produces a float, even if both operands are integers. Use // when you need integer division that discards the remainder.
Comparison operators
Comparison operators evaluate relationships between two values, producing True or False. They are essential for control flow and condition testing:
| Operator | Description |
== | Equal to |
!= | Not equal to |
< | Less than |
<= | Less than or equal to |
> | Greater than |
>= | Greater than or equal to |
x = 5
y = 8
print(x == y) # False
print(x != y) # True
print(x < y) # True
Comparisons can be chained for readability, following standard mathematical style:
age = 25
print(18 <= age < 30) # True
Python evaluates such chains without repeating the variable, and stops as soon as a comparison fails. This keeps conditionals concise and expressive.
Logical operators
Logical operators combine or invert Boolean expressions. They control the flow of logic in conditionals and loops:
| Operator | Description |
and | Returns True if both operands are true |
or | Returns True if either operand is true |
not | Inverts the truth value |
logged_in = True
is_admin = False
if logged_in and not is_admin:
    print("User access granted")
Unlike purely Boolean operators in other languages, Python’s and and or return one of their operands rather than a strict Boolean value. This behaviour supports common idioms:
name = "" or "Anonymous" # evaluates to "Anonymous"
status = "active" and 1 # evaluates to 1
If the first part of an and expression is false, or the first part of an or expression is true, Python skips the rest because the outcome is already determined. This is known as short-circuit evaluation.
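Short-circuiting can be observed directly. In this sketch, the check() helper (illustrative, not a built-in) prints a trace so we can see which operands are actually evaluated:

```python
def check(label, value):
    """Print a trace so we can see which operands are evaluated."""
    print("evaluating", label)
    return value

# The right operand is skipped: once the left is False,
# the `and` result is already determined
result = check("left", False) and check("right", True)
print(result)   # False; only "evaluating left" was printed

# Likewise for `or` when the first operand is truthy
result = check("left", True) or check("right", False)
print(result)   # True
```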
Arithmetic, comparison, and logical operators together define how Python computes, tests, and decides. They form the essential vocabulary for writing expressions that describe both numerical and logical relationships within a program.
Identity and membership
Beyond arithmetic and logic, Python provides two small but powerful families of operators that express relationships of a different kind: identity and membership. They do not compare values in the usual sense—they ask whether something is a specific object, or whether it belongs to a collection.
Identity operators
Identity operators test whether two names refer to the same object in memory. They are especially important when dealing with mutable objects or special singletons such as None. The operators are:
| Operator | Description |
is | True if both references point to the same object |
is not | True if they point to different objects |
a = [1, 2, 3]
b = a
c = [1, 2, 3]
print(a is b) # True (same object)
print(a is c) # False (different object, same content)
print(a == c) # True (values are equal)
The difference between is and == is subtle but essential. The == operator compares the values of two objects, while is checks whether they are literally the same object in memory. Most of the time, equality is what you want, but is is correct when testing for None or other singletons.
Use is None and is not None for null checks. This is both faster and clearer than equality comparison, since there is only one None object in any Python process.
Membership operators
Membership operators check whether an element exists within a sequence, set, or mapping. They read naturally, almost like plain English, and return True or False depending on the presence of the item.
| Operator | Description |
in | True if the element is found |
not in | True if the element is not found |
letters = ["a", "b", "c"]
print("a" in letters) # True
print("z" not in letters) # True
For dictionaries, the membership test applies to keys, not values. To check whether a value exists, use the .values() method explicitly.
person = {"name": "Ada", "age": 36}
print("name" in person) # True (key check)
print("Ada" in person.values()) # True (value check)
Membership testing is highly efficient for sets and dictionaries because these types use hash tables for storage. For lists, tuples, and strings, the operation scans sequentially, so its cost grows with size.
Remember that in checks membership, not identity. It asks “is this element contained here?”, not “is this the same object?”.
Together, identity and membership complete Python’s expressive set of relational operators. They let code describe relationships between objects and collections with precision that reads as naturally as English, which is one of the hallmarks of Python’s design philosophy.
Bitwise operators
Bitwise operators act directly on the binary representation of integers. Instead of working with whole numbers as quantities, they manipulate the individual bits that make those numbers up. These operators are common in low-level programming, networking, graphics, and performance-critical code where compact control of data is important.
Bitwise operators work only on integers; applying them to floats or other types raises a TypeError. Always ensure operands are integers or explicitly convert them before use.
Each integer in Python is stored in binary form, using ones and zeros. Bitwise operators perform logical operations on those bits position by position:
| Operator | Description |
& | Bitwise AND (1 only if both bits are 1) |
| | Bitwise OR (1 if either bit is 1) |
^ | Bitwise XOR (1 if bits differ) |
~ | Bitwise NOT (inverts all bits) |
<< | Left shift (moves bits left, filling with zeros) |
>> | Right shift (moves bits right, discarding shifted bits) |
a = 6 # binary 110
b = 3 # binary 011
print(a & b) # 2 (010)
print(a | b) # 7 (111)
print(a ^ b) # 5 (101)
print(~a) # -7 (bitwise inversion)
print(a << 1) # 12 (shift left)
print(a >> 1) # 3 (shift right)
Left and right shifts effectively multiply or divide integers by powers of two, which can be useful in optimisation or when working with binary data streams. For example, n << 3 is the same as n * 8. Bitwise operations can be visualised more easily by converting numbers to binary strings using bin():
print(bin(6)) # 0b110
print(bin(3)) # 0b11
print(bin(6 & 3)) # 0b10
Although Python integers can grow arbitrarily large, bitwise operators still behave as if working on an infinite string of bits in two’s complement form. This explains why ~6 becomes -7: the inversion flips all bits and changes the sign.
For example, value & 0xFF isolates the lowest 8 bits of value.
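A small sketch of masking and flag manipulation, two of the most common practical uses (the flag names are illustrative):

```python
value = 0x1234
print(hex(value & 0xFF))    # 0x34  (lowest 8 bits)
print(hex(value >> 8))      # 0x12  (remaining high bits of this value)

# Bit flags: set, clear, and test individual bits
READ, WRITE, EXEC = 0b001, 0b010, 0b100
perms = READ | WRITE        # set two flags
perms &= ~EXEC              # ensure the EXEC flag is cleared
print(bool(perms & WRITE))  # True
print(bool(perms & EXEC))   # False
```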
Even though bitwise logic may seem niche in everyday Python programming, it reveals the level beneath the surface—the raw binary patterns that all computation rests on. Knowing how these operators behave gives you precise control when you need it, and a deeper understanding of how Python represents and manipulates data at its most fundamental level.
Assignment expressions
Until Python 3.8, assignment was always a statement, not an expression. That meant you could not assign a value as part of another expression. You had to do it in a separate line. The assignment expression operator, written as := and often called the walrus operator, changes this. It allows you to assign a value to a name as part of a larger expression, returning that value at the same time. This makes certain patterns more compact and expressive, especially in loops and conditionals where a value needs to be both tested and reused.
# without the walrus operator
line = input("Enter text: ")
while line != "":
    print("You said:", line)
    line = input("Enter text: ")
# with the walrus operator
while (line := input("Enter text: ")) != "":
    print("You said:", line)
Here, the input is both captured and tested in a single expression. The operator assigns the result of input() to line and also returns that result for comparison. The parentheses are required because the operator has lower precedence than most comparisons and arithmetic operations.
Assignment expressions are most useful when a value would otherwise need to be computed twice, or when combining assignment and condition logic improves clarity. They should be used sparingly and only when the result reads naturally.
if (n := len(items)) > 0:
    print(f"{n} items found.")
Assignment expressions make Python slightly more expressive without compromising its simplicity. They enable concise patterns where a value is both needed and tested, bridging the small gap between statements and expressions in the language’s design.
Operator precedence and associativity
When an expression contains several operators, Python must decide which parts to evaluate first. This order is called operator precedence. Operators with higher precedence bind more tightly, meaning they are evaluated before those with lower precedence. When operators share the same precedence level, associativity determines whether evaluation proceeds from left to right or right to left.
Parentheses can always be used to make order explicit, and doing so is often best for clarity. But knowing the general precedence rules helps you read expressions with confidence and understand why Python produces a given result. The following list shows Python’s operator precedence from highest to lowest. Operators on the same line share the same level and are evaluated according to their associativity.
Highest to lowest precedence
| Operator | Description |
() | Grouping or function calls |
x[index], x[a:b], x.attr | Subscripts, slicing, attribute access |
** | Exponentiation (right-associative) |
+x, -x, ~x | Unary plus, minus, bitwise NOT |
*, /, //, % | Multiplication, division, floor division, modulus |
+, - | Addition, subtraction |
<<, >> | Bitwise shifts |
& | Bitwise AND |
^ | Bitwise XOR |
| | Bitwise OR |
in, not in, is, is not, <, <=, >, >=, ==, != | Comparisons, membership, identity |
not | Logical NOT |
and | Logical AND |
or | Logical OR |
if–else | Conditional expressions |
:= | Assignment expression (walrus) |
=, +=, -=, *=, /=, //=, %=, **=, &=, |=, ^=, <<=, >>= | Assignment operators (right-associative) |
yield, yield from | Yield expressions |
return, lambda | Return, lambda expressions |
Associativity determines how operators of equal precedence are grouped. Most Python operators are left-associative, meaning evaluation proceeds from left to right. Only exponentiation (**), assignment operators, and the walrus operator (:=) associate from right to left:
print(2 ** 3 ** 2) # 2 ** (3 ** 2) = 512
This precedence hierarchy defines how all of Python’s operators fit together. Understanding it turns what might look like a dense expression into a predictable and readable sequence of operations. In practice, parentheses and good naming will always keep your intent explicit, which is the real goal of Pythonic style.
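A few expressions where precedence quietly changes the result:

```python
print(2 + 3 * 4)          # 14, not 20: * binds tighter than +
print(-2 ** 2)            # -4: ** binds tighter than unary minus
print((-2) ** 2)          # 4: parentheses make the intent explicit
print(not True == False)  # True: parsed as not (True == False)
```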
Chapter 5: Strings in Depth
Strings are among the most important and versatile objects in Python. They are the primary way the language represents text, and they form the bridge between human meaning and program logic. Whether you are displaying messages, processing data, or formatting output, you will work with strings constantly. Understanding how they behave (especially their immutability, slicing semantics, and formatting tools) is essential for writing clear, reliable code.
At first glance, a string seems simple: a sequence of characters enclosed in quotes. But beneath that simplicity lies a carefully designed structure that supports Unicode, efficient storage, and a wide range of operations. Strings can be indexed, sliced, concatenated, compared, and iterated like any other sequence. They also provide a rich set of methods for searching, transforming, and analysing text.
In this chapter we look at how strings are written, quoted, and represented in code, including multi-line and raw strings, and how to access and manipulate substrings safely (and why strings cannot be changed in place). We also cover the standard toolkit for transforming and examining text using built-in methods and loops; how to interpolate values and control layout using modern and legacy formatting styles; and how Python stores and processes text from all writing systems, working efficiently with bytes and encodings.
Literals and escape sequences
Every string in Python begins as a literal, which is a piece of text written directly in your source code. String literals can be enclosed in single quotes ('...'), double quotes ("..."), or triple quotes ('''...''' or """..."""). The different forms exist for convenience and readability, not for meaning. All create the same kind of str object.
single = 'Hello'
double = "World"
triple = '''This can span
multiple lines.'''
Triple-quoted strings are especially useful for documentation and long text blocks, since they preserve line breaks and indentation exactly as written. They are also the standard way to create multi-line docstrings.
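For example, a triple-quoted string placed directly below a def line becomes that function’s docstring:

```python
def greet(name):
    """Return a greeting for the given name.

    The triple-quoted string directly below the def line is the
    function's docstring, available at runtime via help() or __doc__.
    """
    return f"Hello, {name}!"

print(greet("Ada"))                    # Hello, Ada!
print(greet.__doc__.splitlines()[0])   # Return a greeting for the given name.
```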
Strings can include almost any printable character directly, but certain characters, such as quotes, newlines, or tabs, need special handling. Python uses escape sequences to represent these. An escape sequence begins with a backslash (\) followed by one or more characters that indicate the desired symbol or control code.
text = "Line one\nLine two\tTabbed"
print(text)
# Output:
# Line one
# Line two Tabbed
Here, \n represents a newline and \t represents a tab. These sequences are interpreted at runtime, not stored literally, so the string’s internal value contains the actual control characters.
Common escape sequences
| Sequence | Meaning |
\\ | Backslash |
\' | Single quote |
\" | Double quote |
\n | Newline |
\t | Horizontal tab |
\r | Carriage return |
\b | Backspace |
\f | Form feed |
\uXXXX | Unicode character (4 hex digits) |
\UXXXXXXXX | Unicode character (8 hex digits) |
\xXX | Character with hex value XX |
\N{name} | Unicode character by name |
These escapes make it possible to include any Unicode character in a string, even ones that cannot be typed directly. For instance, "\u03C0" produces the Greek letter π, and "\N{EM DASH}" inserts an em dash.
print("\u03C0 =", 3.14159) # π = 3.14159
An unrecognised escape such as "\q" is not a syntax error in current Python: the backslash and the q are both kept literally. However, recent versions emit a SyntaxWarning (older ones a DeprecationWarning) for such escapes, and future versions are expected to make them errors, so avoid relying on this behaviour.
Raw strings
Sometimes you want a string to contain backslashes literally, without Python interpreting them as escapes. This is common in regular expressions, Windows file paths, and other patterns that use many backslashes. To prevent escape processing, prefix the literal with r or R to create a raw string:
path = r"C:\Users\Robin\Documents"
print(path) # C:\Users\Robin\Documents
In a raw string, the backslash is treated as an ordinary character. The only exception is that a raw string cannot end with a single backslash, because it would escape the closing quote.
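The trailing-backslash limitation, and a common workaround using adjacent-literal concatenation:

```python
# path = r"C:\folder\"     # SyntaxError: raw string cannot end in one backslash
path = r"C:\folder" "\\"   # append an ordinary escaped backslash instead
print(path)                # C:\folder\
```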
Don’t use raw strings when you need escape sequences like \n or \t to be processed; inside a raw string they remain two literal characters.
String concatenation and repetition
Adjacent string literals written together are automatically concatenated by the compiler, even without a + operator. This can help split long strings across lines neatly:
message = (
"Python strings "
"can join automatically "
"when written this way."
)
print(message)
Strings can also be concatenated or repeated explicitly using operators:
a = "Py"
b = "thon"
print(a + b) # Python
print(a * 3) # PyPyPy
Concatenation creates a new string, as all strings are immutable. The repetition operator (*) is a simple and efficient way to produce repeated patterns or padding.
Concatenating a string with a non-string value using + will raise a TypeError. Convert non-string values explicitly with str() before concatenation.
Together, literals, escapes, and raw forms give Python’s strings both expressive range and precise control. They allow text to be written naturally in code while preserving full compatibility with every character Python’s Unicode engine can represent.
Indexing, slicing, and immutability
Indexing, slicing, and immutability together define how Python strings behave as ordered, unchanging sequences. You can retrieve and combine parts freely, but the sequence itself remains stable and reliable throughout the life of your program.
Strings in Python are sequences, which means each character occupies a specific position that can be accessed directly. Every string has a defined order starting from index 0 for the first character and increasing by one for each character that follows. Negative indices count backward from the end, with -1 referring to the last character, -2 to the second-last, and so on.
word = "Python"
print(word[0]) # P
print(word[5]) # n
print(word[-1]) # n
print(word[-2]) # o
Indexing retrieves a single character as a one-character string. Attempting to access an index beyond the valid range raises an IndexError, so it is good practice to confirm length with len() when needed.
name = "Rachel"
print(len(name)) # 6
Slicing
Slicing extracts a portion of a string by specifying a range of indices in square brackets. The syntax s[start:end] returns the substring beginning at start and continuing up to but not including end. Either bound may be omitted, in which case it defaults to the beginning or end of the string.
text = "Pythonic"
print(text[2:5]) # tho
print(text[:2]) # Py
print(text[3:]) # nic
Slices can also include a third value, the step, which controls the interval between selected characters. A step of 2 takes every second character, and a negative step reverses the direction.
text = "Acrobat"
print(text[::2]) # Arbt
print(text[::-1]) # taborcA
These operations always return a new string. They never alter the original, because strings are immutable. This means that once a string has been created its contents cannot change.
Immutability
Attempting to assign a new value to an individual character or slice causes an error because strings do not support modification in place. The correct approach is to build a new string instead.
word = "Fable"
# word[0] = "T" # TypeError: 'str' object does not support item assignment
new_word = "T" + word[1:]
print(new_word) # Table
Although this may seem restrictive it actually simplifies reasoning about code. Because a string’s contents never change, you can safely pass it between functions without worrying that it will be altered elsewhere. When a transformation is needed, Python constructs a new string and binds the result to a name, leaving the original untouched.
String methods and iteration
Strings come with a large set of built-in methods for inspecting, searching, and transforming text. Because strings are immutable, each method that appears to modify a string actually returns a new one, leaving the original unchanged. These methods make it easy to clean data, adjust case, find patterns, and perform substitutions without writing additional loops or functions.
String methods and iteration form a complete toolkit for everyday text handling in Python. They allow you to clean, transform, and examine text clearly and efficiently, using constructs that read almost like natural language.
Basic transformations
The most common methods deal with the overall shape of the text—changing case, removing whitespace, or joining and splitting substrings.
name = " Ada Lovelace "
print(name.strip()) # remove leading and trailing spaces
print(name.upper()) # ADA LOVELACE
print(name.lower()) # ada lovelace
print(name.title()) # Ada Lovelace
strip() removes whitespace from both ends, while lstrip() and rstrip() remove it from only one side. Case conversion methods (upper(), lower(), title(), capitalize(), swapcase()) all return new strings with the desired transformation.
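The one-sided and less common variants in action (repr() makes the remaining spaces visible):

```python
s = "  spaced  "
print(repr(s.lstrip()))   # 'spaced  '  (left side only)
print(repr(s.rstrip()))   # '  spaced'  (right side only)
print("mIxEd".swapcase()) # MiXeD
print("once upon a time".capitalize())  # Once upon a time
```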
Searching and replacing
Python provides simple, readable methods for locating and substituting text within a string. These include find(), replace(), startswith(), and endswith().
text = "data-driven design"
print(text.find("data")) # 0
print(text.replace("data", "user")) # user-driven design
print(text.startswith("data")) # True
print(text.endswith("sign")) # True
find() returns the index of the first occurrence or -1 if not found. replace() produces a new string with all matches replaced. These methods handle the most common search operations without requiring explicit loops.
Use the in operator when you only need to check for the presence of a substring, since it reads naturally and is faster than find() for simple checks.
"drive" in text # True
Splitting and joining
Two of the most powerful and frequently used string methods are split() and join(). split() breaks a string into a list of parts, while join() combines a list of strings into one, using the string it is called on as a separator.
sentence = "one two three"
words = sentence.split()
print(words) # ['one', 'two', 'three']
joined = "-".join(words)
print(joined) # one-two-three
By default, split() uses any whitespace as a delimiter, but you can specify another separator. The reverse method join() is the preferred way to build strings from lists because it is both efficient and expressive.
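Splitting with an explicit separator, for instance on comma-separated text:

```python
row = "Ada,36,engineer"
fields = row.split(",")
print(fields)             # ['Ada', '36', 'engineer']

# A maxsplit limit keeps the remainder as one piece
print("a b c d".split(" ", 2))  # ['a', 'b', 'c d']

# join() accepts only strings, so convert other values first
print(", ".join(str(n) for n in [1, 2, 3]))  # 1, 2, 3
```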
Character tests
Several methods return Boolean values based on the content of the string. These methods are useful for validation, filtering, and user input checks.
| Method | Meaning |
isalpha() | Only letters |
isdigit() | Only digits |
isalnum() | Letters or digits |
isspace() | Only whitespace |
isupper() | All uppercase |
islower() | All lowercase |
code = "ABC123"
print(code.isalnum()) # True
print(code.isalpha()) # False
These checks are often used in conditions and comprehensions to clean or validate text input.
Iteration over strings
Because strings are sequences, they can be iterated character by character in a for loop or comprehension. Each iteration yields a one-character string.
for ch in "ABC":
    print(ch)
This property lets you use strings naturally with many Python constructs that expect iterable objects, including sum(), max(), min(), sorted(), and generator expressions.
letters = "vector"
print(sorted(letters)) # ['c', 'e', 'o', 'r', 't', 'v']
When building a string from many pieces, use join() rather than manual concatenation inside loops. This approach is faster and uses less memory.
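A sketch contrasting the two approaches:

```python
words = ["alpha", "beta", "gamma"]

# Repeated += builds a brand-new string on every iteration,
# which can become quadratic for large inputs
slow = ""
for w in words:
    slow += w + " "

# Collect pieces, then join once: a single linear pass
fast = " ".join(words)
print(fast)   # alpha beta gamma
```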
Iteration also supports slicing and membership tests, which combine to make string processing in Python both concise and intuitive.
f-strings and formatting techniques
Formatting strings is one of the most practical skills in Python. It allows you to combine variables and text into readable messages, structured reports, or dynamic output. Over time, Python has introduced several ways to do this. The modern and most preferred method is the formatted string literal, or f-string, introduced in version 3.6. Older techniques, such as the str.format() method and the percent (%) operator, still exist for compatibility and are sometimes useful in specialised cases.
f-strings
An f-string is a string literal prefixed with f (or F). Inside it, you can place expressions within curly braces, and Python will evaluate them at runtime, inserting the resulting values into the string. The syntax is compact, fast, and easy to read.
name = "Ada"
language = "Python"
print(f"{name} loves {language}.")
# Ada loves Python.
Anything inside the braces can be a valid Python expression, including calculations or function calls.
x = 7
y = 3
print(f"{x} + {y} = {x + y}")
# 7 + 3 = 10
You can also apply formatting options directly within the braces using a colon followed by a format specifier. This controls how numbers, dates, and other values appear.
pi = 3.14159265
print(f"{pi:.2f}") # 3.14
print(f"{pi:10.3f}") # field width 10, 3 decimals
Formatting codes resemble those from the older format() method and give precise control over alignment, width, precision, and fill characters.
| Specifier | Meaning |
f | Fixed-point decimal |
.nf | Display with n decimal places |
> | Right-align in a field |
< | Left-align in a field |
^ | Center-align |
0 | Pad with zeros |
, | Use commas as thousands separators |
amount = 12345.678
print(f"{amount:,.2f}") # 12,345.68
print(f"{amount:^15.1f}") # centered in 15-character field
str.format() method
Before f-strings, Python’s main approach to formatting was the format() method. It uses braces as placeholders inside the string, which are replaced by positional or keyword arguments. Although slightly more verbose than f-strings, it remains powerful and widely used in libraries and templates.
template = "Hello, {}. Welcome to {}."
print(template.format("Ada", "Python"))
# Hello, Ada. Welcome to Python.
You can reference arguments by position or name, giving flexibility when reusing templates.
print("{1} and {0}".format("bread", "butter")) # butter and bread
print("{name} scored {score}".format(name="Alan", score=95))
Formatting specifiers can also be added after a colon, just like with f-strings.
print("Value: {:.3f}".format(3.14159)) # Value: 3.142
Percent formatting
The oldest style of formatting in Python uses the percent operator (%) much like C’s printf(). It still works and can be useful when working with legacy code or porting examples from older texts.
name = "Ada"
age = 36
print("Name: %s, Age: %d" % (name, age))
Here, %s formats a string, %d an integer, and %f a floating point number. Although concise, this method is less flexible and more error-prone than modern alternatives, as mismatched placeholders and arguments can cause runtime errors.
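Percent placeholders accept width and precision modifiers much like the newer styles; a brief sketch:

```python
pi = 3.14159

print("%8.2f" % pi)    # '    3.14' (width 8, 2 decimal places)
print("%-8.2f|" % pi)  # '3.14    |' (left-aligned within width 8)
print("%05d" % 42)     # '00042' (zero-padded integer)
```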
Choosing a formatting style
All three systems ultimately serve the same goal: combining text with dynamic values. f-strings are usually best for inline expressions and readability. The format() method works well when templates must be reused or passed as data. Percent formatting remains available for older codebases. The important thing is to be consistent within a project so readers can understand at a glance how values are being inserted.
Encoding and Unicode handling
Every string in modern Python represents a sequence of Unicode characters. This design allows Python to handle text from virtually any language or symbol set without loss of information. Unicode is a universal standard that assigns every character (letters, digits, punctuation, emojis, and more) a unique number called a code point. Python’s str type stores these code points internally, independent of how they are represented in memory or on disk.
text = "café"
print(text)
print(len(text)) # 4
print(ord("é")) # 233 (Unicode code point)
Here, ord() returns the numeric code point for a character, while chr() does the reverse: turning a code point number back into a character.
print(chr(0x03C0)) # π
This separation between logical characters and their binary representation makes Python text handling simple and consistent across platforms. However, when text is read from or written to files, networks, or external systems, it must be converted to or from a specific encoding. Encoding defines how Unicode code points are stored as bytes.
Common encodings
The most widely used encoding today is UTF-8 (Unicode Transformation Format, 8-bit). It can represent every Unicode character using one to four bytes and is backward compatible with ASCII. Most Python environments, editors, and web standards use UTF-8 by default.
message = "Hello π"
data = message.encode("utf-8") # str → bytes
print(data) # b'Hello \xcf\x80'
decoded = data.decode("utf-8") # bytes → str
print(decoded) # Hello π
The encode() method converts a string to a bytes object using the chosen encoding, while decode() reverses the process. These conversions are essential when working with files, network sockets, or APIs that expect binary data.
Passing encoding="utf-8" explicitly when opening files ensures predictable results everywhere:
with open("example.txt", "w", encoding="utf-8") as f:
  f.write("naïve café")
Python’s file I/O automatically encodes and decodes text when an encoding is specified. If you omit it, Python uses the platform default, which can differ between operating systems.
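Reading the text back uses the same parameter. A short sketch that writes a file and re-reads it (the filename is illustrative):

```python
# Write with an explicit encoding so the bytes on disk are unambiguous...
with open("example.txt", "w", encoding="utf-8") as f:
  f.write("naïve café")

# ...and read them back with the same encoding.
with open("example.txt", "r", encoding="utf-8") as f:
  restored = f.read()

print(restored)  # naïve café
```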
Bytes versus strings
Bytes and strings may look similar but represent different kinds of data. A string (str) is a sequence of Unicode characters. A bytes object (bytes) is a sequence of raw 8-bit values. The two are distinct types and cannot be combined directly.
# invalid: b"data" + "text" → TypeError
To combine or compare them, you must convert one to the other using encode() or decode(). This explicitness avoids ambiguity about how text should be represented in binary form.
b = b"cafe"
print(b.decode("utf-8")) # cafe
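To combine the two types, convert explicitly in whichever direction fits the context; a brief sketch:

```python
prefix = b"id:"  # raw bytes, as they might arrive from a socket
label = "café"   # Unicode text

# Decode the bytes to text before joining...
as_text = prefix.decode("utf-8") + label
print(as_text)   # id:café

# ...or encode the text to bytes instead.
as_bytes = prefix + label.encode("utf-8")
print(as_bytes)  # b'id:caf\xc3\xa9'
```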
Unicode escapes and special characters
Unicode characters can also be represented directly in string literals using escape sequences such as \u and \U. These sequences make it possible to include symbols that are not easily typed or visible.
greek = "\u03B1\u03B2\u03B3" # αβγ
emoji = "\U0001F600" # 😀
print(greek, emoji)
Python interprets these escapes at parse time, converting them into their proper Unicode characters. This ensures that strings behave consistently regardless of how they are entered.
Error handling during encoding
When encoding or decoding fails (usually because a byte sequence is invalid or a character cannot be represented) you can control the behaviour with the errors parameter. Common strategies include 'ignore', 'replace', and 'backslashreplace'.
text = "Spicy jalapeño"
print(text.encode("ascii", errors="ignore")) # b'Spicy jalapeno'
print(text.encode("ascii", errors="replace")) # b'Spicy jalape?o'
print(text.encode("ascii", errors="backslashreplace"))
# b'Spicy jalape\\xf1o'
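The same errors parameter works in the other direction, when decoding bytes that are not valid in the chosen encoding; a brief sketch:

```python
data = b"caf\xff"  # \xff is not valid UTF-8, so strict decoding would fail

print(data.decode("utf-8", errors="replace"))          # caf� (replacement character)
print(data.decode("utf-8", errors="ignore"))           # caf
print(data.decode("utf-8", errors="backslashreplace")) # caf\xff
```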
Inspecting encodings
The sys module reveals the default system encoding used by Python. Although it is usually UTF-8 in modern versions, it can vary on older systems or specialised environments.
import sys
print(sys.getdefaultencoding()) # typically 'utf-8'
Understanding encodings ensures that text input and output behave consistently across files, terminals, and network interfaces. With Unicode at its core, Python lets you handle any form of written language or symbol safely and predictably.
Encoding and Unicode handling complete Python’s text model. Together with literals, slicing, and formatting, they allow programs to represent and manipulate human language accurately, from simple ASCII to the entire range of global writing systems and symbols.
Chapter 6: Flow Control
Every program is a story told through execution. Flow control determines how that story unfolds—what happens first, what happens next, and under what conditions different paths are taken. It defines the logic that drives computation from one line of code to another. Without flow control, a program would simply run from top to bottom without choice, repetition, or direction.
All programming languages provide ways to shape this flow. The most common are conditional statements (which choose between alternatives), loops (which repeat actions while a condition remains true), and control keywords (which alter or exit those structures). Together, they give programs their rhythm and responsiveness, allowing them to make decisions, react to data, and adapt to changing circumstances.
Python’s approach to flow control is clear and minimal. Where some languages use punctuation or braces, Python uses indentation to group related code blocks. This makes the structure of decisions and loops visible at a glance, and enforces consistency that prevents many common errors. The syntax reads almost like natural language, with keywords such as if, for, and while forming the backbone of logical expression.
This chapter explores using if, elif, and else to make decisions based on conditions, controlling repetition with for and while structures, and using break, continue, and pass to manage loop execution precisely. Also covered are expressing loops and conditional logic compactly in a single expression, and understanding how try, except, and finally provide structured handling of unexpected events. Together, these features let you write programs that think and react. Flow control is the difference between static code (instructions that merely exist) and living logic (instructions that decide).
Conditional constructs
Conditionals allow a program to make choices. They control which blocks of code run based on whether an expression evaluates to true or false. In Python, conditional logic is expressed with the keywords if, elif, and else. Each introduces a block of code whose execution depends on a Boolean test. This pattern is one of the cornerstones of all programming: it lets your code respond to data rather than follow a single fixed path.
The simplest form uses only if. If the condition evaluates to true, the indented block that follows is executed; otherwise, it is skipped.
temperature = 30
if temperature > 25:
  print("It’s warm today.")
If the condition is false, nothing happens and execution continues after the block. You can extend the structure with an else clause to provide an alternative branch that runs when the condition is not met.
temperature = 18
if temperature >= 25:
  print("It’s warm.")
else:
  print("It’s cool.")
Python uses indentation to group statements, so there are no braces or keywords to mark where the blocks begin or end. Consistent indentation (two spaces in this book for narrow-screen readability) is essential, since it defines the structure of your logic directly.
elif chains
When you need to test multiple conditions in sequence, use elif (short for “else if”). Each test runs only if all previous conditions were false. The first condition that evaluates to true decides which block executes, and the rest are ignored.
score = 72
if score >= 90:
  grade = "A"
elif score >= 80:
  grade = "B"
elif score >= 70:
  grade = "C"
else:
  grade = "D"
print("Grade:", grade)
This structure reads naturally, and Python’s indentation makes the branching hierarchy visually clear. Only one block in the chain will ever run, even if later conditions would also be true.
Truth values and implicit tests
Any expression can serve as a condition, not just comparisons. Python determines truth or falsity using its truthiness rules. Nonzero numbers, nonempty sequences, and most objects evaluate as true; zero, empty collections, and None evaluate as false. This allows concise, readable conditions.
name = ""
if not name:
  print("Name is missing.")
Here, the empty string counts as false, so the condition succeeds. You can use the same logic with lists, dictionaries, or any other container type.
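The same implicit test applies to any container; a short sketch:

```python
items = []
settings = {"theme": "dark"}

if not items:
  print("No items to process.")  # runs: an empty list is falsy

if settings:
  print("Settings loaded.")      # runs: a non-empty dict is truthy
```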
Nesting and readability
Conditionals can be nested to handle more complex logic. However, deep nesting quickly reduces clarity, so it’s often better to combine conditions or use elif chains instead.
user = "admin"
logged_in = True
if logged_in:
  if user == "admin":
    print("Access granted.")
This can be written more clearly as:
if logged_in and user == "admin":
  print("Access granted.")
Prefer combining tests with the Boolean operators (and, or, not) instead of nesting multiple if statements. This keeps your control flow shallow and expressive.
Python encourages clarity over cleverness. Simple, flat logic is easier to read and maintain, especially when the indentation itself represents program structure.
Ternary expressions
Sometimes you need to choose between two values within a single line of code rather than writing a full if statement. Python provides a compact syntax for this called a conditional expression, often referred to as a ternary expression because it involves three parts: a value if the condition is true, the condition itself, and a value if the condition is false.
status = "open" if is_active else "closed"
This expression reads naturally: if is_active is true, status becomes "open"; otherwise, it becomes "closed". It performs the same logical test as a full if/else block, but it produces a value instead of controlling a block of statements.
# equivalent long form
if is_active:
  status = "open"
else:
  status = "closed"
In C-style languages, the equivalent operator is written condition ? true_value : false_value. Python’s version reverses the order for readability (true_value if condition else false_value) and treats it as an expression rather than a separate operator. This makes it read more like plain English.
Conditional expressions are especially useful inside assignments, return statements, and function arguments, where you want to select one of two results based on a simple condition without breaking the flow of the surrounding code.
def absolute_value(x):
  return x if x >= 0 else -x
print(absolute_value(-5)) # 5
The structure of a ternary expression is:
value_if_true if condition else value_if_false
Only one of the two values is evaluated at runtime, depending on whether the condition is true or false. This makes it efficient and safe even when the expressions involve function calls or computations.
def get_data():
  print("Fetching data...")
  return [1, 2, 3]
use_cache = True
data = ["cached"] if use_cache else get_data()
print(data) # ['cached']
Here, get_data() is never called because the condition is true, so the expression short-circuits exactly like a normal if statement would.
Keep ternary expressions short. When the condition or either branch grows complex, use a full if/elif/else block instead.
Pattern matching
Python 3.10 introduced structural pattern matching, a powerful new way to handle conditional logic. It extends the idea of if/elif branching by allowing entire data structures to be compared against patterns, not just single values. The match and case keywords together form a control structure that can dispatch logic based on both the shape and content of an object.
This idea is similar to switch statements found in languages such as C or JavaScript, but Python’s version is far more expressive. It lets you match against values, types, sequence layouts, and even partial structures. It is particularly useful for processing data that arrives in structured forms such as tuples, dictionaries, or objects.
Basic pattern matching
A simple match block checks a single subject value against multiple case patterns. The first pattern that matches is executed, and the rest are skipped.
command = "start"
match command:
  case "start":
    print("System starting...")
  case "stop":
    print("System stopping...")
  case "pause":
    print("System pausing...")
  case _:
    print("Unknown command.")
The underscore (_) acts as a wildcard pattern that matches anything, similar to a default branch in other languages. This ensures that all possibilities are handled, even unexpected ones.
Matching structures
Patterns can look inside compound objects such as tuples or lists. The number and arrangement of elements must match for the pattern to succeed.
point = (3, 4)
match point:
  case (0, 0):
    print("Origin")
  case (x, 0):
    print(f"On the x-axis at {x}")
  case (0, y):
    print(f"On the y-axis at {y}")
  case (x, y):
    print(f"Point at ({x}, {y})")
Here, variables like x and y capture corresponding values from the matched structure. Each successful match can both test and unpack data in a single step, reducing the need for nested if statements or separate assignments.
Pattern matching is not a wholesale replacement for if/elif logic. For simple comparisons or short decisions, ordinary conditionals are clearer. Use match when you are selecting behaviour based on structured data, not just a single variable.
Guards (conditional matches)
You can refine a pattern with an additional condition by using the if keyword after a case. This is known as a guard, and it allows more specific tests once a pattern shape has matched.
number = 7
match number:
  case n if n < 0:
    print("Negative")
  case n if n == 0:
    print("Zero")
  case n if n % 2 == 0:
    print("Even")
  case _:
    print("Odd")
Guards are evaluated only if the pattern itself matches. This gives precise control over flow while keeping code compact and readable.
Matching data types and classes
Pattern matching can also inspect class instances by matching on their type and attributes. This works with any class that defines a suitable __match_args__ sequence or uses keyword patterns.
class Point:
  # __match_args__ lets positional patterns like Point(x, y) map to attributes
  __match_args__ = ("x", "y")

  def __init__(self, x, y):
    self.x = x
    self.y = y
p = Point(1, 2)
match p:
  case Point(x=0, y=0):
    print("Origin")
  case Point(x, y):
    print(f"Coordinates: {x}, {y}")
  case _:
    print("Not a point")
When used carefully, pattern matching can replace long chains of if/elif statements with clear, data-driven logic. It becomes especially powerful when parsing structured input, responding to messages, or analysing trees of nested data such as JSON.
The wildcard pattern (_) is special: it matches any value but does not bind it to a name. Always include it as a final case when a match may encounter unexpected input.
Loop control and keywords
Loops let a program repeat actions, but real programs often need to adjust how those loops behave. Python provides three simple keywords (pass, continue, and break) to give fine control over loop execution. These are small tools, but they make iteration predictable and expressive.
pass
pass is the simplest of the three. It does nothing at all. It exists as a placeholder for code that will be written later, or for a branch that intentionally performs no action. Python requires at least one statement inside an indented block, so pass is often used to satisfy that rule without changing behaviour.
for item in []:
  pass # nothing to process yet
You can also use pass in function or class definitions when you want to outline structure before filling in details.
def placeholder_function():
  pass

class Empty:
  pass
Because pass executes cleanly and leaves no trace, it is useful during early development or when defining minimal blocks for later expansion.
continue
continue skips the rest of the current loop iteration and moves directly to the next cycle. Any statements that follow it within the loop body are ignored for that iteration.
for number in range(6):
  if number % 2 == 0:
    continue
  print(number)
This loop prints only the odd numbers because continue causes Python to skip printing when the number is even. The loop itself keeps running, starting the next iteration immediately after the skip.
continue works the same way in while loops, letting you bypass certain cases without ending the loop entirely.
n = 0
while n < 5:
  n += 1
  if n == 3:
    continue
  print(n)
break
break exits the current loop entirely, regardless of where the loop counter stands. Control passes to the first statement following the loop.
for number in range(10):
  if number == 5:
    break
  print(number)
Here, the loop stops as soon as number reaches five, printing only values 0 through 4. Once a break executes, the loop condition is no longer checked.
In nested loops, break only exits the innermost one. If you need to exit multiple levels, structure the code so that the outer loop’s condition also becomes false or raise an exception to signal a stop.
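One common way to exit several levels at once is to move the loops into a function and use return; a sketch with a hypothetical find_pair() helper:

```python
def find_pair(grid, target):
  # Return the (row, column) of the first cell equal to target, or None.
  for row_index, row in enumerate(grid):
    for col_index, value in enumerate(row):
      if value == target:
        return (row_index, col_index)  # leaves both loops at once
  return None

grid = [[1, 2], [3, 4]]
print(find_pair(grid, 3))  # (1, 0)
print(find_pair(grid, 9))  # None
```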
Using else with loops
Python allows a unique but often overlooked feature: an else clause attached to a loop. The else block runs only if the loop finishes normally without hitting a break statement.
for n in range(5):
  if n == 3:
    break
else:
  print("Loop completed without break.")
In this example, the else block is skipped because the loop was terminated by break. This construct is most useful when searching for a value. If the loop ends without finding it, the else section can handle the “not found” case.
for item in [1, 2, 3]:
  if item == 4:
    print("Found it.")
    break
else:
  print("Not found.")
Use break for early exit when the goal of a loop has been reached, continue to skip specific cases, and pass when a block must exist syntactically but should do nothing.
These three small keywords give you precision control over repetition. They allow loops to remain simple yet flexible, letting your code handle both expected and exceptional situations clearly and directly.
Basic exception handling
Even well-written programs encounter problems; files may be missing, input may be invalid, or a calculation may fail. In Python, such situations raise exceptions, special objects that signal that something went wrong during execution. If not handled, an exception stops the program immediately and prints a traceback describing the error. To prevent abrupt termination, Python provides structured tools to detect and respond to these conditions cleanly.
The try and except structure
The simplest way to handle an exception is with a try and except block. The statements inside the try section are executed normally. If an error occurs, control jumps to the except block instead of halting the program.
try:
  result = 10 / 0
  print("This line will not run.")
except ZeroDivisionError:
  print("Cannot divide by zero.")
When the division by zero triggers an exception, the program continues gracefully inside the except section. Without this structure, the program would have crashed. Only exceptions that match the named type (in this case ZeroDivisionError) are caught; others continue to propagate upward.
Catching multiple exceptions
You can handle several possible error types either by listing them in parentheses or by stacking multiple except blocks. This lets you respond appropriately to different failure modes.
try:
  value = int(input("Enter a number: "))
  result = 10 / value
except (ValueError, ZeroDivisionError):
  print("Invalid input or division by zero.")
Alternatively, you can handle each case separately:
try:
  value = int(input("Enter a number: "))
  result = 10 / value
except ValueError:
  print("You must enter a valid number.")
except ZeroDivisionError:
  print("You cannot divide by zero.")
This structure keeps your logic clear and targeted. Each except block addresses a specific risk, improving both readability and maintainability.
Using else and finally
A try block can include optional else and finally clauses. The else section runs only if no exception was raised, while finally always runs, whether or not an error occurred. These are useful for code that should execute after a safe operation, or for cleanup tasks that must happen regardless of outcome.
f = None
try:
  f = open("data.txt")
  content = f.read()
except FileNotFoundError:
  print("File not found.")
else:
  print("File read successfully.")
finally:
  if f is not None:
    f.close()
In this example, finally ensures that the file is closed even when an error occurs after it was opened. Initialising f to None matters: if open() itself fails, f is never assigned, and an unguarded f.close() in the finally block would raise a NameError of its own. Guarding the close this way guarantees that external resources such as files or network connections are released properly.
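In everyday code the same guarantee is more often obtained with a with statement, which closes the file automatically whether or not an exception occurs. A minimal sketch, assuming the file may be absent (the filename is illustrative):

```python
try:
  with open("no_such_file_for_this_example.txt", encoding="utf-8") as f:
    content = f.read()  # the file is closed automatically when this block ends
except FileNotFoundError:
  content = ""
  print("File not found.")
```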
General exception catching
To catch any kind of exception, you can use a bare except clause, though this should be reserved for situations where any error must be caught. More commonly, the built-in Exception base class is used, which covers most runtime errors without intercepting system-level events.
try:
  risky_operation()
except Exception as e:
  print("Something went wrong:", e)
Here, e holds the exception object itself, allowing you to print or log details about what failed. Catching all exceptions can be convenient, but it can also hide unexpected problems, so use it sparingly.
Avoid a bare except: without specifying the type. It can catch system signals and interrupt events, making programs harder to stop or debug. Always catch the narrowest category of errors that makes sense.
Why exception handling matters
Exception handling separates normal logic from error recovery, keeping code readable and predictable. It lets you deal with problems as part of the program’s flow instead of as catastrophic failures. This preview shows only the basics. Later in this book we will explore Python’s full exception model: how exceptions are defined, raised, chained, and managed in complex systems.
Handled properly, exceptions turn fragile programs into resilient ones. They allow software to fail gracefully, communicate clearly, and continue operating safely under unexpected conditions.
Chapter 7: Looping Constructs
In every programming language, looping is how we make the computer repeat work. It’s the bridge between static logic and dynamic behaviour. Where conditionals let a program make choices, loops let it persist by cycling through data, retrying tasks, or progressing until some condition is met. Without looping, every computation would end after a single step.
Python’s looping mechanisms are concise, expressive, and integrated tightly with the language’s object model. Instead of relying on counters and manual iteration, Python treats looping as a conversation between an iterable object and a loop construct. This design makes loops both powerful and readable, reflecting Python’s guiding philosophy that code should be clear about what it is doing, not just how.
Two primary forms of looping exist in Python: the for loop, which iterates directly over sequences and other iterable objects, and the while loop, which continues as long as a given condition remains true. Both use indentation to define their controlled blocks and can be combined with optional else clauses for post-loop actions. Supporting statements such as break, continue, and pass provide fine-grained control over loop execution.
In this chapter we explore these constructs in depth: how iteration works under the hood, how Python’s iterator protocol makes loops flexible and memory-efficient, and how to write clear looping logic that avoids off-by-one errors and infinite cycles. We’ll also look at practical examples, from processing lists and dictionaries to reading files and using comprehensions as compact looping expressions.
Python’s for loops don’t need explicit counters unless you want them. They iterate directly over data, which makes most loops cleaner and less error-prone than the traditional for–index patterns found in languages like C or Java.
for and while loops
Python provides two main looping constructs: the for loop and the while loop. Between them they cover nearly every kind of repetition a program might need. The for loop is typically used when you want to iterate over a sequence or other iterable object. The while loop is used when you want to repeat an action until a condition changes.
The for loop
The for loop is one of Python’s most distinctive features. Unlike many languages that use loops controlled by counters, Python’s for loop iterates directly over the items of an iterable object such as a list, tuple, string, dictionary, or range. This design makes loops easier to read and avoids many off-by-one errors common in other languages.
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
  print(fruit)
This loop reads as if written in plain English: “for each fruit in fruits, print it.” The variable fruit takes each value from the iterable in turn until the sequence is exhausted.
Any object that supports iteration can be used in a for loop. This includes ranges, which are efficient representations of integer sequences:
for i in range(3):
  print("Iteration", i)
The range() function produces integers starting from zero by default. You can supply a start, stop, and step value as in range(start, stop, step) to control iteration precisely:
for n in range(2, 10, 2):
  print(n)
When looping over dictionaries, the for statement iterates through keys by default. You can use the .items() method to access both keys and values together:
person = {"name": "Ada", "age": 36}
for key, value in person.items():
  print(key, "→", value)
Python’s for loops are powered by the iterator protocol. Under the surface, Python calls iter() on the object, then repeatedly calls next() until a StopIteration exception signals the end. Understanding this mechanism helps explain why loops work with lists, files, generators, and many other objects.
The while loop
The while loop repeats a block of code as long as its condition remains true. It is ideal when you don’t know in advance how many iterations are needed, or when you want to keep looping until some external event occurs.
count = 0
while count < 3:
  print("Count:", count)
  count += 1
On each pass, the condition is checked first; if it is still true, the loop body runs again. When the condition becomes false, control moves to the next statement after the loop. If the condition never becomes false, the loop will run forever, so make sure that something inside the loop changes the state being tested. The condition does not need to be a simple comparison: any expression with a truth value can be used. For example, this loop continues until the user provides an empty input:
while True:
  text = input("Enter text (or press Enter to quit): ")
  if not text:
    break
  print("You entered:", text)
An infinite loop occurs when the condition of a while loop never becomes false. Always make sure that something inside the loop modifies the variables or state being tested. Using break can also provide a clean way to exit when a specific condition is met.
Both for and while loops can include an optional else block that runs only if the loop completes normally (without hitting a break). This is a unique Python feature that can simplify certain patterns, such as searching for an item:
for n in range(5):
  if n == 3:
    print("Found it!")
    break
else:
  print("Not found")
In this example, the else block runs only if the loop ends without a break. Although not common, it can make intent clear when a loop is performing a search or check.
These two loop types form the basis of all repetition in Python. The for loop handles most cases elegantly by working directly with iterables, while the while loop remains the flexible fallback for open-ended repetition based on conditions. Together they give Python its clear, natural rhythm of iteration and control.
range() and iteration patterns
The range() function is one of Python’s most widely used tools for creating sequences of integers to drive iteration. Although it looks simple, it is implemented in a way that is both memory-efficient and flexible. A range object represents a sequence of numbers without storing them all in memory. Instead, it generates each value on demand as the loop progresses.
for i in range(5):
  print(i)
This example prints numbers from 0 up to, but not including, 5. By default, range() starts at 0 and increases by 1 each time. You can also supply explicit start and step values:
for i in range(2, 10, 2):
  print(i)
The above produces 2, 4, 6, and 8. Negative steps work too, allowing you to count backwards:
for i in range(10, 0, -3):
  print(i)
Because range produces an immutable sequence type, it can also be converted to a list or tuple when you need to see all its values at once:
nums = list(range(5))
print(nums) # [0, 1, 2, 3, 4]
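A range also behaves like an ordinary immutable sequence: it supports len(), indexing, and fast membership tests without ever generating all its values; a brief sketch:

```python
r = range(0, 100, 5)

print(len(r))   # 20
print(r[3])     # 15
print(25 in r)  # True (computed arithmetically, not by scanning)
print(26 in r)  # False
```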
A range is evaluated lazily: it calculates each value only when needed, so even a very large range such as range(1_000_000) occupies only a small, fixed amount of memory.
Enumerating items
When you need both the index and the value of each item in a sequence, the built-in enumerate() function provides a clean solution. Instead of managing a manual counter, enumerate() wraps any iterable and yields pairs of index and item.
fruits = ["apple", "banana", "cherry"]
for index, fruit in enumerate(fruits):
  print(index, fruit)
You can also specify a starting index with a second argument:
for index, fruit in enumerate(fruits, start=1):
  print(index, fruit)
This pattern is clearer and safer than incrementing a counter variable inside the loop, since Python automatically handles the index progression.
Looping over multiple sequences
Sometimes you need to iterate over two or more sequences in parallel. The zip() function combines multiple iterables into tuples, yielding one element from each at a time. The iteration stops when the shortest sequence is exhausted.
names = ["Ada", "Grace", "Linus"]
languages = ["Python", "C", "Linux"]
for name, project in zip(names, languages):
  print(name, "→", project)
If the sequences differ in length, the extra elements in the longer ones are ignored. For strict pairing where unequal lengths should raise an error, pass strict=True to zip() (Python 3.10 and later); to pad shorter sequences instead, use itertools.zip_longest() from the standard library.
Iterating in reverse or sorted order
Two built-in functions make it easy to control iteration order. reversed() returns an iterator that yields items in reverse sequence, and sorted() produces a new list of items in ascending or custom order.
for n in reversed(range(3)):
  print(n)
letters = ["b", "c", "a"]
for letter in sorted(letters):
  print(letter)
Both of these functions leave the original sequence unchanged. reversed() works on any object that supports the sequence protocol, while sorted() can take a key argument to customise sorting logic.
Comprehension loops
Many loop patterns can be expressed more succinctly using comprehensions. A list comprehension creates a new list by looping through an iterable and applying an expression to each element.
numbers = [1, 2, 3, 4]
squares = [n * n for n in numbers]
print(squares) # [1, 4, 9, 16]
Comprehensions are not just shorthand; in CPython they also tend to run slightly faster than equivalent loops because the iteration machinery is implemented at a lower level. The same concept applies to set and dictionary comprehensions, and to generator expressions that produce values lazily.
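Set and dictionary comprehensions follow the same shape, with braces instead of brackets; a brief sketch:

```python
words = ["apple", "banana", "cherry", "apple"]

lengths = {len(w) for w in words}         # set: duplicates collapse
first_letters = {w: w[0] for w in words}  # dict: maps word -> first letter

print(lengths)        # {5, 6}
print(first_letters)  # {'apple': 'a', 'banana': 'b', 'cherry': 'c'}
```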
When a comprehension grows long or deeply nested, however, an explicit for loop is often easier to read and maintain.
Loop else clauses
Python’s for and while loops both support an optional else clause. This feature is unique to Python and often surprises those coming from other languages. The else block runs only if the loop finishes normally, that is, without encountering a break statement. It does not run when the loop exits early due to break, an exception, or program termination.
for n in range(5):
    if n == 3:
        print("Found it!")
        break
else:
    print("Not found")
In this example, the else block would only execute if the loop completed all iterations without the break. Because n == 3 triggers the break, the else block is skipped. This pattern is especially useful when searching for something within a collection and needing to know afterward whether it was found.
The same logic applies to while loops:
count = 0
while count < 3:
    if count == 2:
        print("Stopping early")
        break
    count += 1
else:
    print("Loop completed normally")
Here, the else runs only if the condition becomes false without hitting a break. If break executes, the else is skipped. This distinction lets you write clear, single-loop structures that combine iteration and post-check logic without additional flags or variables.
Think of else as an “if not broken” clause. It runs when the loop ends naturally and skips when the loop is interrupted by break.
Although it can make code more concise, the loop else is best used in situations where the behaviour is clearly understood. Overusing it in complex loops can make logic harder to follow. When used with purpose (especially in search patterns or validation checks) it keeps code elegant and eliminates the need for additional state tracking.
names = ["Ada", "Grace", "Linus"]
for name in names:
    if name == "Turing":
        print("Found Turing!")
        break
else:
    print("Turing not found")
Here the loop runs through every name. Because no break occurs, the else executes and reports that “Turing not found.” This simple structure communicates intent cleanly and is a hallmark of Python’s design: readable logic expressed in natural order.
Iteration best practices
Good looping style in Python is about clarity and intention. Because the language offers many ways to express repetition, it helps to choose patterns that reveal meaning at a glance. The goal is always to make loops readable, efficient, and free from unnecessary state or side effects.
Iterate over data, not indices
Whenever possible, loop directly over the items you need rather than using numeric indices. This avoids manual bookkeeping and makes the loop easier to read.
# Preferred
for fruit in fruits:
    print(fruit)

# Avoid unless you really need the index
for i in range(len(fruits)):
    print(fruits[i])
If both index and value are required, use enumerate() rather than managing a counter variable yourself.
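With enumerate(), the counter comes for free, and the optional start parameter controls the first index:

```python
fruits = ["apple", "banana", "cherry"]

# enumerate() yields (index, item) pairs; no manual counter needed
for position, fruit in enumerate(fruits, start=1):
    print(f"{position}. {fruit}")
```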
Use comprehensions for simple transformations
List, set, and dictionary comprehensions are compact forms of iteration that build new collections in a single readable expression. They should be used when the logic fits neatly into one line.
squares = [n * n for n in range(10)]
even_squares = [n for n in squares if n % 2 == 0]
For more complex logic or multiple nested conditions, a traditional loop is usually clearer.
Avoid modifying a collection while iterating
Changing a list or dictionary as you iterate over it can cause skipped items or unexpected behaviour. Instead, create a new collection or iterate over a copy.
# Problematic
for n in numbers:
    if n % 2 == 0:
        numbers.remove(n)

# Safer
numbers = [n for n in numbers if n % 2 != 0]
Prefer built-in functions over manual loops
Many looping tasks can be expressed more clearly using built-ins such as sum(), any(), all(), max(), and min(). These communicate intent directly and often run faster because they are implemented in C.
# Instead of this
total = 0
for n in numbers:
    total += n

# Use this
total = sum(numbers)
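The same principle applies to the other built-ins mentioned above. A short sketch:

```python
numbers = [3, 7, 10, 14]

print(any(n % 2 == 0 for n in numbers))  # True: at least one even value
print(all(n > 0 for n in numbers))       # True: every value is positive
print(max(numbers))                      # 14
print(min(numbers))                      # 3
```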
Be explicit with termination
Always make it clear when and why a loop ends. Avoid deeply nested if and break combinations if a clearer condition or early return would do. When using while, ensure that the loop variable or state changes in a predictable way to prevent infinite loops.
Choose readability over cleverness
Python rewards expressive but straightforward looping. Avoid overly condensed expressions or obscure iterator tricks when a simple loop is clearer. Idiomatic Python code prioritises communication—loops should read almost like plain English descriptions of what the program is doing.
Following these guidelines keeps loops predictable, maintainable, and aligned with Python’s philosophy of readability and clarity. With practice, iteration becomes not just a means of repetition but a rhythm of thought that shapes elegant, expressive programs.
Chapter 8: Functions and Functional Features
Functions are the building blocks of structure and reuse in Python. They let you group statements into named units, turning repeated logic into single, clear definitions. A good function captures one idea, performs one task, and hides unnecessary detail behind a clean interface. This modularity is what turns a script into a program and a collection of programs into a coherent system.
In Python, functions are not only organisational tools but also objects in their own right. They can be passed as arguments, returned from other functions, stored in data structures, and created dynamically. This flexibility comes from Python’s functional heritage, where functions are treated as first-class citizens—a concept that opens up powerful and expressive programming patterns.
This chapter explores how functions work in Python and how they connect to the broader functional features of the language. We’ll cover definition and calling syntax, parameters and return values, scope and closures, anonymous functions with lambda, higher-order functions, and the role of decorators in extending behaviour. Along the way, you’ll see how Python’s treatment of functions as objects allows for patterns that are both elegant and deeply practical.
Defining and calling functions
Every function in Python begins with the def keyword, followed by the function’s name, a list of parameters in parentheses, and a colon. The body of the function is indented and contains the statements that define what the function does. When the function is called, Python executes those statements and optionally returns a value with the return statement.
def greet(name):
    print("Hello,", name)
Here, greet is a function that takes a single argument, name. When called, it prints a greeting. The function itself does nothing until invoked by name:
greet("Ada")
greet("Grace")
Function names follow the same rules as variables: they must begin with a letter or underscore, contain only letters, digits, and underscores, and are case-sensitive. By convention, Python functions use lowercase words separated by underscores (snake_case).
Returning values
Functions often compute and return a result. The return statement ends the function and passes a value back to the caller. If no return is given, Python automatically returns None.
def add(a, b):
    return a + b
result = add(5, 3)
print(result) # 8
A function can return any Python object, including lists, tuples, dictionaries, or even other functions. Returning multiple values is simply done by separating them with commas, which creates a tuple implicitly.
def divide(x, y):
    quotient = x // y
    remainder = x % y
    return quotient, remainder
q, r = divide(10, 3)
print(q, r) # 3 1
Default and optional arguments
Parameters can have default values, making them optional when calling the function. Defaults are evaluated once when the function is defined, not each time it is called.
def greet(name, greeting="Hello"):
    print(greeting, name)
greet("Ada")
greet("Grace", "Hi")
Defaults can be of any type, but mutable defaults (like lists or dictionaries) should be avoided because they persist across calls. Use None as a safe default if you need to create a new object each time.
def append_item(item, lst=None):
    if lst is None:
        lst = []
    lst.append(item)
    return lst
Keyword and positional arguments
Arguments in Python can be passed by position or by keyword. Positional arguments match parameters in order, while keyword arguments specify them explicitly. You can mix both styles as long as positional arguments come first.
def describe(person, age, city):
    print(f"{person} is {age} and lives in {city}.")
describe("Ada", 36, "London")
describe(person="Grace", age=35, city="New York")
describe("Linus", city="Helsinki", age=33)
This flexibility makes code easier to read and maintain. It also helps prevent errors when a function takes several parameters of similar types.
Docstrings and function help
A docstring (short for documentation string) is a string literal placed immediately after a function definition. It describes what the function does, its parameters, and its return value. Python stores this text as the function’s .__doc__ attribute and displays it when using the help() function.
def square(n):
    """Return the square of n."""
    return n * n
help(square)
Clear docstrings turn your functions into self-documenting components, making code easier to use, share, and maintain. Well-written docstrings follow the same conventions as those in the standard library: a short summary on the first line, optionally followed by more detail or examples.
Parameters are a function’s input, return values are its output, and the docstring is the agreement that explains what happens in between.
By defining and calling functions clearly, you give structure to your code, make behaviour reusable, and create natural building blocks for larger systems. Every good Python program grows from small, well-defined functions that each do one thing well.
Positional, keyword, and default parameters
Python gives you several ways to pass data into functions, allowing for a mix of flexibility and clarity. Function parameters can be positional, keyword-based, or have default values, and all three styles can coexist in a single definition. Understanding how these work together is key to writing functions that are both powerful and easy to call.
Positional parameters
By default, arguments in a function call are matched to parameters by their order. These are known as positional parameters. The first argument corresponds to the first parameter, the second to the second, and so on.
def move(x, y):
    print(f"Moving to coordinates ({x}, {y})")
move(10, 20)
Positional arguments are compact and familiar, but can become confusing when there are many parameters or when the order is easy to forget. This is where keyword arguments improve readability.
Keyword arguments
A keyword argument specifies which parameter a value is meant for, regardless of order. You write the parameter name followed by an equals sign and the value. Keyword arguments make function calls self-documenting, which is particularly useful when functions have many optional or similarly typed parameters.
def connect(host, port, secure):
    print(f"Connecting to {host}:{port} (secure={secure})")
connect("example.com", 443, True) # positional
connect(port=443, host="example.com", secure=True) # keyword
When combining positional and keyword arguments, positional ones must always come first. This rule avoids ambiguity and keeps function calls consistent.
connect("example.com", port=443, secure=True) # valid
connect(port=443, "example.com", secure=True) # invalid: positional argument after keyword argument (SyntaxError)
Default parameters
Parameters can be given default values, which makes them optional when calling the function. If the caller omits a value, Python uses the default instead.
def greet(name, greeting="Hello"):
    print(f"{greeting}, {name}!")
greet("Ada")
greet("Grace", "Hi")
Defaults are evaluated once at the time the function is defined, not every time it runs. This means that mutable default values like lists or dictionaries persist across calls, which can lead to subtle bugs.
def add_item(item, collection=[]):
    collection.append(item)
    return collection
print(add_item("apple"))
print(add_item("banana")) # reuses the same list
The second call here adds to the same list as the first, since the default list is shared. To avoid this, use None as a placeholder and create a new list inside the function.
def add_item(item, collection=None):
    if collection is None:
        collection = []
    collection.append(item)
    return collection
Combining styles
Functions often use a mix of positional, keyword, and default parameters to balance flexibility with clarity. For instance, positional parameters might define essential inputs, while keyword and default parameters handle optional behaviour.
def send_message(to, subject, body="", urgent=False):
    print(f"To: {to}")
    print(f"Subject: {subject}")
    if urgent:
        print("⚠️ URGENT")
    print(body)
send_message("team@example.com", "Meeting", urgent=True)
This design lets the caller specify what matters most, while relying on defaults for the rest. It’s a clear, Pythonic way to make functions flexible without making them hard to read.
Variable-length arguments
Sometimes you don’t know in advance how many arguments a function should accept. Python solves this with variable-length arguments, which allow a function to receive an arbitrary number of positional or keyword parameters. This makes functions highly flexible and adaptable to different calling patterns.
*args: variable positional arguments
When you prefix a parameter with an asterisk (*), Python collects any extra positional arguments into a tuple. This allows the function to handle any number of values without needing to define them individually.
def total(*numbers):
    print(numbers)
    return sum(numbers)
print(total(2, 4, 6))
print(total(1, 3, 5, 7, 9))
Here, all supplied arguments are packed into the tuple numbers. The function can then iterate over them as needed. This technique is especially useful when building wrapper functions or utilities that forward arguments to other functions.
**kwargs: variable keyword arguments
Similarly, prefixing a parameter with two asterisks (**) collects any additional keyword arguments into a dictionary. This lets you capture named arguments that were not explicitly defined in the function’s signature.
def describe(**info):
    for key, value in info.items():
        print(f"{key}: {value}")
describe(name="Ada", job="Engineer", city="London")
Inside the function, info is a dictionary containing all the keyword arguments passed in. This pattern is common when functions need to accept flexible configuration data or pass arbitrary options along to another function.
Combining fixed and variable parameters
You can mix normal parameters with *args and **kwargs in one definition. The general order is:
- Positional or keyword parameters
- *args (extra positional arguments)
- Keyword-only parameters (optional)
- **kwargs (extra keyword arguments)
def report(title, *items, **details):
    print("Report:", title)
    for item in items:
        print("-", item)
    for key, value in details.items():
        print(f"{key}: {value}")
report("Inventory", "Apples", "Oranges", total=42, checked=True)
This prints a short list of items, followed by key–value details. The function can accept any number of extra items and additional details, making it extremely flexible while keeping a clear base structure.
Argument unpacking
The * and ** symbols can also be used in the opposite direction to unpack sequences or dictionaries when calling a function. This spreads their contents into separate arguments automatically.
def greet(first, last):
    print(f"Hello, {first} {last}!")
args = ("Ada", "Lovelace")
kwargs = {"first": "Grace", "last": "Hopper"}
greet(*args)
greet(**kwargs)
Unpacking is an elegant way to forward arguments dynamically or combine values from different sources. It’s also a key part of many Python APIs, especially in frameworks that build functions programmatically.
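Forwarding is where packing and unpacking meet: a wrapper collects whatever it receives and passes it straight through. A minimal sketch, with a hypothetical logged() helper:

```python
def logged(func, *args, **kwargs):
    """Call func, echoing the arguments it received (hypothetical helper)."""
    print(f"calling {func.__name__} with {args} {kwargs}")
    return func(*args, **kwargs)

def power(base, exponent=2):
    return base ** exponent

print(logged(power, 3))              # 9
print(logged(power, 2, exponent=5))  # 32
```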
When using *args and **kwargs, remember that they collect all unmatched arguments. Misordering parameters or misspelling a keyword can silently redirect values into the wrong place, so always check your function signatures carefully.
Scope, closures, and global/nonlocal
Every name in Python lives inside a scope, which defines where that name can be accessed. Scopes create layers of visibility, ensuring that variables defined inside one part of a program do not unintentionally affect others. Understanding how Python manages scope is essential for writing reliable and maintainable code.
The LEGB rule
When Python looks up a name, it follows a specific order known as the LEGB rule — Local, Enclosing, Global, and Built-in. Python searches through these scopes in sequence until it finds the name being referenced.
- Local: names created inside the current function
- Enclosing: names in any outer (but not global) functions that surround the current one
- Global: names defined at the top level of the current module or declared global
- Built-in: names provided by Python itself, such as len or range
x = "global"
def outer():
    x = "enclosing"
    def inner():
        x = "local"
        print(x)
    inner()
outer()
When inner() prints x, it finds the local version first. If it were missing, Python would look outward through the enclosing, global, and finally built-in scopes until a match was found.
Global variables
Names defined at the top level of a module live in the global scope. They can be read from anywhere, but to modify them inside a function you must declare them as global. Without this declaration, Python assumes any assignment creates a new local variable instead of changing the global one.
counter = 0
def increment():
    global counter
    counter += 1
increment()
print(counter) # 1
Using global variables can make code harder to reason about because changes in one part of a program can unexpectedly affect another. They should be used sparingly, typically for constants or shared configuration.
Closures and enclosing scopes
A closure occurs when a nested function captures and remembers variables from its enclosing scope, even after that outer function has finished executing. Closures are a powerful feature that enable encapsulation and function factories.
def make_multiplier(factor):
    def multiply(n):
        return n * factor
    return multiply
times3 = make_multiplier(3)
print(times3(5)) # 15
Here, multiply() retains access to factor even though make_multiplier() has already returned. The variable is stored in the closure, not recreated each time the inner function runs. You can inspect a function’s closure via its __closure__ attribute.
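That inspection is straightforward: __code__.co_freevars names the captured variables, and __closure__ holds a cell for each one. A brief sketch with an illustrative make_adder() factory:

```python
def make_adder(step):
    def add(n):
        return n + step
    return add

add10 = make_adder(10)

# The names the inner function closes over, and the values captured
print(add10.__code__.co_freevars)          # ('step',)
print(add10.__closure__[0].cell_contents)  # 10
print(add10(5))                            # 15
```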
The nonlocal keyword
When working with nested functions, you may want to modify a variable from an enclosing (but not global) scope. The nonlocal keyword tells Python that a name refers to one in the nearest enclosing scope, not a new local one.
def counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment
c = counter()
print(c()) # 1
print(c()) # 2
Without the nonlocal declaration, the count += 1 line would create a new local variable inside increment() instead of updating the one in the enclosing scope. The nonlocal keyword ensures that nested functions can modify variables from their outer lexical context safely and predictably.
Avoid global and nonlocal unless necessary. Excessive use can make code difficult to follow by introducing hidden state changes across scopes. When possible, return updated values explicitly or use classes to manage state more transparently.
Lambda expressions
Python’s lambda expressions provide a compact way to define small, anonymous functions. They are useful when a simple function is needed for a short period, especially as an argument to another function such as map(), filter(), or sorted(). A lambda is syntactically limited to a single expression, but it can capture variables from its surrounding scope just like a regular function.
square = lambda x: x * x
print(square(5)) # 25
This defines a function that takes one argument (x) and returns its square. The lambda keyword is followed by parameters, a colon, and the expression whose result will be returned. There is no need for return because the value of the expression is returned automatically.
Using lambda with built-in functions
Lambda functions are most commonly used where a short, throwaway function is required. For example, you can provide a sorting key directly within a call to sorted():
words = ["banana", "apple", "cherry"]
sorted_words = sorted(words, key=lambda w: len(w))
print(sorted_words) # ['apple', 'banana', 'cherry']
Here, the lambda function takes each element w and returns its length, which sorted() uses to order the list. The same approach works with other higher-order functions that expect a callable argument.
numbers = [1, 2, 3, 4, 5, 6]
even = list(filter(lambda n: n % 2 == 0, numbers))
print(even) # [2, 4, 6]
Lambda vs. def
Every lambda can be written as a normal function using def. The difference is only in form, not in behaviour. The lambda style is concise but best reserved for very simple operations. For anything longer or more descriptive, a named function improves clarity.
# Using def
def cube(x):
    return x ** 3
# Equivalent lambda
cube = lambda x: x ** 3
Lambdas are anonymous functions. They do not have a name unless assigned to one, and they do not support statements, annotations, or multiple expressions. This makes them ideal for quick, inline operations, not for complex logic.
Overusing lambda can make code difficult to read. If the expression grows beyond a single simple idea, use a regular def function instead. Readability should always take priority over brevity.
Closures in lambda
Like any other function, a lambda can form a closure by capturing values from the surrounding scope. This makes it a concise way to create small, customised functions dynamically.
def make_incrementer(step):
    return lambda n: n + step
inc5 = make_incrementer(5)
print(inc5(10)) # 15
Here, the lambda expression remembers the value of step from the enclosing scope, just as an inner function would. This combination of conciseness and lexical scoping makes lambda expressions powerful tools for functional programming patterns.
In practice, lambda expressions should be used to express simple, self-contained operations. They work best when their intent is immediately clear, such as transforming data, defining quick sorting keys, or building small closures inline.
Type hints and annotations
Type hints let you describe the expected types of function parameters and return values without changing how the program runs. They were introduced in Python 3.5 as part of the typing system and are now a standard feature of modern Python development. Type hints are optional but strongly encouraged in professional codebases because they improve readability, make intent explicit, and enable static analysis tools to catch errors before runtime.
def add(x: int, y: int) -> int:
    return x + y
In this example, x and y are expected to be integers, and the function should return an integer. The hints do not affect execution (Python does not enforce them) but tools such as mypy, pyright, and modern IDEs use them to provide intelligent feedback and error detection.
Annotating parameters and return values
Type hints are written after each parameter name, separated by a colon, and the return type is marked after an arrow (->) following the parameter list. Any valid Python expression can appear in an annotation, though most commonly these come from the typing module.
def greet(name: str, times: int = 1) -> None:
    for _ in range(times):
        print(f"Hello, {name}!")
Here, name is expected to be a string, times an integer with a default value, and the function returns nothing (None). The annotations are available at runtime through the function’s __annotations__ attribute:
print(greet.__annotations__)
# {'name': <class 'str'>, 'times': <class 'int'>, 'return': None}
Common types and collections
The typing module provides standard type objects for collections and generics, letting you describe the expected contents of lists, dictionaries, and other containers.
from typing import List, Dict, Tuple
def process(scores: List[int]) -> Dict[str, float]:
    return {"average": sum(scores) / len(scores)}

def coordinate() -> Tuple[int, int]:
    return (10, 20)
For newer versions of Python (3.9+), you can use the built-in generic syntax instead of importing from typing:
def process(scores: list[int]) -> dict[str, float]:
    return {"average": sum(scores) / len(scores)}
Optional, Union, and Any
Sometimes a parameter may accept more than one type, or may be optional. The Union and Optional helpers express these cases clearly.
from typing import Union, Optional
def parse(value: Union[int, str]) -> str:
    return str(value)

def read_name(name: Optional[str] = None) -> str:
    return name or "Anonymous"
Union[int, str] means the argument can be either type. Optional[str] is shorthand for Union[str, None], showing that the value may be missing or set to None.
Callable and TypeVar
You can also specify functions as arguments using Callable, or define generic types using TypeVar. These are common when writing decorators or higher-order functions.
from typing import Callable, TypeVar
T = TypeVar("T")
def apply_twice(func: Callable[[T], T], value: T) -> T:
    return func(func(value))
print(apply_twice(lambda n: n * 2, 3)) # 12
Here, apply_twice() works with any function that takes and returns the same type, demonstrating how type variables express general relationships between inputs and outputs.
Type checking and benefits
Although type hints are not enforced at runtime, they greatly enhance static checking, documentation, and tooling. Editors can autocomplete more accurately, catch mismatches early, and generate cleaner API documentation. They also make code easier for humans to read by showing what kind of data each function expects and produces.
Chapter 9: Python Modules and Packages
As programs grow, organisation becomes as important as syntax. Python’s answer is the module system. A module is a single .py file that defines names (functions, classes, variables) inside its own namespace. A package is a directory that groups related modules together so you can build clear, reusable libraries rather than a single monolithic script.
Importing brings names from one module into another. The import machinery locates a module (or package), compiles it to bytecode if needed, executes it once to create its globals, then caches the result in sys.modules. Later imports reuse that cached module. This behaviour is simple to use in small scripts, yet it scales to large codebases because each file has a clean boundary and a predictable namespace.
Packages let you layer structure. You can arrange modules into folders that mirror the concepts of your project, then import with dotted paths. Since Python 3.3, namespace packages also allow package directories without an __init__.py file (useful when splitting a package across multiple locations). Traditional packages still use __init__.py to initialise package-level behaviour or expose selected symbols for import.
Import mechanics and aliasing
The import statement is one of Python’s most important features. It lets you bring external code into the current namespace so that you can reuse functionality without rewriting it. When Python encounters an import, it searches through a list of directories defined in sys.path, which includes the current working directory, the standard library, and any site-packages directories for installed modules.
import math
print(math.sqrt(16)) # 4.0
Once imported, a module is executed only once per session. After that, the interpreter retrieves it from the cache stored in sys.modules. This means that repeated imports are efficient and fast. If you need to reload a module after modifying it interactively, you can use importlib.reload().
import importlib
importlib.reload(math)
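The caching behaviour is easy to observe directly, using the standard json module as an arbitrary example:

```python
import sys
import json

# After the first import, the module object lives in sys.modules
print("json" in sys.modules)  # True

# A repeated import returns the exact same cached object
import json as json_again
print(json_again is sys.modules["json"])  # True
print(json_again is json)                 # True
```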
Python also supports importing with an alias using the as keyword. This is a simple yet powerful way to shorten long module names or clarify context in your code.
import numpy as np
import datetime as dt
Community conventions have grown around certain aliases, such as import pandas as pd or import matplotlib.pyplot as plt. Following community norms helps readability for anyone familiar with those libraries.
You can also import specific names from a module to avoid repeating the module prefix each time you use a function or class:
from math import sqrt, pi
print(sqrt(9) * pi)
Avoid from module import * except in interactive sessions. It pollutes the namespace with unknown names and makes code harder to read and debug.
Understanding how imports resolve, execute, and cache helps prevent confusion when working with multiple files. It ensures that your codebase remains modular, efficient, and predictable.
The module search path
When Python imports a module, it follows a well-defined search process looking through a sequence of directories known as the module search path. This path is stored in the list sys.path and is initialised each time the interpreter starts.
Understanding the search path is crucial when structuring projects or debugging import errors. By knowing where Python looks (and in what order) you can predict and control how your modules are found and loaded. The typical search order is:
- The directory containing the script that was run (or the current directory in interactive mode)
- The standard library directories
- Site-specific directories such as site-packages, where third-party modules are installed
import sys
for path in sys.path:
    print(path)
It is fine to tweak sys.path when experimenting, but in production it is better to manage module locations using environment variables like PYTHONPATH or proper package installation tools such as pip.
This list is just an ordinary Python list, so you can inspect or modify it at runtime. Appending new directories lets you import modules from custom locations, though this should be done with care.
import sys
sys.path.append('/path/to/my/modules')
Be careful with names that shadow the standard library (a file named random.py could override the built-in random module).
Creating your own modules
Any Python file can act as a module. If you save a script as mymodule.py and import it from another file, Python will execute the code once and make its functions, classes, and variables available through the module’s namespace. This is the simplest and most direct way to structure reusable code.
# mymodule.py
def greet(name):
    return f"Hello, {name}!"
# main.py
import mymodule
print(mymodule.greet("Ada"))
When the import statement runs, Python creates a module object and stores it in sys.modules. The module’s global variables become attributes of that object. This means you can inspect or even modify them dynamically.
print(dir(mymodule))
print(mymodule.__name__)
A module’s name is simply its filename without the .py extension. Keeping module names short, descriptive, and lowercase (such as utils.py or helpers.py) helps readability and follows PEP 8 conventions.
Modules can contain any combination of top-level functions, classes, and variables. However, code at the top level runs as soon as the module is imported. For scripts that can also be run directly, you can add a special block to distinguish those cases:
if __name__ == "__main__":
    print(greet("world"))
The if __name__ == "__main__" guard ensures that certain code runs only when the file is executed directly, not when imported. This pattern keeps your modules reusable while still allowing standalone testing.
Package structure and __init__.py
When you outgrow single files, you can group related modules into a directory. This directory, known as a package, must either contain a special file named __init__.py or be recognised as a namespace package (available since Python 3.3). Packages help you organise functionality into layers, keeping large projects manageable and logical.
A basic package might look like this:
shapes/
    __init__.py
    circle.py
    square.py
After creating this structure, you can import from it using dotted paths:
import shapes.circle
from shapes.square import area
The __init__.py file is executed when the package is imported. It can be empty, or it can initialise variables, set up imports, or define what symbols should be exposed when someone uses from package import *.
# shapes/__init__.py
from .circle import area as circle_area
from .square import area as square_area
You can define __all__ inside __init__.py to explicitly declare which names are public. This keeps your package interface clean and avoids unintentionally exposing helper functions or constants.
# shapes/__init__.py
__all__ = ["circle_area", "square_area"]
Namespace packages, introduced in PEP 420, are directories without an __init__.py file. They allow a single package to be spread across multiple locations. This is especially useful in large ecosystems or modular plugin systems.
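As a sketch of how that splitting works (the directory names here are hypothetical), two unrelated locations can contribute modules to the same package, with no __init__.py anywhere:

```
project_a/shapes/triangle.py
project_b/shapes/hexagon.py
```

If both project_a and project_b are on sys.path, then import shapes.triangle and import shapes.hexagon both resolve into a single shapes namespace package assembled from the two directories.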
The Python standard library
Python ships with a rich collection of modules known as the standard library. These modules cover everyday tasks (file handling, text processing, dates and times, data formats, networking, concurrency, testing) so you can build useful programs without installing anything else. The standard library is designed to be consistent and well documented, and it follows the language’s philosophy of clarity and practicality.
You load a module with import and then call its functions or use its classes. Each import adds focused capabilities without adding external dependencies.
import math
from pathlib import Path
print(math.sqrt(49)) # 7.0
p = Path("notes.txt")
print(p.exists()) # True or False
Files, paths, and the operating system
- pathlib (object-oriented paths across platforms)
- os, os.path, shutil (environment, process, and filesystem utilities)
- glob, fnmatch (filename pattern matching)
- tempfile, io (temporary files and file-like interfaces)
from pathlib import Path
import shutil
src = Path("images")
dst = Path("backup/images")
dst.mkdir(parents=True, exist_ok=True)
for img in src.glob("*.png"):
shutil.copy2(img, dst / img.name)
Prefer pathlib for path work. It is clearer and more portable than combining strings with os.path.
Text, data formats, and persistence
- json, csv, configparser (common interchange and configuration formats)
- sqlite3 (embedded SQL database stored in a single file)
- pickle (Python-specific serialization)
import json
from pathlib import Path
prefs = {"theme": "dark", "items_per_page": 20}
Path("prefs.json").write_text(json.dumps(prefs, indent=2), encoding="utf-8")
loaded = json.loads(Path("prefs.json").read_text(encoding="utf-8"))
print(loaded["theme"]) # dark
Numeric, statistics, and randomness
- math, cmath (real and complex math)
- statistics (mean, median, variance)
- random, secrets (pseudorandom utilities and secure tokens)
- decimal, fractions (exact arithmetic where it matters)
from statistics import mean
from decimal import Decimal
scores = [8.5, 9.0, 7.5]
print(mean(scores)) # 8.333...
price = Decimal("19.99") + Decimal("0.01")
print(price) # 20.00 (no binary rounding issues)
Dates, times, and time zones
- datetime (dates, times, timedeltas)
- zoneinfo (IANA time zones, Python 3.9+)
from datetime import datetime
from zoneinfo import ZoneInfo
meeting = datetime(2025, 10, 22, 15, 0, tzinfo=ZoneInfo("Europe/London"))
print(meeting.isoformat())
Collections and functional tools
- collections (Counter, deque, defaultdict, namedtuple)
- itertools (efficient iteration building blocks)
- functools (lru_cache, partial, cached_property)
- dataclasses, enum (lightweight data containers and enumerations)
from collections import Counter
from functools import lru_cache
counts = Counter("mississippi")
print(counts.most_common(2)) # [('i', 4), ('s', 4)]
@lru_cache(maxsize=256)
def fib(n):
return n if n < 2 else fib(n-1) + fib(n-2)
print(fib(30))
Networking and internet
- urllib.request, urllib.parse (HTTP requests and URL parsing)
- http.client, http.server (low-level HTTP and a simple dev server)
- socket (raw network sockets)
from urllib.request import urlopen
with urlopen("https://example.com") as resp:
print(resp.status)
print(resp.read(100))
Concurrency
- threading (lightweight threads)
- multiprocessing (separate processes for CPU-bound work)
- concurrent.futures (thread and process pools with a unified interface)
- asyncio (asynchronous I/O with async and await)
from concurrent.futures import ThreadPoolExecutor
def square(n): return n*n
with ThreadPoolExecutor() as pool:
results = list(pool.map(square, range(5)))
print(results) # [0, 1, 4, 9, 16]
CLI, logging, and testing
- argparse (command-line interfaces)
- logging, traceback (observability and error reports)
- unittest, doctest (testing frameworks)
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--limit", type=int, default=10)
args = parser.parse_args()
print(args.limit)
Imports, packaging, and resources
- importlib, pkgutil (introspection and dynamic imports)
- importlib.resources (read data files packaged with your code)
- venv (create isolated environments)
- typing (type hints for better tooling and clarity)
from importlib.resources import files
data_path = files("mypackage.data") / "schema.json"
print(data_path) # path inside the installed package
The standard library is a practical toolbox. Learn a few modules well (path handling, JSON, datetime, collections, logging) and add others as your projects require them. This approach keeps code lean, portable, and easy to maintain.
Installing and managing third-party packages
Python’s power grows when you add libraries from the wider ecosystem. You install these libraries with pip and keep your project dependencies isolated with venv. The basic flow is: create a virtual environment for your project, activate it, install what you need, then record the versions so you can reproduce the environment later.
Create and use a virtual environment
# Create a virtual environment in the ".venv" folder
python -m venv .venv
# Activate it (macOS/Linux)
source .venv/bin/activate
# Activate it (Windows PowerShell)
.venv\Scripts\Activate.ps1
Prefer python -m pip over a bare pip command. It guarantees you are using the installer that belongs to your active environment’s interpreter.
Install, upgrade, and remove packages
# Install a package (latest version)
python -m pip install requests
# Install a specific version
python -m pip install "requests==2.32.3"
# Upgrade a package
python -m pip install --upgrade requests
# Uninstall a package
python -m pip uninstall requests
# See what is installed
python -m pip list
Record and reproduce environments
# Freeze the current environment to a requirements file
python -m pip freeze > requirements.txt
# Recreate the same environment elsewhere
python -m pip install -r requirements.txt
A frozen requirements.txt pins all transitive versions. This is ideal for applications you deploy. For libraries you publish, pin less aggressively and test across ranges.
Installing your own project
If your project has a pyproject.toml with a [project] table, you can install it into the environment (useful for editable imports during development):
# Editable install (live changes without reinstalls)
python -m pip install -e .
Platform-friendly launcher notes
# Windows can use the Python launcher
py -m venv .venv
py -m pip install requests
# Many Unix systems use python3
python3 -m venv .venv
python3 -m pip install requests
Avoid running pip with elevated privileges (such as sudo pip install) for global installs. Prefer per-project virtual environments to prevent conflicts and permission issues.
Command-line tools vs project libraries
Some packages are primarily command-line tools rather than libraries you import. Install these in isolated contexts so they do not clash with your project dependencies:
# Install a CLI tool globally but isolated per tool
python -m pip install --user pipx
pipx install black
Use pipx for CLI tools (formatters, linters, scaffolding utilities). Use venv + pip for your application’s importable libraries.
Troubleshooting
# If pip itself needs an update
python -m pip install --upgrade pip
# If an install fails, clear the build cache and try again
python -m pip cache purge
python -m pip install PACKAGE_NAME
# Inspect where a package came from
python -m pip show PACKAGE_NAME
With a reliable pattern (create a virtual environment, install with python -m pip, freeze versions for apps, and keep tools separated with pipx), you can manage third-party packages confidently and reproducibly.
Chapter 10: Files and Input/Output
Programs often need to exchange information with the outside world, such as reading data, saving results, or communicating through text streams. Python’s input and output (I/O) system makes this process simple and expressive. Whether you are reading a configuration file, writing logs, or processing large datasets, Python provides consistent interfaces that work across platforms and file types.
The core concept is the file object. A file object represents an open connection to a file or stream, allowing you to read, write, or both. You usually open a file with the built-in open() function, which returns a handle that you can use within a with statement for automatic cleanup. Once open, you can operate on the file using methods like read(), write(), or readlines().
with open("example.txt", "w", encoding="utf-8") as f:
f.write("Hello, world!\n")
This pattern ensures the file is properly closed, even if an error occurs. It is part of Python’s design philosophy that explicit resource management should be easy and safe.
Beyond ordinary text files, Python’s I/O system extends to binary data, temporary files, compressed archives, network streams, and in-memory buffers. The standard library offers modules for each of these, from io and pathlib to gzip and zipfile. Working with files becomes a matter of combining these tools cleanly and predictably.
In this chapter you will learn how to open, read, and write files safely using with blocks, the differences between text and binary modes, how to work with file paths and directories using pathlib, how to handle exceptions and detect file errors, and how to use modules for structured data such as CSV, JSON, and binary formats.
Reading and writing files
File I/O in Python revolves around the built-in open() function. It takes a filename, a mode string that determines how the file will be used, and optional parameters like the character encoding. Once opened, the returned file object provides methods for reading and writing data.
Opening a file
f = open("notes.txt", "r", encoding="utf-8")
content = f.read()
f.close()
Although you can open and close files manually, the preferred and safest pattern is to use a with statement. This automatically closes the file even if an error occurs.
with open("notes.txt", "r", encoding="utf-8") as f:
content = f.read()
print(content)
Always use with open(...) when dealing with files. It prevents resource leaks and ensures files are closed cleanly.
Writing text files
To create or overwrite a file, use mode "w". To append to an existing file, use "a". In both cases, open the file in text mode and specify the encoding explicitly.
with open("output.txt", "w", encoding="utf-8") as f:
f.write("First line\n")
f.write("Second line\n")
with open("output.txt", "a", encoding="utf-8") as f:
f.write("Appended line\n")
You can also write multiple lines at once using writelines() if you already have a list of strings.
lines = ["alpha\n", "beta\n", "gamma\n"]
with open("greek.txt", "w", encoding="utf-8") as f:
f.writelines(lines)
Reading files efficiently
When reading, you can process entire contents at once with read(), a fixed number of bytes with read(n), or line by line with readline() or iteration.
with open("notes.txt", "r", encoding="utf-8") as f:
for line in f:
print(line.strip())
Reading line by line is memory efficient and suitable for large files. Each iteration reads only the next line from the stream.
Binary files
When working with non-text data (for example, images, sound files, or compiled programs), open the file in binary mode by adding "b" to the mode string. In this mode, read and write operations deal with bytes objects rather than strings.
# Copy a binary file safely
with open("photo.jpg", "rb") as source, open("backup.jpg", "wb") as target:
target.write(source.read())
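Reading with a single read() loads the entire file into memory, which is fine for small files but wasteful for large ones. A chunked loop keeps memory use constant regardless of file size (the file names here are placeholders, and the sketch creates its own sample input):

```python
from pathlib import Path

# Create a sample binary file to copy (stand-in for a real image).
Path("photo.bin").write_bytes(bytes(range(256)) * 1000)

CHUNK = 64 * 1024  # read in 64 KiB pieces; the size is an arbitrary choice

with open("photo.bin", "rb") as source, open("backup.bin", "wb") as target:
    while chunk := source.read(CHUNK):
        target.write(chunk)

# The copy is byte-for-byte identical to the original.
print(Path("backup.bin").read_bytes() == Path("photo.bin").read_bytes())  # True
```

The walrus operator (:=, Python 3.8+) keeps the read-and-test loop compact; read() returns an empty bytes object at end of file, which ends the loop.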
Modes summary
| Mode | Meaning |
| --- | --- |
| "r" | Read text (default) |
| "w" | Write text, truncating the file if it exists |
| "a" | Append text to the end |
| "rb" | Read binary data |
| "wb" | Write binary data |
| "ab" | Append binary data |
| "r+" | Read and write (must already exist) |
| "w+" | Create or overwrite, then read and write |
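The update modes are easy to misread, so here is a small sketch of "r+", which opens an existing file for both reading and writing without truncating it (the file name is illustrative):

```python
from pathlib import Path

Path("modes.txt").write_text("first line\n", encoding="utf-8")

# "r+" requires the file to exist; the position starts at the beginning.
with open("modes.txt", "r+", encoding="utf-8") as f:
    print(f.read().strip())    # prints the existing content: first line
    f.write("second line\n")   # position is now at the end, so this appends

print(Path("modes.txt").read_text(encoding="utf-8"))
```

By contrast, "w+" would have truncated the file on open, so the first read would return an empty string.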
File positions and seeking
You can move the current position in an open file using seek() and find out where you are with tell().
with open("data.bin", "rb") as f:
f.seek(10) # skip first 10 bytes
chunk = f.read(4) # read next 4 bytes
print(f.tell()) # current position
These same patterns apply whether you are dealing with small text files or multi-gigabyte binary data: open safely, process in chunks, and close automatically.
Context managers
Context managers make resource handling predictable, elegant, and safe. They are one of Python’s most practical features for writing reliable code that always cleans up after itself.
Python’s with statement is the standard way to manage resources that need to be set up and then cleaned up automatically. It ensures that even if an error occurs inside the block, the resource is released properly. Files are the most common example, but the same pattern works for network connections, database sessions, and locks.
with open("log.txt", "w", encoding="utf-8") as f:
f.write("Session started\n")
f.write("All systems nominal\n")
When the with block ends, Python calls the file object’s __exit__() method, which closes the file automatically. This avoids subtle bugs such as leaving files open or forgetting to release system resources.
The with statement is not limited to files. Any object that defines __enter__() and __exit__() methods can be used as a context manager.
How it works
When you write:
with open("example.txt", "r", encoding="utf-8") as f:
data = f.read()
Python performs the following steps:
- Calls open() to get a file object.
- Invokes the object’s __enter__() method and assigns its result to f.
- Executes the block of code inside the with statement.
- When the block finishes (even if an exception is raised), calls the object’s __exit__() method to handle cleanup.
Creating your own context manager
You can define custom context managers by writing a class that implements __enter__ and __exit__. This is useful when you want to ensure something is always released or reset.
class ManagedResource:
def __enter__(self):
print("Resource acquired")
return self
def __exit__(self, exc_type, exc_val, exc_tb):
print("Resource released")
with ManagedResource() as r:
print("Using resource")
Output:
Resource acquired
Using resource
Resource released
If an exception occurs inside the block, Python passes it to __exit__() as arguments (exc_type, exc_val, and exc_tb). If __exit__() returns True, the exception is suppressed; otherwise it propagates normally.
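A minimal sketch of suppression: this hypothetical manager swallows ValueError by returning True from __exit__() and lets every other exception propagate:

```python
class SuppressValueError:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Returning True means "handled": only ValueError is suppressed.
        return exc_type is ValueError

with SuppressValueError():
    raise ValueError("swallowed")

print("Execution continues")  # reached because the error was suppressed
```

A KeyError raised inside the same block would not match the test in __exit__(), so it would propagate normally.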
A generator-based context manager built with @contextmanager can simplify setup and teardown logic while keeping the function concise. Use this pattern when you want to manage a temporary resource without defining a full class.
Using the contextlib module
For lightweight cases, you can create context managers using the contextlib module’s @contextmanager decorator. This turns a generator function into a context manager with minimal code.
from contextlib import contextmanager
@contextmanager
def simple_logger(filename):
f = open(filename, "a", encoding="utf-8")
try:
yield f
finally:
f.close()
with simple_logger("events.log") as log:
log.write("Event recorded\n")
Path handling with pathlib
The pathlib module provides an object-oriented interface for working with file system paths. It replaces many of the older os.path and glob functions with cleaner, more readable syntax. A Path object represents a file or directory, and you can use it to inspect, combine, read, and write paths safely across platforms.
from pathlib import Path
p = Path("data") / "records.txt"
print(p) # data/records.txt
print(p.exists()) # True or False
print(p.parent) # data
Unlike string manipulation, pathlib automatically uses the correct separators for each operating system. It also provides intuitive operators for joining paths and accessing metadata.
Creating and inspecting paths
p = Path("/home/robin/documents/report.txt")
print(p.name) # report.txt
print(p.stem) # report
print(p.suffix) # .txt
print(p.parent) # /home/robin/documents
print(p.anchor) # / (root)
Path objects are immutable. Operations such as p / "subdir" or p.with_suffix(".csv") return new paths without altering the original.
Creating directories and files
from pathlib import Path
folder = Path("reports/2025")
folder.mkdir(parents=True, exist_ok=True)
file = folder / "summary.txt"
file.write_text("Quarterly summary pending", encoding="utf-8")
The mkdir() method creates directories, and write_text() writes a full text file in one call. The matching read_text() method reads a file entirely into a string.
content = file.read_text(encoding="utf-8")
print(content)
Iterating through directories
You can loop through directory contents easily using iterdir() or pattern matching with glob() and rglob().
for path in Path("logs").glob("*.txt"):
print(path.name)
# Recursive search
for path in Path("src").rglob("*.py"):
print(path)
Note that rglob() searches recursively through all subdirectories. Use it carefully on large trees, as it can traverse thousands of files.
Combining with file I/O
Each Path object can open files directly, giving a natural link between path handling and file reading or writing.
with Path("output/data.txt").open("w", encoding="utf-8") as f:
f.write("Example output\n")
This is equivalent to open(str(path)) but more idiomatic when you already have a Path object.
Checking file types and metadata
p = Path("photo.jpg")
print(p.exists()) # Does the path exist?
print(p.is_file()) # Is it a file?
print(p.is_dir()) # Is it a directory?
print(p.stat().st_size) # File size in bytes
The stat() method returns system-level information such as size, timestamps, and permissions. You can also resolve paths to absolute locations and check relationships between them:
print(p.resolve()) # Absolute path
print(p.is_relative_to("photos")) # Python 3.9+
Cross-platform paths
pathlib provides PurePath, PurePosixPath, and PureWindowsPath classes for manipulating paths without touching the file system, which is useful when writing tools that must handle multiple platforms.
from pathlib import PureWindowsPath
p = PureWindowsPath("C:/Users/Robin/Documents/file.txt")
print(p.parts)
# ('C:\\', 'Users', 'Robin', 'Documents', 'file.txt')
Prefer pathlib for all modern Python code. It integrates smoothly with os, shutil, and file I/O, and eliminates most string-based path bugs.
pathlib brings clarity and consistency to filesystem operations. By treating paths as structured objects rather than raw strings, you write code that is both safer and more expressive across every platform Python supports.
Simple console I/O
Console input and output are the simplest forms of interaction between a Python program and its user. You can display information with print() and collect user input with input(). Both functions handle text automatically and integrate cleanly with Python’s string formatting tools.
Printing output
The print() function writes text to standard output (usually the terminal). It automatically converts its arguments to strings and separates them with spaces by default.
print("Hello", "world") # Hello world
print("Answer:", 42) # Answer: 42
You can control the separator and the line ending using keyword arguments:
print("apple", "banana", "cherry", sep=", ")
# apple, banana, cherry
print("No newline here", end="")
# end="" prevents line break
The print() function writes to sys.stdout by default, but you can redirect output to any writable file or stream using the file= argument.
with open("log.txt", "w", encoding="utf-8") as f:
print("Logging started", file=f)
By default, print() flushes its output when a newline is written or when the program ends. For immediate output (useful in progress displays or logging), pass flush=True.
print("Processing...", flush=True)
Reading input
The built-in input() function pauses execution and waits for the user to type a line of text, returning it as a string.
name = input("Enter your name: ")
print(f"Hello, {name}!")
The value returned by input() is always a string. If you need numeric data, convert it explicitly using int() or float().
age = int(input("Enter your age: "))
print("Next year you will be", age + 1)
Redirecting and scripting
Both input() and print() work seamlessly in scripts or interactive sessions. When reading from redirected input (for example, when piping a file into a program), input() will read lines from that stream instead of a physical keyboard.
# data.txt contains:
# Alice
# Bob
# Charlie
with open("data.txt", encoding="utf-8") as f:
    for line in f:
        print("Name:", line.strip())
Formatting output neatly
You can combine print() with f-strings or format specifications for aligned or formatted text.
for x in range(1, 6):
print(f"{x:2d} squared is {x*x:3d}")
item = "Apples"
price = 1.49
print(f"{item:<10} £{price:>5.2f}")
# Apples £ 1.49
For more elaborate output, explore the str.format() method, f-string formatting syntax, or libraries like textwrap and rich for styled console output.
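As one small example, textwrap from the standard library wraps a long string to a fixed column width for console display:

```python
import textwrap

paragraph = ("Python's standard library includes textwrap for wrapping "
             "and indenting blocks of text to a fixed console width.")

wrapped = textwrap.fill(paragraph, width=40)
print(wrapped)

# Every output line fits within the requested width.
print(all(len(line) <= 40 for line in wrapped.splitlines()))  # True
```

textwrap.fill() joins the wrapped lines with newlines; textwrap.wrap() returns them as a list instead.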
Working with JSON and CSV
Many programs need to store or exchange structured data. Two of the most common text formats for this are JSON (JavaScript Object Notation) and CSV (Comma-Separated Values). Python’s standard library provides dedicated modules (json and csv) that make it easy to read and write these formats safely and portably.
JSON (JavaScript Object Notation)
JSON represents data as nested structures of dictionaries, lists, strings, numbers, booleans, and null (represented as None in Python). It is widely used for configuration files, APIs, and data exchange between programs.
import json
from pathlib import Path
# Python data structure
user = {
"name": "Ada Lovelace",
"skills": ["mathematics", "programming"],
"active": True,
"projects": 5
}
# Write JSON file
path = Path("user.json")
path.write_text(json.dumps(user, indent=2), encoding="utf-8")
# Read JSON file back
loaded = json.loads(path.read_text(encoding="utf-8"))
print(loaded["name"])
The json module provides four key functions:
- json.dump(obj, file): write to an open file
- json.load(file): read from an open file
- json.dumps(obj): return a JSON string
- json.loads(string): parse a JSON string
# Equivalent file-based approach
with open("user.json", "w", encoding="utf-8") as f:
json.dump(user, f, indent=2)
with open("user.json", encoding="utf-8") as f:
loaded = json.load(f)
Use indent for readability and sort_keys=True when you need predictable key order. This is useful for diffing or version-controlling JSON files.
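A quick illustration of both options together (the settings dictionary is made up for the example):

```python
import json

settings = {"zoom": 1.5, "autosave": True}

# sort_keys=True gives deterministic output, ideal for version control.
text = json.dumps(settings, indent=2, sort_keys=True)
print(text)
# {
#   "autosave": true,
#   "zoom": 1.5
# }
```

Note that Python's True becomes JSON's lowercase true, and the keys come out alphabetically regardless of insertion order.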
CSV (Comma-Separated Values)
CSV files store tabular data where each line represents a row and columns are separated by commas (or sometimes other delimiters such as semicolons or tabs). The csv module provides reader and writer objects that handle quoting and escaping correctly, so you don’t have to parse manually.
import csv
from pathlib import Path
records = [
{"name": "Alice", "age": 30, "city": "London"},
{"name": "Bob", "age": 25, "city": "Paris"},
{"name": "Charlie", "age": 35, "city": "New York"},
]
# Write CSV
with open("people.csv", "w", newline="", encoding="utf-8") as f:
writer = csv.DictWriter(f, fieldnames=["name", "age", "city"])
writer.writeheader()
writer.writerows(records)
# Read CSV
with open("people.csv", newline="", encoding="utf-8") as f:
reader = csv.DictReader(f)
for row in reader:
print(row["name"], "is", row["age"], "years old")
The csv module automatically handles quoting around values that contain commas or newlines. When using DictWriter and DictReader, each row is mapped cleanly to and from a dictionary, making your code clearer and more reliable.
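The same reader and writer objects handle other delimiters too. A sketch with a semicolon-separated file, a common convention in European locales (the file name and contents are illustrative):

```python
import csv
from pathlib import Path

# Create a small semicolon-separated sample file.
Path("people_eu.csv").write_text("name;city\nAlice;London\nBob;Paris\n",
                                 encoding="utf-8")

# Pass delimiter=";" so the reader splits on semicolons, not commas.
with open("people_eu.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f, delimiter=";"))

for row in rows:
    print(row["name"], "lives in", row["city"])
```

The same delimiter argument works for csv.writer and csv.DictWriter when producing such files.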
newline="" on Windows. This prevents double line breaks caused by automatic newline translation.
Chapter 11: Error Handling and Exceptions
No program runs perfectly all the time. Files may be missing, network connections can fail, and user input often contains surprises. When such problems occur, Python signals them through exceptions. These are special runtime events that interrupt normal flow and carry information about what went wrong.
Effective handling of exceptions allows a program to recover gracefully instead of crashing. Python’s design makes this process explicit and readable. Rather than hiding errors, it encourages you to anticipate and manage them with clear structures such as try, except, and finally.
This chapter explores how Python reports and handles errors, how to raise and define custom exceptions, and how to use exception handling as part of robust, maintainable program design.
Built-in Exception Hierarchy
Python represents all errors as objects derived from the built-in BaseException class. This creates a structured hierarchy where related exceptions share common ancestry, allowing developers to catch broad categories of problems or handle specific ones precisely.
At the top of the hierarchy is BaseException, from which the standard Exception class inherits. Most user-defined and built-in errors derive from Exception, while a few special cases such as SystemExit, KeyboardInterrupt, and GeneratorExit descend directly from BaseException. These special cases exist to signal interpreter-level events rather than program bugs.
The hierarchy begins roughly as follows:
re class="tight">BaseException
├── SystemExit
├── KeyboardInterrupt
├── GeneratorExit
└── Exception
├── ArithmeticError
│ ├── ZeroDivisionError
│ ├── OverflowError
│ └── FloatingPointError
├── LookupError
│ ├── IndexError
│ └── KeyError
├── ImportError
├── AttributeError
├── TypeError
├── ValueError
├── OSError
│   └── FileNotFoundError
└── RuntimeError
Because all exceptions inherit from a shared base, they can be caught either individually or collectively. For example, catching Exception will intercept most runtime errors, while catching BaseException will also include system-related interruptions, something that is rarely desirable in ordinary code.
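For instance, catching LookupError handles both of its subclasses, IndexError and KeyError, in a single clause (safe_get here is an illustrative helper, not a standard function):

```python
def safe_get(container, key, default=None):
    try:
        return container[key]
    except LookupError:  # parent of both IndexError and KeyError
        return default

print(safe_get({"a": 1}, "b"))   # None  (KeyError caught)
print(safe_get([10, 20], 5))     # None  (IndexError caught)
print(safe_get([10, 20], 1))     # 20
```

The same function works for dictionaries and sequences because both failure modes share the LookupError ancestor.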
try / except / else / finally Structure
Python’s primary mechanism for handling runtime errors is the try statement. It lets you define a block of code to attempt, and one or more accompanying blocks that specify what should happen if something goes wrong. This keeps error handling separate from normal logic, improving clarity and flow.
The full structure can include four parts:
try:
# code that might raise an exception
except SomeError:
# code to handle the error
else:
# code that runs only if no exception occurred
finally:
# code that runs no matter what happens
Only the try and at least one except clause are required. The else clause is optional and runs when no exceptions are raised, which makes it a good place for actions that depend on successful completion of the try block. The finally clause always executes, whether or not an error occurred, and is typically used for cleanup actions such as closing files or releasing resources.
Here is a simple example:
f = None
try:
    f = open("data.txt")
    content = f.read()
except FileNotFoundError:
    print("File not found.")
else:
    print("File read successfully.")
finally:
    if f is not None:
        f.close()
If the file data.txt does not exist, the except block runs and the program continues normally. If the file is opened and read without issue, the else block runs instead. The finally block always executes, closing the file whenever one was successfully opened. This structure encourages predictable, fault-tolerant programs that can handle unexpected conditions gracefully.
Raising Exceptions
While many exceptions occur automatically during runtime, you can also generate them yourself when your program encounters a situation that warrants stopping normal flow. This is done with the raise statement. It allows you to signal that something has gone wrong and to communicate what kind of problem it is.
The simplest form raises a specific built-in or custom exception class:
raise ValueError("Invalid input format")
When Python encounters this statement, it creates a ValueError object carrying the message provided, and then searches upward through the call stack for a matching except block. If no handler is found, the program terminates and reports the error.
Any subclass of BaseException can be raised, but conventionally you use Exception or its descendants. You can also raise an existing exception instance that was created earlier:
e = RuntimeError("Something went wrong")
raise e
If you need to re-raise the current exception from within an except block, simply use raise with no argument. This preserves the original traceback and type information:
try:
process()
except ValueError:
log_error()
raise # re-raise the same ValueError
Raising exceptions is an important part of designing reliable functions and modules. It allows you to enforce invariants, flag invalid states, or delegate responsibility for recovery to higher levels of your program. When used consistently, exceptions become a clear and explicit communication mechanism between components.
Custom Exception Classes
Python allows you to define your own exception types, giving structure and meaning to the specific errors your program might encounter. Custom exceptions make debugging easier and provide a clearer interface for anyone using your code, because each error type can convey precise intent rather than relying on generic messages.
To define a custom exception, create a new class that inherits from Exception (or from one of its subclasses if you want related behaviour). A simple example looks like this:
class DataFormatError(Exception):
"""Raised when data is not in the expected format."""
pass
This class behaves like any other exception but can be caught independently, making it easier to distinguish between different error types:
try:
raise DataFormatError("Corrupted input file")
except DataFormatError as e:
print("Data problem:", e)
except Exception as e:
print("Other error:", e)
You can also extend custom exceptions with additional attributes to carry more context, such as file names or error codes:
class ConfigurationError(Exception):
def __init__(self, message, filename):
super().__init__(message)
self.filename = filename
try:
raise ConfigurationError("Missing section", "settings.ini")
except ConfigurationError as e:
print(f"{e.filename}: {e}")
By grouping related custom exceptions under a shared base class, you can build entire hierarchies that mirror your program’s structure. This helps catch classes of errors together while preserving the ability to handle individual cases when needed.
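A sketch of such a hierarchy (the class names are illustrative): one shared base class lets callers catch the whole family, while each subclass still identifies a specific failure:

```python
class AppError(Exception):
    """Base class for every error this application raises."""

class ConfigError(AppError):
    """Problems with configuration files."""

class NetworkError(AppError):
    """Problems talking to remote services."""

# One handler catches any member of the family.
try:
    raise ConfigError("bad settings")
except AppError as e:
    print(type(e).__name__, "-", e)  # ConfigError - bad settings
```

Callers who care about the distinction can still write separate except ConfigError and except NetworkError clauses.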
Defensive Coding Patterns
Good error handling begins before exceptions occur. Defensive coding complements exception handling by preventing many failures, reducing surprising states, keeping tracebacks meaningful, and making the unavoidable failures easier to diagnose and recover from. These practices work with Python’s exception model to produce code that is clear and resilient.
Validate Early, Fail Fast
Check inputs at boundaries, and raise descriptive exceptions when assumptions are violated. Early failure keeps bugs close to their cause and prevents corrupted state from spreading.
def read_port(env):
port = env.get("PORT")
if port is None:
raise KeyError("PORT is required")
try:
return int(port)
except ValueError as e:
raise ValueError("PORT must be an integer") from e
EAFP over LBYL
“Easier to ask forgiveness than permission” (EAFP) is a common idiom in Python. Prefer trying the operation and catching specific failures, rather than checking every precondition first, when that leads to simpler code.
# EAFP - Easier to ask Forgiveness than Permission
try:
value = mapping[key]
except KeyError:
value = default
# LBYL - Look Before You Leap
if key in mapping:
value = mapping[key]
else:
value = default
Catch Narrowly, Then Generalize
Order handlers from specific to general, and avoid bare except:. Handle only what you can recover from. Let unrelated errors propagate.
try:
do_work()
except (TimeoutError, ConnectionError) as e:
recover(e)
except Exception:
# unexpected, bubble up after logging
raise
Use else and finally Intentionally
Place success-only actions in else. Put cleanup in finally so it runs even if an error occurs.
f = open("report.txt", "w")
try:
write_report(f)
except OSError as e:
log(e)
raise
else:
print("Report written.")
finally:
f.close()
Prefer Context Managers
Context managers encapsulate setup and teardown, which prevents resource leaks and simplifies error paths.
import os
from contextlib import suppress
with open("data.json") as f:
data = f.read()
# ignore a specific, expected error deliberately
with suppress(FileNotFoundError):
os.remove("tempfile.tmp")
Preserve Context with Chaining
Wrap low-level errors in domain-specific ones and keep the original traceback with raise ... from .... This gives both high-level meaning and the root cause.
import json

try:
    payload = json.loads(raw)
except json.JSONDecodeError as e:
    raise ValueError("Invalid configuration payload") from e
Guard Clauses for Clarity
Return early when preconditions are not met. This reduces nesting and keeps the happy path clear.
def normalize(vec):
    if not vec:
        return []
    length = sum(x * x for x in vec) ** 0.5
    if length == 0:
        raise ValueError("Zero-length vector")
    return [x / length for x in vec]
Do Not Silence Errors Accidentally
Empty handlers hide problems and complicate debugging. If you catch an exception and cannot recover, log and re-raise.
try:
    update_index()
except Exception:
    logger.exception("Index update failed")
    raise
Be Deliberate About Retries
Retries are useful for transient faults such as network hiccups. Apply caps and backoff to avoid tight loops. Only retry idempotent operations.
import time

def fetch_with_retry(get, attempts=3, base=0.2):
    for i in range(attempts):
        try:
            return get()
        except (TimeoutError, ConnectionError):
            if i == attempts - 1:
                raise
            time.sleep(base * 2 ** i)
Use assert for Internal Invariants
Assertions document assumptions for developers and are stripped entirely when Python runs with the -O optimization flag. Do not use them for user input or runtime validation.
def average(xs):
    assert all(isinstance(x, (int, float)) for x in xs)
    return sum(xs) / len(xs)
Design Clear Exception APIs
Choose meaningful exception types, document what your functions raise, and keep messages actionable. Group related custom exceptions under a shared base class so callers can handle families of failures cleanly.
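As a sketch of such an API, the hierarchy below groups illustrative, made-up exception types (StorageError and its subclasses are not from any real library) under one base class so callers can catch the whole family with a single handler:

```python
class StorageError(Exception):
    """Base class for all storage-related failures."""

class StorageConnectionError(StorageError):
    """Raised when the backing store cannot be reached."""

class StorageFormatError(StorageError):
    """Raised when stored data cannot be parsed."""

def load_record(text):
    # Raise the domain-specific type with an actionable message
    if not text:
        raise StorageFormatError("empty record")
    return text.split(",")

try:
    load_record("")
except StorageError as e:  # catches any member of the family
    print(type(e).__name__, "-", e)
```

Callers who care only about connectivity can still catch StorageConnectionError narrowly, while generic cleanup code catches StorageError.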
Chapter 12: Object-Oriented Programming
Object-Oriented Programming (OOP) is a way of structuring code around data and behaviour that belong together. Rather than writing a series of standalone functions, you create objects, which are self-contained units that hold both state (data) and methods (functions that act on that data). This model helps you build systems that are modular, reusable, and easier to reason about as they grow.
Python fully supports object-oriented principles such as encapsulation, inheritance, and polymorphism, yet it does so with remarkable flexibility. Unlike many strictly typed or class-based languages such as Java or C++, Python blends object orientation naturally with its dynamic, interpretive nature. Almost everything in Python, such as numbers, strings, lists, functions, and even modules, is an object. Each has a type, attributes, and methods that define how it behaves.
Defining your own classes allows you to create new object types suited to your program’s domain. Classes serve as blueprints, while instances represent individual realisations of those blueprints. This pattern lets you organise data and logic together, avoiding scattered global state and repeated code.
This chapter explores how to define and use classes, create and access attributes, design constructors, apply inheritance, and use special methods to integrate seamlessly with Python’s built-in behaviour. The goal is not only to understand the syntax but to think in terms of objects, including how they interact, how they represent concepts, and how they simplify complex programs.
Classes, Instances, and Attributes
Classes are the fundamental building blocks of Python’s object system. A class defines the structure and behaviour of a particular kind of object, while each instance represents one concrete example of that class. The class acts as a blueprint, describing what data (attributes) and actions (methods) its instances will have.
You define a class using the class keyword. Inside the class body, you can define attributes directly or through methods:
class Dog:
    species = "Canis familiaris"  # class attribute

    def __init__(self, name, age):
        self.name = name          # instance attribute
        self.age = age
Here, species belongs to the class itself (it is shared across all dogs) while name and age belong to each instance and can differ for every object you create. You make an instance by calling the class like a function:
buddy = Dog("Buddy", 4)
luna = Dog("Luna", 2)
Each instance keeps its own state, accessible via dot notation:
print(buddy.name) # Buddy
print(luna.species) # Canis familiaris
When Python looks up an attribute (for example luna.species), it first checks whether it exists on the instance, then on the class. This shared lookup mechanism makes it easy to define default data or behaviour at the class level while allowing instances to override it when necessary.
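This lookup order can be seen directly in a small sketch: assigning to an attribute on an instance creates an instance attribute that shadows the class attribute, without affecting other instances.

```python
class Dog:
    species = "Canis familiaris"  # class attribute, shared by default

    def __init__(self, name):
        self.name = name

buddy = Dog("Buddy")
luna = Dog("Luna")

print(buddy.species)          # found on the class: Canis familiaris
buddy.species = "Canis lupus" # creates an instance attribute that shadows it
print(buddy.species)          # Canis lupus (instance attribute wins)
print(luna.species)           # Canis familiaris (class value unchanged)
```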
The __init__ Constructor and self
When you create an instance of a class, Python automatically calls a special method named __init__. This method acts as the object’s constructor: it runs immediately after a new instance is created, allowing you to set up its initial state. Most classes define __init__ to accept parameters and assign them to instance attributes.
class Book:
    def __init__(self, title, author, year):
        self.title = title
        self.author = author
        self.year = year
self is a convention, not a keyword. You could technically use another name, but doing so would confuse readers. Always use self for clarity and consistency.
The first parameter of __init__ is always self, which represents the instance being created. When you call Book("1984", "George Orwell", 1949), Python internally does the following:
b = Book.__new__(Book)  # create empty instance
Book.__init__(b, "1984", "George Orwell", 1949)
This process ensures that every object has a distinct identity and its own data. You can then access those attributes directly:
novel = Book("Dune", "Frank Herbert", 1965)
print(novel.title) # Dune
print(novel.year) # 1965
Inside any method, self always refers to the current instance. It allows each object to manage its own state while sharing the same class definition.
Keep constructors lightweight: avoid doing expensive work in __init__ unless necessary.
Methods and Encapsulation
Methods are functions that belong to a class. They describe the behaviours of its instances and usually operate on the instance’s data. Every method takes self as its first parameter, giving it access to the instance’s attributes and other methods.
class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1

    def reset(self):
        self.value = 0

    def display(self):
        print("Current value:", self.value)
When you call c.increment(), Python automatically passes the instance c as the first argument to the method. This is equivalent to Counter.increment(c) but is written in a more natural and readable way.
c = Counter()
c.increment()
c.increment()
c.display() # Current value: 2
Encapsulation refers to the practice of bundling data (attributes) and behaviour (methods) together, while controlling access to an object’s internal state. In Python, this is a matter of convention rather than enforced restriction. Attribute names beginning with an underscore indicate that they are intended for internal use only.
class Account:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self._balance = balance  # internal attribute

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("Deposit must be positive")
        self._balance += amount

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("Insufficient funds")
        self._balance -= amount

    def get_balance(self):
        return self._balance
This convention allows for safe modification later. For example, turning _balance into a computed property without breaking existing code. Python also supports name mangling for attributes prefixed with two underscores, which makes them harder to access accidentally from outside the class.
class Example:
    def __init__(self):
        self.__hidden = 42

e = Example()
# e.__hidden               # AttributeError
print(e._Example__hidden)  # accessible but discouraged
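To illustrate the earlier point about evolving _balance into a computed property without breaking callers, here is a minimal sketch (reusing the Account name from the example above): the @property decorator exposes the internal attribute through ordinary attribute syntax.

```python
class Account:
    def __init__(self, owner, balance=0):
        self.owner = owner
        self._balance = balance

    @property
    def balance(self):
        # Read-only public view; callers write acct.balance, not acct.balance()
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("Deposit must be positive")
        self._balance += amount

acct = Account("Ada", 100)
acct.deposit(50)
print(acct.balance)  # 150
```

Existing code that read a plain balance attribute would keep working if it later became a property like this, which is exactly why the single-underscore convention leaves room for change.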
Inheritance and super()
Inheritance allows you to create new classes that build upon or extend existing ones. A derived (or child) class inherits attributes and methods from its base (or parent) class, enabling code reuse and hierarchical design. This approach helps organise related classes and avoids repetition of shared logic.
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "(silent)"
Now a subclass can extend or override the behaviour of Animal:
class Dog(Animal):
    def speak(self):
        return "Woof!"

class Cat(Animal):
    def speak(self):
        return "Meow!"
Each subclass inherits the __init__ method from Animal, so you can instantiate them directly:
dog = Dog("Buddy")
cat = Cat("Luna")
print(dog.name, dog.speak()) # Buddy Woof!
print(cat.name, cat.speak()) # Luna Meow!
If the subclass needs to extend rather than replace the parent’s constructor, you can call it explicitly using super(). This function returns a temporary object that allows access to the parent class without naming it directly. It is the preferred modern approach for cooperative multiple inheritance.
class Bird(Animal):
    def __init__(self, name, can_fly=True):
        super().__init__(name)
        self.can_fly = can_fly

    def speak(self):
        return "Chirp!"
Prefer super() to directly naming the parent class. It automatically handles the correct method resolution order, which is especially important when multiple inheritance is involved.
Here, super().__init__(name) ensures that the base class’s initialisation logic runs before adding the subclass-specific attribute can_fly. This preserves proper setup even in complex inheritance chains.
Use ClassName.mro() or ClassName.__mro__ if you need to understand how methods are resolved.
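A quick sketch shows what the method resolution order looks like for a simple hierarchy like the Animal classes above:

```python
class Animal:
    def speak(self):
        return "(silent)"

class Dog(Animal):
    def speak(self):
        return "Woof!"

# The MRO lists the classes Python searches, in order, for attributes.
print([cls.__name__ for cls in Dog.mro()])  # ['Dog', 'Animal', 'object']
```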
Method Overriding and Polymorphism
When a subclass defines a method with the same name as one in its parent, the new definition overrides the inherited version. This lets subclasses specialise or adapt behaviour without modifying the base class itself, which is a key principle of extensible design.
class Shape:
    def area(self):
        return 0

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height
Here, Rectangle overrides Shape.area() to compute a meaningful value. The same interface name (area) can produce different results depending on the actual type of the object. This ability is called polymorphism, which literally means “many forms.” It allows different objects to respond to the same message in ways appropriate to their type. This makes code both flexible and scalable:
shapes = [Shape(), Rectangle(4, 5)]
for s in shapes:
    print(s.area())
Although both objects are treated uniformly, each calls the version of area() defined in its own class. This is one of the most powerful aspects of OOP: code can operate on abstract interfaces rather than concrete types.
Call the parent’s version through super() when you need to extend, not replace, its behaviour.
class LoggedShape(Shape):
    def area(self):
        print("Calculating area...")
        result = super().area()
        print("Done.")
        return result
Polymorphism also enables duck typing, a Python idiom that focuses on what an object can do rather than what it is. It comes from the saying “If it walks like a duck and quacks like a duck, it’s probably a duck.” In programming terms, it means that Python cares about what an object can do, not what type it claims to be. If an object implements the expected methods, it can be used interchangeably, regardless of its class hierarchy.
class Circle:
    def area(self):
        return 3.14 * (5 ** 2)

def print_area(shape):
    print("Area:", shape.area())

print_area(Rectangle(3, 4))
print_area(Circle())
Data Classes (@dataclass)
Many classes exist primarily to store data rather than to define complex behaviour. Python’s @dataclass decorator (introduced in version 3.7) simplifies such classes by automatically generating common methods such as __init__, __repr__, __eq__, and others based on declared fields.
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
This single declaration defines a full-featured class that supports initialisation, readable string output, and value comparison:
p1 = Point(2.5, 4.0)
p2 = Point(2.5, 4.0)
print(p1) # Point(x=2.5, y=4.0)
print(p1 == p2) # True
@dataclass automatically provides __init__, __repr__, and __eq__. You can disable or customise any of these with parameters such as repr=False or eq=False.
Each attribute is declared with a type hint, and you can also define default values or computed defaults using field() from the dataclasses module:
from dataclasses import dataclass, field

@dataclass
class Book:
    title: str
    author: str
    year: int = 0
    tags: list = field(default_factory=list)

b = Book("1984", "George Orwell")
print(b.tags)  # []
The default_factory is particularly important for mutable types like lists and dictionaries. It ensures that each instance receives its own new object rather than sharing one across all instances.
Always use default_factory for mutable containers; otherwise a single default object would be shared, and changes made in one instance would appear in others.
You can also make dataclasses immutable by passing frozen=True to the decorator. This prevents modification of attributes after creation, similar to named tuples:
@dataclass(frozen=True)
class Color:
    r: int
    g: int
    b: int

c = Color(255, 200, 150)
# c.r = 0  # would raise FrozenInstanceError
Dataclasses streamline the creation of lightweight, structured objects. They encourage explicit, readable declarations and remove much of the boilerplate associated with small data containers.
Class Methods vs. Static Methods
Not all methods in a class operate on individual instances. Sometimes you need behaviour that relates to the class itself, or utility functions that logically belong to the class but do not depend on instance or class state. Python provides two decorators (@classmethod and @staticmethod) to support these cases.
Class Methods
A class method receives the class itself as its first argument, traditionally named cls. It can access or modify class-level data and is often used as an alternative constructor or to maintain state shared across all instances.
class Book:
    total_created = 0

    def __init__(self, title):
        self.title = title
        Book.total_created += 1

    @classmethod
    def from_series(cls, series_name, number):
        title = f"{series_name} Vol. {number}"
        return cls(title)

b1 = Book("Python Essentials")
b2 = Book.from_series("Python Handbook", 2)
print(b1.title)            # Python Essentials
print(b2.title)            # Python Handbook Vol. 2
print(Book.total_created)  # 2
Static Methods
A static method is simpler as it does not receive an implicit first argument (self or cls). It behaves like a normal function but resides inside the class’s namespace for organisational purposes. It is useful for operations that are conceptually related to the class but independent of its state.
class MathUtils:
    @staticmethod
    def area_of_circle(radius):
        return 3.14159 * (radius ** 2)

print(MathUtils.area_of_circle(3))  # 28.27431
Static methods do not require access to instance or class attributes, yet keeping them inside the class helps group related functionality together, keeping the design coherent.
- Use instance methods when the behaviour depends on instance data.
- Use class methods when the behaviour acts on the class as a whole or creates new instances.
- Use static methods when the method neither accesses nor modifies instance or class state.
These decorators promote cleaner design by clarifying a method’s intended scope and by keeping related functionality together in one logical place.
Introduction to Special (“Dunder”) Methods
Python objects can define special methods, also known as dunder methods because their names begin and end with double underscores (for example, __init__, __str__, __len__). These methods let your classes integrate smoothly with Python’s built-in operations and syntax. They allow user-defined objects to behave like native types.
Special methods are not called directly in normal code. Instead, Python automatically invokes them when you use certain operators or functions. For instance, when you write len(obj), Python calls obj.__len__() internally. Likewise, print(obj) triggers obj.__str__().
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Vector({self.x}, {self.y})"

    def __str__(self):
        return f"({self.x}, {self.y})"

    def __add__(self, other):
        return Vector(self.x + other.x, self.y + other.y)

    def __len__(self):
        return 2

v1 = Vector(3, 4)
v2 = Vector(1, 2)
print(v1)       # (3, 4)
print(v1 + v2)  # (4, 6)
print(len(v1))  # 2
Use __repr__ for debugging and developer-facing output, and __str__ for user-friendly display. If only __repr__ is defined, Python uses it as a fallback for str().
There are many categories of dunder methods, including:
- Object creation and representation: __new__, __init__, __repr__, __str__
- Comparison and hashing: __eq__, __lt__, __hash__
- Arithmetic and bitwise operators: __add__, __sub__, __mul__, __and__, etc.
- Container behaviour: __len__, __getitem__, __setitem__, __iter__
- Context management: __enter__ and __exit__ for use with the with statement
These hooks are what make Python’s object system so expressive. With them, you can define classes that behave just like built-in types: comparable, iterable, printable, and composable, without any special syntax beyond normal class definitions.
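As one illustration of these hooks, the sketch below defines __enter__ and __exit__ so a class plugs into the with statement (the Timer name is made up for this example):

```python
import time

class Timer:
    def __enter__(self):
        # Called when the with block starts
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called when the block ends, even if an exception occurred
        self.elapsed = time.perf_counter() - self.start
        return False  # do not suppress exceptions

with Timer() as t:
    sum(range(100_000))

print(f"Took {t.elapsed:.4f}s")
```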
Chapter 13: Iterators, Generators, and Comprehensions
In Python, iteration is more than a simple loop mechanism. It is a core language concept that powers how data is accessed, produced, and transformed. Understanding iterators, generators, and comprehensions reveals how Python achieves elegance and efficiency when working with collections or streams of data.
An iterator is any object that can be stepped through one element at a time. You encounter them constantly in Python whenever you loop over a list, tuple, or file. A generator takes this concept further, allowing you to define your own sequences that produce values lazily, only when requested. This makes them ideal for memory-efficient or infinite data flows.
Comprehensions provide a concise syntax for constructing lists, sets, and dictionaries directly from existing iterables. They combine iteration, filtering, and transformation into a single readable expression that often replaces multiple lines of code.
This chapter explores how the iteration protocol operates, how to build your own iterators and generators, and how comprehensions offer a compact yet powerful way to create and manipulate data structures.
The iteration protocol
Python separates the idea of something you can loop over from the thing that actually produces successive values. An iterable is any object that can return an iterator. An iterator is the stateful object that yields one value at a time.
The protocol is simple. An iterable implements __iter__() and returns an iterator. An iterator implements __next__() to produce the next value, and raises StopIteration when there are no more values. Iterators also implement __iter__() that returns self, which allows them to be used directly in loops.
# Built-in support
nums = [10, 20, 30] # a list is an iterable
it = iter(nums) # calls nums.__iter__()
next(it) # calls it.__next__() → 10
next(it) # → 20
next(it) # → 30
# next(it) would now raise StopIteration
The for loop drives this protocol automatically. It calls iter(obj) once to get an iterator, then repeatedly calls next() until StopIteration is raised. The exception is handled internally, so you do not see it in normal loops.
An iterator is consumed as it goes: each value is produced exactly once per next() call. Lists and tuples are iterables that produce a fresh iterator each time you call iter().
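The loop machinery described above can be unrolled by hand; this sketch is equivalent to `for item in nums: print(item)`:

```python
nums = [10, 20, 30]

# What a for loop does under the hood
it = iter(nums)  # one iterator for the whole loop
while True:
    try:
        item = next(it)        # advance the iterator
    except StopIteration:      # handled internally by for loops
        break
    print(item)                # prints 10, 20, 30
```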
You can implement a custom iterator by defining a separate iterable that creates iterator instances, or by making the object act as its own iterator. A two-class design is often clearer because it allows multiple independent passes over the same data.
# A minimal custom iterator pair
class Countdown:  # iterable
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIter(self.start)

class CountdownIter:  # iterator
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

for n in Countdown(3):
    print(n)  # 3 2 1
Making the object its own iterator is shorter, but it becomes one-shot. After a full pass, the internal state is exhausted and another loop will find nothing.
# Self-iterating, one-shot design
class Once:  # iterator and iterable in one
    def __init__(self, items):
        self.items = list(items)
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.items):
            raise StopIteration
        item = self.items[self.index]
        self.index += 1
        return item

it = Once(["a", "b"])
list(it)  # ['a', 'b']
list(it)  # [] (already exhausted)
The two-argument form iter(callable, sentinel) creates an iterator that repeatedly calls the function until it returns the sentinel value. This is useful for chunked reads and stream processing without manual loops.
# Read fixed-size chunks until empty string
with open("data.txt", "rt", encoding="utf-8") as f:
    for chunk in iter(lambda: f.read(1024), ""):
        process(chunk)
You will usually prefer generators for custom iteration because they express the same protocol with simpler code. The next section introduces generator functions and shows how yield implements the same __next__ logic without the bookkeeping.
Generator functions and yield
Generators are a high-level, elegant way to create iterators without having to write the full class boilerplate. A generator function looks like an ordinary function but uses the yield keyword instead of return to produce a series of values over time. Each call to yield suspends the function’s state, allowing it to resume exactly where it left off the next time next() is called.
def countdown(n):
    while n > 0:
        yield n
        n -= 1

for i in countdown(3):
    print(i)  # 3 2 1
When countdown(3) is called, it does not execute immediately. Instead, it returns a generator object that follows the iterator protocol, implementing both __iter__() and __next__() internally. Each iteration resumes the function’s frame until a return statement or the end of the function raises StopIteration.
You can use yield within loops, conditionals, or even nested functions. The function can maintain local variables between yields, which makes it powerful for streaming or pipeline-style programming.
def even_numbers(limit):
    for i in range(limit + 1):
        if i % 2 == 0:
            yield i

print(list(even_numbers(10)))  # [0, 2, 4, 6, 8, 10]
Generators can also receive input using send(), raise exceptions inside themselves with throw(), and be closed explicitly with close(). These advanced controls allow them to behave like lightweight coroutines, but in most use cases they are employed simply to yield sequences of data.
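A brief sketch of send() in action (the running_total generator here is illustrative, not a standard recipe): each value sent into the generator becomes the result of the paused yield expression.

```python
def running_total():
    total = 0
    while True:
        value = yield total   # receives whatever the caller sends
        if value is not None:
            total += value

gen = running_total()
print(next(gen))     # 0 - "prime" the generator to the first yield
print(gen.send(5))   # 5
print(gen.send(3))   # 8
gen.close()          # shuts the generator down cleanly
```

Note the priming call to next(): a generator must be advanced to its first yield before you can send it a value.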
Once a generator is exhausted, calling next() again will immediately raise StopIteration. To start over, you must create a new generator by calling the function again.
Just as list comprehensions build lists, generator expressions produce generator objects using a compact syntax in parentheses. They are often used for streaming transformations or to feed data directly into functions like sum() or any() without creating intermediate collections.
# Generator expression example
squares = (n * n for n in range(5))
print(sum(squares)) # 0 + 1 + 4 + 9 + 16 = 30
Generator expressions
Generator expressions provide a compact, inline way to create generators without defining a separate function. They look similar to list comprehensions, but they use parentheses instead of square brackets. The result is a lazy iterator that yields items one at a time instead of constructing the whole collection in memory.
# Generator expression example
numbers = (n * n for n in range(5))
for value in numbers:
    print(value)  # 0 1 4 9 16
This syntax is particularly useful when passing data directly into another function. Because it does not build an intermediate list, it can handle large or infinite data sources efficiently.
# Streamed computation with no intermediate list
total = sum(n * n for n in range(1_000_000))
print(total)
Generator expressions follow the same syntax and semantics as comprehensions, including support for filtering conditions and nested loops.
# With a filter condition
evens = (n for n in range(10) if n % 2 == 0)
print(list(evens)) # [0, 2, 4, 6, 8]
They are commonly used inside function calls or combined with built-ins like any(), all(), max(), and min(). When a generator expression is the only argument to a function, the parentheses can be omitted for readability.
# Parentheses not required here
result = sum(n for n in range(100) if n % 3 == 0)
print(result)
Generator expressions offer a middle ground between clarity and efficiency. They are simple enough to write inline, yet powerful enough to replace many explicit loops. In the next section, we turn to comprehensions, Python’s most concise syntax for building lists, sets, and dictionaries from iterables.
Comprehensions
Comprehensions are one of Python’s most expressive features. They let you build new collections from existing iterables using a single, readable line of code. The idea is simple: take an input sequence, apply an expression to each element, optionally filter it, and produce a new list, set, or dictionary as the result.
There are three main types of comprehensions:
- List comprehensions → produce lists
- Set comprehensions → produce sets
- Dictionary comprehensions → produce dictionaries
# List comprehension
squares = [n * n for n in range(5)]
print(squares) # [0, 1, 4, 9, 16]
# Set comprehension
unique_lengths = {len(word) for word in ["one", "three", "seven"]}
print(unique_lengths) # {3, 5}
# Dictionary comprehension
mapping = {x: x * x for x in range(3)}
print(mapping) # {0: 0, 1: 1, 2: 4}
Comprehensions are equivalent to writing for loops that build collections manually, but they are more concise and often faster because they run in optimized bytecode.
# Filter and transform in one expression
names = ["Alice", "Bob", "Clara", "David"]
short_lower = [n.lower() for n in names if len(n) <= 4]
print(short_lower) # ['bob']
Comprehensions can also include multiple for clauses, which create nested loops. This makes them powerful for flattening or combining data from multiple sources.
# Cartesian product using nested comprehension
pairs = [(x, y) for x in [1, 2] for y in [3, 4]]
print(pairs) # [(1, 3), (1, 4), (2, 3), (2, 4)]
Set and dictionary comprehensions follow the same pattern but use braces. In dictionary comprehensions, the expression before the for must produce key–value pairs separated by a colon.
# Set comprehension
letters = {ch for ch in "abracadabra" if ch.isalpha()}
print(letters) # {'a', 'b', 'c', 'd', 'r'}
# Dictionary comprehension
squares = {n: n * n for n in range(5)}
print(squares) # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Comprehensions combine clarity, speed, and power. They allow you to describe what you want to build rather than how to build it. Together with generators, they form a cornerstone of Python’s approach to expressive and efficient data processing.
Lazy evaluation and memory efficiency
One of Python’s design strengths lies in its ability to defer work until it is needed. This concept (known as lazy evaluation) is central to how iterators and generators operate. Instead of building large data structures all at once, Python can produce each value only when requested, saving both memory and processing time.
When you loop over a list, all its elements already exist in memory. When you loop over a generator, the values are computed on demand. This means you can work with sequences that are extremely large, or even infinite, without exhausting system resources.
# List: eager evaluation
nums_list = [n * n for n in range(1_000_000)]
# Generator: lazy evaluation
nums_gen = (n * n for n in range(1_000_000))
The list comprehension above allocates memory for one million squared integers. The generator expression, on the other hand, keeps only one value in memory at a time. The difference becomes significant in data pipelines or streaming tasks where the entire dataset cannot fit into memory.
Many built-in functions work seamlessly with lazy iterables. Functions like sum(), any(), all(), and max() consume iterators just as easily as lists. Likewise, the itertools module in the standard library provides a toolkit for chaining, filtering, and slicing lazy sequences without materializing them.
import itertools
# Infinite lazy sequence (never stored in memory)
counter = itertools.count(start=1)
# Take only the first 5 squares
first_five = (n * n for n in itertools.islice(counter, 5))
print(list(first_five)) # [1, 4, 9, 16, 25]
Lazy evaluation is not always the right choice. If you need to reuse data multiple times, random-access it, or serialize it, you must first convert it into a concrete structure such as a list or tuple. Otherwise, it will be exhausted after one pass.
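The one-pass behaviour is easy to demonstrate: a generator yields its values once, after which it is empty, whereas a materialized list can be read any number of times.

```python
gen = (n * n for n in range(3))
print(list(gen))  # [0, 1, 4]
print(list(gen))  # [] - the generator is already exhausted

# To reuse the values, materialize them into a concrete structure first
values = list(n * n for n in range(3))
print(values)     # [0, 1, 4]
print(values)     # [0, 1, 4] - still available
```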
Python’s iterator and generator model provides a balance between readability and efficiency. Whether you are streaming data from files, combining filters, or generating values in mathematical series, lazy evaluation allows you to scale smoothly without changing your code structure. It is a subtle but powerful idea: compute just enough, just in time.
Chapter 14: Functional Programming Elements
Functional programming is a style of coding that focuses on expressions, transformations, and data flow rather than changing state. In Python, you are not required to write in this style, but many of its built-in tools draw from functional programming principles. Understanding them can help you write cleaner, more predictable code.
Where object-oriented programming emphasizes objects and their behaviour, functional programming emphasizes functions as first-class citizens. This means they can be stored in variables, passed as arguments, returned from other functions, and composed to create new behaviours.
Python supports both approaches seamlessly. You can mix imperative, object-oriented, and functional techniques within the same program, choosing whichever is most natural for the task.
# Functions are first-class objects
def square(x):
    return x * x

nums = [1, 2, 3, 4]
result = map(square, nums)
print(list(result))  # [1, 4, 9, 16]
In this chapter, you will explore Python’s functional features in depth: built-in functions such as map(), filter(), and reduce(); the use of lambda expressions; higher-order functions; and key concepts like immutability and side-effect-free design. Together, these ideas form a toolkit for writing code that is elegant, modular, and expressive.
Map, filter, reduce, and list comprehensions in depth
Python offers several tools that embody functional programming concepts. The most common are map(), filter(), and reduce(). These functions operate on iterables, applying a transformation, selection, or accumulation step without modifying the original data. Each one returns a new iterable or result that reflects the applied operation.
map()
The map() function applies a given function to every item in an iterable and returns an iterator of the results. It is equivalent to a loop that calls the function repeatedly and collects the outputs.
# Using map() to transform a list
def square(x):
    return x * x

nums = [1, 2, 3, 4, 5]
result = map(square, nums)
print(list(result))  # [1, 4, 9, 16, 25]
You can also achieve the same result using a list comprehension, which is often preferred for its readability.
# Equivalent comprehension
nums = [1, 2, 3, 4, 5]
result = [x * x for x in nums]
map() is most useful when combining multiple iterables or when passing functions dynamically. Comprehensions are clearer when you are transforming a single sequence.
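For example, map() accepts several iterables and passes one element from each to the function on every step, stopping at the shortest input:

```python
a = [1, 2, 3]
b = [10, 20, 30, 40]  # the extra element is ignored

# The two-argument lambda receives one item from each iterable
sums = map(lambda x, y: x + y, a, b)
print(list(sums))  # [11, 22, 33]
```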
filter()
The filter() function selects elements from an iterable for which a predicate function returns True. The result is another iterator containing only the items that match the condition.
# Using filter() to keep even numbers
def is_even(x):
    return x % 2 == 0

nums = [1, 2, 3, 4, 5, 6]
evens = filter(is_even, nums)
print(list(evens))  # [2, 4, 6]
The same logic can be expressed more clearly using a comprehension with an if clause.
# Equivalent comprehension
evens = [x for x in nums if x % 2 == 0]
filter() is convenient when the predicate function already exists or when you need to chain multiple functional operations together.
reduce()
The reduce() function repeatedly applies a binary operation to the elements of a sequence, reducing it to a single cumulative result. It lives in the functools module and is a staple of functional programming.
from functools import reduce
nums = [1, 2, 3, 4]
total = reduce(lambda a, b: a + b, nums)
print(total) # 10
Each step combines two values into one, then feeds that result back into the next call. It is useful for aggregations such as summing, multiplying, or merging structures.
Although reduce() is powerful, it can be harder to read than explicit loops or comprehensions. Use it for well-known patterns like summation or concatenation, and prefer descriptive helper functions over anonymous lambdas when possible.
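reduce() also accepts an optional third argument: an initial value that seeds the accumulation and makes the call safe on an empty sequence. A short sketch:

```python
from functools import reduce

nums = [1, 2, 3]

# The third argument seeds the accumulation: 10 + 1 + 2 + 3
total = reduce(lambda a, b: a + b, nums, 10)
print(total)  # 16

# Without an initial value, reduce() raises TypeError on an empty
# sequence; with one, it simply returns the seed.
empty_total = reduce(lambda a, b: a + b, [], 0)
print(empty_total)  # 0
```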
Comparing the approaches
All three functions (map(), filter(), and reduce()) represent different aspects of data processing:
map() – transforms each element
filter() – selects elements based on a condition
reduce() – combines all elements into one
Together with comprehensions, they form a flexible toolkit for transforming and aggregating data in a clear, functional style. The choice between them depends on your priorities: clarity (use comprehensions) or composability (use the functional tools).
# Combining them functionally
from functools import reduce
nums = range(1, 11)
result = reduce(
    lambda a, b: a + b,
    filter(lambda x: x % 2 == 0, map(lambda n: n * n, nums))
)
print(result) # Sum of even squares: 220
# Equivalent comprehension
result = sum(n * n for n in range(1, 11) if n % 2 == 0)
print(result) # 220
Higher-order functions
A higher-order function is a function that takes another function as an argument, returns a function as a result, or does both. This idea lies at the heart of functional programming. By passing functions around like data, you can write code that is flexible, composable, and expressive.
# A simple higher-order function
def apply_twice(func, value):
    return func(func(value))

def increment(x):
    return x + 1
print(apply_twice(increment, 3)) # 5
Here, apply_twice() takes a function (func) and a value. It applies the function twice to the value and returns the result. This ability to treat functions as first-class objects lets you build behaviour dynamically.
Passing functions as arguments
Because functions are objects, you can pass them to other functions that expect callables. Many built-in tools rely on this pattern, including map(), filter(), and sorting functions such as sorted() with a key parameter.
# Using a function as a sorting key
def by_length(word):
    return len(word)
words = ["apple", "fig", "banana", "kiwi"]
print(sorted(words, key=by_length))
# ['fig', 'kiwi', 'apple', 'banana']
Anonymous functions (created with lambda) make this pattern even more concise, especially for short, single-use operations.
# Equivalent using lambda
words = ["apple", "fig", "banana", "kiwi"]
print(sorted(words, key=lambda w: len(w)))
Any built-in that accepts a key or predicate parameter is higher-order. This includes min(), max(), sorted(), map(), filter(), and others.
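For instance, min() and max() become flexible selection tools once you hand them a key function:

```python
words = ["apple", "fig", "banana", "kiwi"]

# The key function is applied to each item before comparison
print(min(words, key=len))  # fig
print(max(words, key=len))  # banana
```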
Returning functions
A higher-order function can also return another function. This pattern is often used to generate customised behaviour or to wrap existing functions with additional logic.
# Returning a function
def make_multiplier(factor):
    def multiply(x):
        return x * factor
    return multiply
double = make_multiplier(2)
triple = make_multiplier(3)
print(double(5)) # 10
print(triple(5)) # 15
The returned inner function retains access to the factor variable from its defining scope. This is an example of a closure, which is a key concept that allows functions to remember the environment in which they were created.
Composing functions
Function composition means combining smaller functions into more complex ones. This helps you express multi-step transformations declaratively.
# Function composition example
def compose(f, g):
    return lambda x: f(g(x))

def square(x):
    return x * x

def increment(x):
    return x + 1
combined = compose(square, increment)
print(combined(4)) # (4 + 1)² = 25
Higher-order functions let you build modular and expressive programs by defining operations that describe how to combine functions, not just how to combine data. This approach encourages cleaner abstraction layers and reusable building blocks for complex logic.
Decorators
A decorator is a higher-order function that wraps another function to modify or extend its behaviour, without permanently changing the original function’s code. Decorators are one of Python’s most distinctive and powerful features, allowing you to express reusable behaviour in a clean and readable way.
In essence, a decorator is a function that takes a function as input, returns a new function, and is applied using the @ syntax placed above a function definition.
# A simple decorator
def announce(func):
    def wrapper():
        print("About to call the function...")
        func()
        print("Function call complete.")
    return wrapper
@announce
def greet():
    print("Hello, world!")
greet()
# Output:
# About to call the function...
# Hello, world!
# Function call complete.
Here, @announce is shorthand for greet = announce(greet). When greet() is later called, the wrapper inside announce() executes instead, adding extra behaviour before and after the original call.
Decorators with arguments
Sometimes you want a decorator that accepts its own parameters. In this case, you add an extra level of function nesting. The outer function receives the decorator’s arguments and returns the actual decorator.
# Decorator with arguments
def repeat(times):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for _ in range(times):
                func(*args, **kwargs)
        return wrapper
    return decorator
@repeat(3)
def cheer():
    print("Python!")
cheer()
# Output:
# Python!
# Python!
# Python!
The call @repeat(3) constructs a decorator that repeats the wrapped function three times. This pattern makes decorators highly flexible and configurable.
Preserving metadata
Because a decorator replaces one function with another, the resulting wrapper loses the original function’s metadata such as its name and docstring. Python’s functools module provides a utility called wraps() to copy this information automatically.
from functools import wraps
def announce(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}()")
        return func(*args, **kwargs)
    return wrapper
@announce
def greet():
    """Say hello."""
    print("Hello!")
print(greet.__name__) # greet
print(greet.__doc__) # Say hello.
Always use @wraps when writing decorators. It preserves attributes such as the function name, documentation, and module reference, which are important for debugging and introspection.
Stacking decorators
You can apply multiple decorators to the same function by stacking them. The decorators are applied from the bottom up, meaning the one closest to the function runs last.
@announce
@repeat(2)
def greet():
    print("Hello!")
greet()
# Equivalent to: greet = announce(repeat(2)(greet))
Decorators can be composed freely, making them useful for layering common behaviours across a codebase. Many Python frameworks use decorators extensively for defining routes, permissions, or data bindings in a declarative style.
Closures and partial application
A closure is a function that remembers the environment in which it was created, even after that scope has finished executing. This allows the function to retain access to variables that were in effect when it was defined. Closures make it possible to build flexible, stateful functions without using classes.
# Example of a closure
def make_counter(start=0):
    count = start
    def increment():
        nonlocal count
        count += 1
        return count
    return increment
counter = make_counter()
print(counter()) # 1
print(counter()) # 2
print(counter()) # 3
Each call to make_counter() creates a new environment with its own count variable. The inner increment() function closes over this variable, allowing it to persist between calls. This makes closures a clean way to encapsulate small bits of state.
Partial application
Partial application means fixing some of a function’s parameters in advance, producing a new function with fewer arguments. Python’s functools module provides the partial() utility to support this directly.
from functools import partial
def power(base, exponent):
    return base ** exponent
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(5)) # 25
print(cube(2)) # 8
This technique is especially useful when working with functions that will be passed as callbacks or mapped over collections. It lets you pre-configure certain parameters to produce simpler, specialised functions.
# Using partial with map()
from functools import partial
def multiply(a, b):
    return a * b
double = partial(multiply, 2)
values = [1, 3, 5, 7]
print(list(map(double, values))) # [2, 6, 10, 14]
Functions produced by partial() have no __name__ or docstring of their own. When those matter for debugging or logging, use functools.update_wrapper() or define meaningful function names manually.
Closures and partial functions both exemplify how Python treats functions as data. They allow you to capture context, specialise behaviour, and build new functions dynamically, all key elements of a functional programming mindset.
Built-in functional tools
Python’s standard library includes several built-in functions that reflect functional programming principles. These functions operate cleanly on iterables, avoid side effects, and often work seamlessly with generators or comprehensions. They provide expressive ways to summarise, combine, or transform data without explicit loops.
any() and all()
The any() and all() functions evaluate truth values across an iterable. They short-circuit (stopping early as soon as the result is determined), making them both efficient and readable.
# Using any() and all()
flags = [True, False, True]
print(any(flags)) # True (at least one True)
print(all(flags)) # False (not all True)
These functions are particularly useful for validating data, checking conditions across lists, or combining logical results from generator expressions.
# Check if any name starts with 'A'
names = ["Alice", "Bob", "Clara"]
print(any(n.startswith("A") for n in names)) # True
# Check if all names have at least 3 letters
print(all(len(n) >= 3 for n in names)) # True
any() and all() accept any iterable, including generator expressions. This makes them ideal for streaming validation without building temporary lists.
sum()
The sum() function adds up the numeric values in an iterable. It can also take an optional start value, which allows it to concatenate lists or tuples when the start is an empty container.
# Summing numbers
nums = [1, 2, 3, 4]
print(sum(nums)) # 10
# Summing with a start value
print(sum(nums, 10)) # 20
# Summing lists
lists = [[1, 2], [3, 4], [5]]
print(sum(lists, [])) # [1, 2, 3, 4, 5]
To combine strings, use ''.join() instead of sum(). String concatenation with sum() is inefficient and not supported by design: passing strings to sum() raises a TypeError.
sorted()
The sorted() function returns a new sorted list from any iterable, leaving the original data unchanged. It can take optional parameters key and reverse to customise ordering.
# Basic sorting
nums = [3, 1, 4, 2]
print(sorted(nums)) # [1, 2, 3, 4]
# Reverse order
print(sorted(nums, reverse=True)) # [4, 3, 2, 1]
# Sorting by key
words = ["pear", "apple", "banana", "kiwi"]
print(sorted(words, key=len))
# ['pear', 'apple', 'kiwi', 'banana']
Because sorted() works with any iterable, you can sort results from generators, sets, or even dictionary items without first converting them.
# Sorting generator output
values = (n * n for n in range(5))
print(sorted(values)) # [0, 1, 4, 9, 16]
The key parameter is itself a higher-order concept: it accepts a function that transforms each item before comparison, allowing custom sorting logic.
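As a practical example of a key function at work, dictionary items can be sorted by value (the sample scores here are illustrative):

```python
scores = {"alice": 82, "bob": 91, "clara": 77}

# Each item is a (key, value) pair; sort on the value at index 1
by_score = sorted(scores.items(), key=lambda item: item[1])
print(by_score)  # [('clara', 77), ('alice', 82), ('bob', 91)]
```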
Immutability and expression-driven design
At the heart of functional programming lies the idea of immutability, the principle that data should not change once it has been created. Instead of modifying existing structures, you create new ones that reflect the desired changes. This leads to code that is easier to reason about, test, and debug, because functions produce results without hidden side effects.
In Python, many core data types are already immutable, such as strings, tuples, and frozensets. Mutable types like lists and dictionaries can still be used in a functional style by working on copies or using expressions that return new objects.
# Mutable update (imperative style)
data = [1, 2, 3]
data.append(4)
print(data) # [1, 2, 3, 4]
# Immutable transformation (functional style)
data = [1, 2, 3]
new_data = data + [4]
print(new_data) # [1, 2, 3, 4]
print(data) # [1, 2, 3]
This distinction between mutating and transforming is subtle but powerful. When functions avoid changing external state, their behaviour becomes predictable: given the same inputs, they always produce the same outputs.
Expression-driven design
Functional programming encourages an expression-driven style rather than a statement-driven one. In Python, expressions return values that can be combined and nested to form larger computations, whereas statements perform actions and usually return nothing.
# Statement-driven approach
total = 0
for n in range(10):
    total += n * n
# Expression-driven equivalent
total = sum(n * n for n in range(10))
Expression-driven design leads to more declarative code: you describe what should be computed rather than the step-by-step process. This often results in shorter, clearer, and more maintainable programs. However, expressions are not always superior. When logic becomes too complex or performance-sensitive, traditional statements and loops may be clearer. The goal is to find balance by using expressions where they enhance clarity, and statements where they aid understanding.
Python does not enforce immutability or a purely functional approach, but it allows you to write in that style when it suits the task. By favouring immutable data, pure functions, and expression-based logic, you can write programs that are more predictable, modular, and easy to test.
Reach for these functional tools (map(), filter(), reduce(), sorted(), and comprehensions) to express complex transformations succinctly and safely.
Chapter 15: Essential Standard Library Modules
Python arrives with a rich standard library. This built-in toolkit reduces the need for external dependencies, improves portability, and encourages consistent patterns across projects. You can reach for well tested modules that solve common problems, then focus on your application logic rather than reinventing utilities.
This chapter introduces modules that cover everyday tasks: text processing, dates and times, data structures, maths and statistics, file and path handling, simple networking, concurrency, and basic system integration. Each section explains what a module is for, shows the core ideas, and includes small examples that you can adapt.
The standard library grows with the language and evolves as community needs change. Some modules prioritise clarity and safety, while others expose lower level building blocks for performance or fine control. The aim here is to help you recognise when to reach for the library, how to read its documentation effectively, and how to use its pieces together in clean, idiomatic code.
What this chapter covers
You will see practical slices of the library across a few themes: structured data (collections, itertools, functools), text and parsing (re, json, csv), time and numbers (datetime, math, statistics, random), files and paths (pathlib, os), command line and logging (argparse, logging), networking basics (urllib, http), and simple concurrency (concurrent.futures, asyncio). The goal is not exhaustive reference, rather a guided tour with patterns you can use immediately.
Keep the Python docs nearby for deeper details. Learn the small number of idioms that appear again and again, then compose them with confidence.
math, random, and datetime
Python’s math, random, and datetime modules cover three essential domains: numerical computation, randomness, and working with dates and times. Together they provide the building blocks for simulations, analysis, and scheduling without the need for external libraries.
math
The math module supplies fast, precise mathematical functions that operate on real numbers. You’ll find constants like pi and e, along with functions for trigonometry, logarithms, rounding, and square roots.
import math
print(math.sqrt(16)) # 4.0
print(math.sin(math.pi/2)) # 1.0
print(math.log(100, 10)) # 2.0
These functions use double precision floating point arithmetic and are implemented in C, which makes them efficient and consistent across platforms.
For complex-number mathematics, use the cmath module. It mirrors math but supports imaginary values.
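A brief illustration of the difference:

```python
import cmath

# math.sqrt(-1) raises ValueError, but cmath returns a complex result
print(cmath.sqrt(-1))  # 1j

# Euler's identity: e^(i*pi) is -1, up to floating point error
print(cmath.exp(1j * cmath.pi))
```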
random
The random module provides pseudo-random number generation for general use. It can select random items, shuffle sequences, or generate floating point numbers in a range.
import random
print(random.random()) # random float 0.0 ≤ x < 1.0
print(random.randint(1, 6)) # integer between 1 and 6
print(random.choice(['red', 'green', 'blue']))
# pick an item
Behind the scenes it uses the Mersenne Twister algorithm, which is fast and statistically solid for non-cryptographic purposes.
For security-sensitive tasks such as generating tokens, passwords, or keys, use secrets rather than random.
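A minimal sketch of secrets in use:

```python
import secrets

# token_hex(16) draws 16 random bytes from the OS's secure source
# and returns them as 32 hexadecimal characters
token = secrets.token_hex(16)
print(token)

# Cryptographically secure selection from a sequence
pick = secrets.choice(["red", "green", "blue"])
print(pick)
```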
datetime
The datetime module brings structured time handling to Python. It defines objects that represent points in time (datetime), dates (date), times (time), and durations (timedelta).
from datetime import datetime, timedelta
now = datetime.now()
print(now)
tomorrow = now + timedelta(days=1)
print(tomorrow.strftime("%Y-%m-%d"))
Formatting and arithmetic are both straightforward. You can convert between timestamps, strings, and datetime objects, and perform date arithmetic cleanly.
Use datetime.fromisoformat() and .isoformat() for easy exchange of date data between systems.
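For example, ISO 8601 strings round-trip cleanly:

```python
from datetime import datetime

stamp = "2025-10-22T14:30:00"

# Parse an ISO 8601 string into a datetime object...
dt = datetime.fromisoformat(stamp)
print(dt.year, dt.month, dt.hour)  # 2025 10 14

# ...and serialise it back without loss
print(dt.isoformat())  # 2025-10-22T14:30:00
```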
itertools
The itertools module contains fast tools for building and combining iterators. These functions operate lazily, producing values on demand. They let you express loops and combinatorial logic cleanly and efficiently.
import itertools
nums = [1, 2, 3]
doubled = itertools.chain(nums, nums)
for n in doubled:
    print(n, end=" ")
# 1 2 3 1 2 3
Other useful functions include count(), cycle(), repeat(), combinations(), and permutations(). They are designed to reduce memory use and improve readability when processing sequences.
for combo in itertools.combinations('abc', 2):
    print(combo)
# ('a', 'b')
# ('a', 'c')
# ('b', 'c')
Some itertools functions produce infinite sequences; always pair them with islice() or a loop that stops. Think of itertools as "loop algebra": instead of writing nested for loops, compose iterators like chain(), zip_longest(), and product() for clear, declarative iteration.
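For instance, the infinite count() becomes safe once capped with islice():

```python
import itertools

# count(1) yields 1, 2, 3, ... forever; islice() stops after five items
squares = list(itertools.islice((n * n for n in itertools.count(1)), 5))
print(squares)  # [1, 4, 9, 16, 25]
```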
os, sys, and pathlib
Python provides several modules for interacting with the operating system and filesystem. The most fundamental are os, sys, and pathlib. Together they allow you to inspect your environment, manage files and directories, and work with system-level information in a platform-independent way.
os
The os module acts as a bridge between Python and the underlying operating system. It includes functions to list directories, create and remove files or folders, and access environment variables.
import os
print(os.name) # e.g. 'posix' or 'nt'
print(os.getcwd()) # current working directory
print(os.listdir('.')) # list files in current folder
os.mkdir('example')
os.rename('example', 'renamed_example')
Use os.environ to read and write environment variables, such as API keys or configuration settings, at runtime.
Most functions in os have cross-platform behaviour, although exact details may differ slightly between Windows, macOS, and Linux.
Avoid building paths by concatenating strings; use os.path.join() or, better, pathlib.Path objects for safety and readability.
sys
The sys module gives you access to the Python runtime itself. You can read command-line arguments, check interpreter settings, or exit cleanly from scripts.
import sys
print(sys.version)
print(sys.platform)
print(sys.argv) # list of command-line arguments
if len(sys.argv) < 2:
    sys.exit("Usage: script.py <filename>")
sys.path lists the directories Python searches for modules. Modifying it temporarily can help with testing or custom imports, but avoid hard-coding such paths in production code.
sys.stdin, sys.stdout, and sys.stderr can be redirected for advanced input/output control.
pathlib
pathlib modernises filesystem handling by replacing the older os.path interface with object-oriented path manipulation. It uses the Path class to represent files and directories in a uniform way across operating systems.
from pathlib import Path
p = Path('.')
for entry in p.iterdir():
    print(entry)
file = Path('data.txt')
print(file.exists())
print(file.resolve())
Path objects can be joined using the / operator, and include convenient methods for reading and writing text or binary files.
text = Path('example.txt')
text.write_text('Hello, world!')
print(text.read_text())
pathlib is ideal for modern codebases because it provides clean syntax, automatic path joining, and consistent behaviour across operating systems.
collections and dataclasses
Python’s collections and dataclasses modules extend the basic data structures of the language. They offer patterns that make code cleaner, faster, and more expressive, especially when dealing with structured or repetitive data.
collections
The collections module provides specialised container types that complement lists, tuples, sets, and dictionaries. They are designed for efficiency or readability in common use cases.
namedtuple() – immutable, lightweight class with named fields
deque – double-ended queue with fast appends and pops from both ends
Counter – counts elements in an iterable
defaultdict – dictionary with a default factory for missing keys
collections.ChainMap is useful for merging multiple dictionaries temporarily without copying them.
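A small sketch of ChainMap resolving lookups across two dictionaries (the settings shown are illustrative):

```python
from collections import ChainMap

defaults = {"theme": "light", "autosave": True}
overrides = {"theme": "dark"}

# Lookups search the maps left to right; neither dictionary is copied
settings = ChainMap(overrides, defaults)
print(settings["theme"])     # dark (found in overrides)
print(settings["autosave"])  # True (falls back to defaults)
```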
from collections import namedtuple, deque, Counter, defaultdict
# namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)
print(p.x, p.y)
# deque
dq = deque([1, 2, 3])
dq.appendleft(0)
print(dq)
# Counter
counts = Counter("banana")
print(counts)
# defaultdict
scores = defaultdict(int)
scores["alice"] += 10
print(scores["alice"])
These structures are optimised in C and often outperform pure-Python equivalents, especially in data-heavy scripts or when many insertions and lookups are involved.
Some collections types, such as Counter and defaultdict, return default values when keys are missing. This can mask logic errors if you expect strict key checking.
dataclasses
The dataclasses module (introduced in Python 3.7) automates the creation of classes that primarily store data. It removes boilerplate by generating __init__, __repr__, and comparison methods automatically.
from dataclasses import dataclass
@dataclass
class Book:
    title: str
    author: str
    pages: int
    price: float = 0.0
book = Book("Python 101", "Alice", 250)
print(book)
Fields can have default values, type hints, and post-initialisation hooks. Dataclasses are ideal for representing configuration items, structured results, or records that benefit from easy comparison and printing.
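The post-initialisation hook mentioned above is a method named __post_init__(), which runs after the generated __init__(). A sketch with a hypothetical derived field:

```python
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # derived, so excluded from __init__

    def __post_init__(self):
        # Runs automatically after the generated __init__,
        # ideal for computed or validated values
        self.area = self.width * self.height

r = Rectangle(3, 4)
print(r.area)  # 12
```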
Use frozen=True to make dataclasses immutable, similar to namedtuples but with full class syntax.
@dataclass(frozen=True)
class Point:
    x: int
    y: int
Dataclasses also integrate cleanly with type checking, pattern matching, and serialisation libraries such as json and asdict() from dataclasses itself.
from dataclasses import asdict
print(asdict(book))
To duplicate a dataclass instance itself, along with any nested mutable fields, use copy.deepcopy() when needed.
Short Recipes
The power of the standard library often appears in small, elegant combinations. These short recipes show how a few modules can work together to solve real tasks in just a few lines of code. Each example uses only what Python includes by default.
1. Counting words in a text
from collections import Counter
text = "the quick brown fox jumps over the lazy dog"
counts = Counter(text.split())
print(counts.most_common(3))
This counts words using Counter and lists the three most frequent ones. A compact and efficient pattern for analysis or quick summaries.
2. Generating random sample data
import random
from datetime import date, timedelta
names = ["Alice", "Bob", "Charlie"]
start = date(2024, 1, 1)
for name in random.sample(names, len(names)):
    joined = start + timedelta(days=random.randint(1, 365))
    print(name, "joined on", joined)
This creates randomised user data, combining random with datetime to generate varied output.
3. Walking a directory tree
from pathlib import Path
for path in Path('.').rglob('*.py'):
    print(path)
This recursively lists every Python file in the current directory and its subfolders, using pathlib for clear, platform-safe paths.
Replace '*.py' with any pattern such as '*.txt' or '*.json' for quick file filtering.
4. Combining iterators for structured output
import itertools
names = ['Alice', 'Bob', 'Charlie']
scores = [82, 91, 77]
for name, score in itertools.zip_longest(names, scores):
    print(f"{name}: {score}")
Iterator tools like zip_longest() let you combine uneven sequences safely, a common pattern in data handling or reporting scripts.
5. Simple configuration object
from dataclasses import dataclass
@dataclass
class Config:
    user: str
    theme: str = "light"
    autosave: bool = True
cfg = Config("robin")
print(cfg)
A dataclass provides a clean and readable structure for small configuration or settings containers.
6. Finding the largest files in a folder
from pathlib import Path
files = sorted(Path('.').glob('*.*'), key=lambda f: f.stat().st_size, reverse=True)
for f in files[:3]:
    print(f.name, f.stat().st_size, "bytes")
This uses pathlib and sorting with a key function to show the biggest files at a glance.
Chapter 16: Working with Data and Text
Modern programming often revolves around data: collecting it, transforming it, and sharing it in readable or structured forms. Whether that data comes from files, network responses, or user input, Python’s design makes it unusually well suited for this work. Its flexible syntax and built-in tools let you move easily between plain text and structured data such as JSON, CSV, or tabular formats.
This chapter explores how Python reads, interprets, and reshapes data in practical ways. It builds on the basics of file I/O introduced earlier and focuses on text handling, parsing, and transformation techniques. You will learn how to use built-in string methods, the re module for regular expressions, and simple iterator patterns to extract meaning and structure from raw input.
Data and text are deeply connected. Text often carries structured information (logs, reports, configuration files) that must be cleaned and parsed before it becomes usable. Python provides consistent patterns for doing so without heavy dependencies: small, readable functions that work well together. Once you grasp these, you can handle many real-world data formats with little more than the standard library.
We will also look at patterns for transforming data (filtering, mapping, grouping, and aggregating) using core features like comprehensions and itertools. The aim is to move from reading text to producing clean, structured results that other code or systems can consume.
Beyond Simple File I/O
By now you have seen how to open files, read or write their contents, and use context managers to ensure that resources are properly closed. Those patterns cover most everyday tasks, but real data rarely arrives as perfectly formatted text. It may contain inconsistent separators, nested structures, or a mixture of encodings. Moving beyond basic file handling means learning how to adapt to those differences and prepare data for further processing.
Python’s standard library makes this easier than it sounds. Modules like csv and json handle structured data, while the built-in string tools let you clean and normalise raw text. You can open files line by line to process large inputs efficiently, or use generators to stream data without loading everything into memory. The key is to combine readability with robustness, writing code that works even when the input isn’t perfect.
# Reading a large text file safely and efficiently
with open("data.txt", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        print(line)
This approach reads one line at a time and removes stray whitespace. It’s memory-efficient and resilient to irregularities. Most text files can be processed this way, whether they contain logs, lists, or tab-separated data.
Always specify encoding="utf-8" when opening text files. It ensures consistent behaviour across systems and avoids subtle errors when files include non-ASCII characters.
Structured formats like JSON or CSV add a small layer of complexity but are still handled with concise, declarative code. Here is a short refresher that also demonstrates how to work with nested or irregular data.
import json
from pathlib import Path
data = json.loads(Path("users.json").read_text(encoding="utf-8"))
for user in data:
    print(user["name"], "-", user.get("email", "no email"))
When working with CSV files, the csv module supports variable delimiters and flexible field quoting. It is common to encounter non-standard files, so adapting the reader is part of professional practice.
import csv
with open("data.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f, delimiter=";")
    for row in reader:
        print(row)
Avoid reading a whole file with read() unless the data is genuinely small. Iterating over lines or using generators scales much better and keeps programs responsive even with large datasets.
Regular Expressions with re
When simple string methods are not enough, regular expressions provide a powerful way to search, extract, and transform text based on patterns rather than fixed substrings. Python’s re module implements this capability with concise syntax and efficient matching routines. It allows you to describe text structures (words, numbers, dates, tags) and process them in a single pass.
A regular expression, or regex, is a mini-language for describing text patterns. Even a short expression can match a wide range of possibilities, making it ideal for cleaning data, validating input, or parsing semi-structured text such as logs and configuration files.
import re
text = "Email: alice@example.com, Phone: +44 1234 567890"
emails = re.findall(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", text)
print(emails)
Here, findall() returns every substring that fits the pattern, an email address in this case. Patterns are written as raw strings (prefixed with r) so that backslashes are interpreted literally rather than as escape sequences.
Basic operations
re.search(pattern, string) – find the first match anywhere in the text
re.match(pattern, string) – match only at the start of the text
re.findall(pattern, string) – return all matches as a list
re.sub(pattern, repl, string) – replace matches with new text
log = "2025-10-22 ERROR Disk full"
m = re.search(r"(\d{4}-\d{2}-\d{2})\s+(\w+)", log)
if m:
    date, level = m.groups()
    print("Date:", date, "Level:", level)
This pattern captures two groups: a date and a word. Parentheses define capture groups, which you can retrieve later with groups() or by index. Regular expressions are terse but extremely expressive once learned.
Cleaning and substitution
text = "Price: $1,299.00"
cleaned = re.sub(r"[^0-9.]", "", text)
print(cleaned) # 1299.00
Substitution with re.sub() is common in data cleaning: removing punctuation, trimming noise, or normalising formats. Each substitution can apply globally or selectively with flags like re.IGNORECASE or re.MULTILINE.
Precompiled patterns
For repeated operations, compile the pattern once for speed and clarity:
pattern = re.compile(r"\b\w{4}\b")
words = pattern.findall("This text has many four word units")
print(words)
Compiled expressions store settings and optimise matching internally. They also support methods like fullmatch() (entire string match) and split() (tokenise text based on pattern boundaries).
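A quick look at fullmatch() and pattern-based split():

```python
import re

digits = re.compile(r"\d+")

# fullmatch() succeeds only when the entire string fits the pattern
print(bool(digits.fullmatch("2025")))   # True
print(bool(digits.fullmatch("2025x")))  # False

# split() tokenises text on pattern boundaries
parts = re.split(r"[,;]\s*", "a, b;c")
print(parts)  # ['a', 'b', 'c']
```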
Use re.VERBOSE to write readable multi-line patterns with comments. It improves maintainability when patterns become complex.
pattern = re.compile(r"""
    ([A-Za-z]+)   # first word
    \s+           # one or more spaces
    (\d{4})       # a four-digit number
""", re.VERBOSE)
m = pattern.search("Invoice 2025")
print(m.groups())
Prefer plain string methods (split(), replace(), startswith()) when the structure is fixed or predictable.
Simple Parsing Patterns
Not all text needs the full power of regular expressions. Many files and logs follow simple, predictable formats that can be parsed with built-in string and iteration methods. Python’s clean text functions (split(), partition(), strip(), and join()) make it easy to extract and restructure data with minimal effort.
Parsing in this sense means turning raw strings into structured information: lists, dictionaries, or named tuples that your program can process further. The goal is clarity and reliability rather than complex pattern matching.
Splitting and partitioning
line = "id=42;name=Alice;score=91"
parts = line.split(";")
fields = dict(p.split("=") for p in parts)
print(fields)
This splits the line into key–value pairs and builds a dictionary in one step. It works because the separators are consistent and simple. For irregular formats, partition() offers a precise alternative that splits once and returns a tuple of three elements: before, separator, and after.
entry = "user: bob@example.com"
label, sep, value = entry.partition(":")
print(label.strip(), "→", value.strip())
Line-based parsing
When dealing with structured logs or configuration files, processing one line at a time keeps the logic simple and memory use low.
lines = [
    "2025-10-22 OK",
    "2025-10-22 ERROR Disk full",
    "2025-10-22 OK"
]
records = []
for line in lines:
    date, status, *message = line.split(maxsplit=2)
    records.append({
        "date": date,
        "status": status,
        "message": message[0] if message else ""
    })
print(records)
Using the star operator (*) collects any extra fields without causing unpacking errors. This approach is both readable and flexible for small parsing tasks.
Combining parsing with filtering
log = """
INFO: Starting process
WARNING: Disk space low
ERROR: Operation failed
INFO: Shutting down
""".strip().splitlines()
errors = [line for line in log if line.startswith("ERROR")]
print(errors)
Simple filtering like this often replaces complex regular expressions. When a format is regular, clarity should take precedence over compactness.
Hybrid approach: simple + regex
For semi-structured text, a mix of string methods and regular expressions often works best. Use string logic for overall structure and regex for fine-grained extraction within a field.
import re
log_entry = "User Alice (id=42) logged in at 10:45"
name_part, _, rest = log_entry.partition("(")
user = name_part.replace("User", "").strip()
user_id = re.search(r"id=(\d+)", rest).group(1)
print(user, user_id)
This pattern keeps the overall logic readable while using a small regex for precision. Together, these tools let you handle most text-based formats you will encounter in everyday scripts or data pipelines.
Data Transformation and Filtering
Once data is loaded or parsed, the next step is often to reshape, clean, or summarise it. In Python, these tasks are naturally expressed through comprehensions, higher-order functions like map() and filter(), and built-in tools from the itertools module. Each of these helps you express transformation logic clearly, without unnecessary loops or temporary variables.
Data transformation usually means taking one collection of values and producing another: filtering out unwanted entries, converting types, or computing new structures from existing ones. Python’s functional and declarative style makes these transformations concise and readable.
Filtering data
scores = [72, 85, 90, 47, 66, 100]
passed = [s for s in scores if s >= 70]
print(passed)
List comprehensions are the most direct way to filter and project data in a single expression. They are equivalent to loops but shorter and easier to reason about.
# The same using filter()
passed = list(filter(lambda s: s >= 70, scores))
The filter() function pairs naturally with lambda expressions or named functions when you want to reuse logic across transformations.
Mapping and conversion
names = ["alice", "bob", "charlie"]
uppercased = [n.title() for n in names]
print(uppercased)
The map() function expresses the same transformation functionally:
uppercased = list(map(str.title, names))
Mapping functions apply a transformation to every element of an iterable. Because they return iterators, they can chain neatly into generators and pipelines without holding everything in memory.
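Because each map() call returns a lazy iterator, stages can be stacked without building intermediate lists; a small sketch:

```python
names = ["  alice ", "BOB", " charlie"]
# Nothing executes until list() consumes the final iterator
stripped = map(str.strip, names)
titled = map(str.title, stripped)
result = list(titled)
print(result)  # ['Alice', 'Bob', 'Charlie']
```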
Combining map and filter
values = ["10", "x", "25", "7", "NaN", "40"]
def valid_int(x):
    return x.isdigit()
cleaned = [int(x) for x in values if valid_int(x)]
print(cleaned)
This pattern (filter first, then map) appears in many data-cleaning tasks. Filtering ensures you only transform valid inputs, avoiding runtime errors from invalid data.
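One caveat: isdigit() rejects negative numbers. A hedged alternative converts inside try/except, accepting anything int() can parse:

```python
def to_int(x):
    # int() handles signs and surrounding whitespace; junk raises ValueError
    try:
        return int(x)
    except ValueError:
        return None

values = ["10", "x", "-7", "NaN", "40"]
cleaned = [n for n in (to_int(v) for v in values) if n is not None]
print(cleaned)  # [10, -7, 40]
```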
Grouping and aggregating
Some transformations summarise or organise data rather than change individual items. The itertools.groupby() function groups adjacent items that share a key, making it ideal for simple aggregation.
import itertools
records = [
    ("A", 10), ("A", 20), ("B", 5), ("B", 15), ("B", 10)
]
for key, group in itertools.groupby(sorted(records), key=lambda r: r[0]):
    total = sum(value for _, value in group)
    print(key, total)
Grouping is especially useful after sorting data by a key. It allows you to compute totals, averages, or frequency tables with minimal code.
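When the data is unsorted, a plain dictionary accumulator gives the same totals without the sort; a sketch using collections.defaultdict:

```python
from collections import defaultdict

records = [("A", 10), ("B", 5), ("A", 20), ("B", 15)]
totals = defaultdict(int)
for key, value in records:
    # Missing keys start at 0, so no membership check is needed
    totals[key] += value
print(dict(totals))  # {'A': 30, 'B': 20}
```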
Chaining transformations
When multiple transformations need to occur in sequence, generator expressions and itertools utilities keep memory use low and pipelines expressive.
import itertools
nums = range(10)
pipeline = (n**2 for n in nums if n % 2 == 0)
print(list(itertools.islice(pipeline, 5)))
This pattern is both lazy and efficient: it computes only what is needed, when it is needed. This is how large datasets can be processed in streaming fashion without intermediate storage.
Sorting and restructuring
The sorted() function can be combined with key functions for flexible ordering and restructuring of data.
data = [
    {"name": "Alice", "score": 82},
    {"name": "Bob", "score": 91},
    {"name": "Charlie", "score": 77}
]
sorted_data = sorted(data, key=lambda d: d["score"], reverse=True)
for d in sorted_data:
    print(d["name"], d["score"])
Sorting by computed keys, reversing order, or chaining with transformations gives you complete control over how results are structured. These same patterns extend to tuples, dictionaries, and custom classes.
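Tuple keys extend the idea to multi-level ordering; a sketch sorting by score descending with names breaking ties:

```python
data = [("Alice", 82), ("Bob", 91), ("Ann", 82)]
# Negating the numeric key gives descending order; the name sorts ascending
ordered = sorted(data, key=lambda t: (-t[1], t[0]))
print(ordered)  # [('Bob', 91), ('Alice', 82), ('Ann', 82)]
```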
Applied “Recipes”
Data handling in Python often comes down to combining small, clear techniques into useful workflows. The following short “recipes” show how to apply the patterns from this chapter to solve practical problems, such as converting between formats, extracting information, and cleaning or summarising text data.
1. Converting JSON to CSV
import json, csv
# Read structured JSON data
with open("users.json", encoding="utf-8") as f:
    users = json.load(f)
# Write a CSV file with selected fields
with open("users.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "email"])
    for u in users:
        writer.writerow([u["name"], u.get("email", "")])
This combines json and csv to convert structured records into a portable tabular form. Each user dictionary becomes a CSV row, with safe defaults for missing keys.
For very large datasets stored with one record per line, call json.loads() on each record rather than loading the whole dataset at once.
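A minimal sketch of that streaming style, using io.StringIO to stand in for an open file of line-delimited JSON:

```python
import io
import json

# Each line is a complete JSON document ("JSON Lines" style)
jsonl = io.StringIO('{"name": "Alice"}\n{"name": "Bob"}\n')
# One json.loads() per line keeps only a single record in memory
names = [json.loads(line)["name"] for line in jsonl]
print(names)  # ['Alice', 'Bob']
```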
2. Extracting email addresses from text
import re
from pathlib import Path
text = Path("correspondence.txt").read_text(encoding="utf-8")
emails = sorted(set(re.findall(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", text)))
for e in emails:
    print(e)
This example reads a text file, finds all unique email addresses using a regular expression, and prints them sorted. Regular expressions remain one of the most concise ways to identify structured tokens inside free-form text.
3. Summarising numeric data
import csv
from statistics import mean
with open("scores.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    scores = [float(row["score"]) for row in reader]
print("Average score:", round(mean(scores), 2))
print("Highest score:", max(scores))
Here the csv module reads data into dictionaries, which are then transformed with comprehensions and summarised using statistics.mean(). This pattern scales to any numeric column with minimal code changes.
4. Cleaning inconsistent input
import re
lines = [
    "Alice : 82",
    "Bob, 91",
    "Charlie - 77",
]
records = []
for line in lines:
    name, score = re.split(r"[:,-]\s*", line)
    records.append((name.strip(), int(score)))
print(records)
When data is semi-structured but inconsistent, small regular expressions combined with splitting logic can restore uniformity. Once cleaned, it can easily be written to JSON or CSV for further use.
5. Counting word frequency
import re
from collections import Counter
from pathlib import Path
text = Path("notes.txt").read_text(encoding="utf-8").lower()
words = re.findall(r"[a-z']+", text)
counts = Counter(words)
for word, freq in counts.most_common(10):
    print(word, freq)
This reads a text file, tokenises it into lowercase words, and counts occurrences. The Counter class provides a direct and efficient way to summarise unstructured data.
6. Generating a simple report
import csv
with open("sales.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    totals = {}
    for row in reader:
        region = row["region"]
        amount = float(row["amount"])
        totals[region] = totals.get(region, 0) + amount

for region, total in sorted(totals.items()):
    print(f"{region:10} {total:,.2f}")
This aggregates data by region and formats the results neatly. It combines file reading, grouping, and formatted output — a compact example of everyday data transformation in Python.
Chapter 17: Practical Python Examples
By now, you have learned the core of the Python language, its flow control structures, data types, functions, and the way it organises code into modules and packages. You have also explored some of the standard library and learned how Python can read, process, and transform data. In this chapter, we move from the conceptual to the practical.
Python is at its most powerful when applied to real problems. Whether you are automating routine tasks, analysing text, or building simple command-line tools, Python’s clarity and flexibility make it an excellent choice for small scripts and prototypes that often grow into robust tools.
This chapter presents a few complete examples that show how the pieces you have learned fit together. These examples are designed to be instructive rather than exhaustive. They combine standard library features with good coding practices to demonstrate how short, readable programs can achieve useful results.
We will begin with automation scripts that handle files and logs, move on to a simple text analysis and reporting example, and finish with a small command-line utility built with argparse. Each section builds confidence in applying Python to practical, everyday programming tasks.
Script Automation Example
One of Python’s most common uses is automating repetitive or manual tasks. File renaming, directory cleanup, and log processing are all classic examples where a few lines of Python can save hours of tedious work.
In this section we will create a short script that scans a folder, renames files according to a consistent pattern, and writes a simple log of the changes it made. The same principles can then be adapted for many other automation tasks.
# file_renamer.py
import os
from datetime import datetime
# Folder to process
folder = "C:/example/files"
# Generate a timestamp for the log
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
logfile = f"rename_log_{timestamp}.txt"
with open(logfile, "w", encoding="utf-8") as log:
    for name in os.listdir(folder):
        if name.endswith(".txt"):
            new_name = name.lower().replace(" ", "_")
            old_path = os.path.join(folder, name)
            new_path = os.path.join(folder, new_name)
            os.rename(old_path, new_path)
            log.write(f"Renamed: {name} → {new_name}\n")
print("Renaming complete. Log written to", logfile)
This script performs a few important tasks:
- Iterates over all files in a given folder using os.listdir().
- Applies a renaming rule (in this case, converting names to lowercase and replacing spaces with underscores).
- Writes a log file containing a timestamp and a record of each rename.
Automation scripts like this illustrate how Python can interact directly with the filesystem, combining readability and power in a way that encourages experimentation and refinement. Once a script has been tested, it can be scheduled or integrated into larger workflows with minimal effort.
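As a hedged variant, the same renaming rule can be written with pathlib, which many newer scripts prefer; the folder path here is just a placeholder:

```python
from pathlib import Path

def normalise_name(name: str) -> str:
    # Same rule as the script: lowercase, spaces to underscores
    return name.lower().replace(" ", "_")

folder = Path("C:/example/files")  # placeholder; adjust to a real folder
if folder.exists():
    for path in folder.glob("*.txt"):
        path.rename(path.with_name(normalise_name(path.name)))
```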
Simple Text Analysis or Report Generator
Python’s strong support for string handling makes it ideal for analysing or summarising text data. Whether you are reviewing logs, generating reports, or scanning through documents, you can use a few built-in tools to extract meaning from plain text.
In this example, we will build a small program that reads a text file, counts the most common words, and produces a short summary report. This kind of script is useful for quick data exploration, keyword analysis, or preparing a summary of large text files.
# text_report.py
from collections import Counter
import re
# Read text from file
with open("sample.txt", "r", encoding="utf-8") as f:
    text = f.read().lower()
# Split into words using a simple regex
words = re.findall(r"\b[a-z']+\b", text)
# Count the most common words
counter = Counter(words)
total_words = sum(counter.values())
# Write a short report
with open("report.txt", "w", encoding="utf-8") as out:
    out.write("Word Frequency Report\n")
    out.write("=====================\n\n")
    out.write(f"Total words: {total_words}\n\n")
    out.write("Top 10 words:\n")
    for word, count in counter.most_common(10):
        out.write(f"{word:>10}: {count}\n")
print("Report saved as report.txt")
The program combines a few powerful tools:
- re.findall() to extract words from text using a regular expression.
- Counter from collections to count frequencies quickly and clearly.
- File I/O to produce a readable text report that can be viewed or shared.
This kind of lightweight analysis is often enough to gain insight from unstructured data without relying on external tools. It also highlights Python’s advantage in turning simple ideas into practical, repeatable scripts.
Small Command-Line Utility (using argparse)
Many Python scripts become more useful when they can be run from the command line with arguments. Rather than editing code each time, you can supply options or file names when executing the script. The argparse module from the standard library makes this simple and consistent.
In this example, we will build a small utility that reads a text file and counts lines, words, and characters, similar to the Unix wc (word count) command. It will take an input file name as a required argument, and an optional flag to display detailed output.
# count_tool.py
import argparse
def count_text(path, verbose=False):
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()
    lines = text.splitlines()
    words = text.split()
    chars = len(text)
    if verbose:
        print(f"File: {path}")
        print(f"Lines: {len(lines)}")
        print(f"Words: {len(words)}")
        print(f"Characters: {chars}")
    return len(lines), len(words), chars

def main():
    parser = argparse.ArgumentParser(
        description="Count lines, words, and characters in a text file."
    )
    parser.add_argument("file", help="Path to the text file")
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Show detailed output"
    )
    args = parser.parse_args()
    lines, words, chars = count_text(args.file, args.verbose)
    if not args.verbose:
        print(lines, words, chars)

if __name__ == "__main__":
    main()
This small tool can be used directly from the terminal. For example:
python count_tool.py sample.txt
or, for more detail:
python count_tool.py sample.txt --verbose
On Unix-like systems, add a #!/usr/bin/env python3 line at the top and make the file executable. This allows users to run the script directly.
The argparse module provides automatic help messages, type checking, and sensible error handling. It also scales well — you can add subcommands, default values, or configuration files later without changing how users run your script.
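As a sketch of that growth path, subcommands are added with add_subparsers(); the command names here are invented for illustration:

```python
import argparse

parser = argparse.ArgumentParser(prog="texttool")
sub = parser.add_subparsers(dest="command", required=True)

count = sub.add_parser("count", help="Count words in a file")
count.add_argument("file")

top = sub.add_parser("top", help="Show the most common words")
top.add_argument("file")
top.add_argument("-n", type=int, default=10)

# Parse an explicit argument list instead of sys.argv for demonstration
args = parser.parse_args(["top", "notes.txt", "-n", "5"])
print(args.command, args.file, args.n)  # top notes.txt 5
```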
With this final example, you have seen how Python can automate tasks, analyse data, and provide user-friendly command-line interfaces. Each of these skills builds toward writing maintainable, useful programs that fit naturally into real-world work.
Chapter 18: Next Steps and Resources
Reaching this point means you have covered the core foundations of Python, including the language, its syntax, and the way it is used in practical situations. You now have the tools to write clean, useful, and maintainable scripts. This final chapter looks ahead, pointing to the next stages of learning and the wider Python ecosystem.
Python grows with you. Once you are comfortable writing programs, the next step is learning how to test, structure, and share them. You will also encounter a rich world of tools that help you write cleaner code and automate repetitive development tasks.
In this chapter, we will explore the essentials of testing, introduce formatting and linting tools that keep code readable, and outline the basics of packaging your own projects for reuse or sharing. Finally, we will map out a roadmap toward more advanced Python concepts to guide your continued development as a programmer.
Testing Preview
As your programs grow, verifying that they behave correctly becomes essential. Testing helps prevent regressions, ensures reliability, and gives you confidence when refactoring or adding new features. Python includes a built-in testing framework, and there are also community tools that make testing easier and more expressive.
The standard library’s unittest module provides a structure for writing and running tests. It follows a pattern similar to testing frameworks found in many other languages, where each test is a method inside a class derived from unittest.TestCase.
# test_math_tools.py
import unittest
from math import sqrt
def square(x):
    return x * x

class TestMathTools(unittest.TestCase):
    def test_square(self):
        self.assertEqual(square(3), 9)
        self.assertEqual(square(-4), 16)

    def test_sqrt(self):
        self.assertAlmostEqual(sqrt(9), 3.0)

if __name__ == "__main__":
    unittest.main()
Running this file will automatically discover and execute all methods whose names begin with test_. The results show which tests passed or failed, making it easy to detect errors early.
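unittest can also assert that errors occur; a small sketch using assertRaises, run programmatically here so the result can be inspected:

```python
import unittest

def divide(a, b):
    return a / b

class TestDivide(unittest.TestCase):
    def test_divide_by_zero(self):
        # Passes only if ZeroDivisionError is raised inside the block
        with self.assertRaises(ZeroDivisionError):
            divide(1, 0)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDivide)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```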
Many developers prefer to use pytest, a third-party testing framework that simplifies test discovery and reduces boilerplate. It allows tests to be written as plain functions and provides clearer output and flexible fixtures.
# test_example.py
from math import sqrt

def square(x):
    return x * x

def test_square():
    assert square(3) == 9

def test_sqrt():
    assert sqrt(16) == 4
Once pytest is installed, you can run all tests in a project with a single command:
pytest
Testing is a key part of writing dependable Python programs. Even small projects benefit from a few quick checks to confirm that critical functions behave as expected.
Style, Linting, and Formatting Tools
Readable code is maintainable code. While Python’s design encourages clarity, it is still possible for style differences or small mistakes to make code harder to work with. Style guides, linters, and formatters help maintain a consistent standard across projects and teams.
The official guide to Python code style is called PEP 8. It defines conventions such as indentation, naming, and spacing that make Python code look uniform and easy to read. Most editors and tools can check code automatically for PEP 8 compliance.
A linter analyses code for stylistic and logical issues. One of the most widely used linters is flake8, which reports unused variables, bad indentation, and many other small but important issues.
# Run a style check on all Python files in a folder
flake8 .
For automatic code formatting, black has become a popular choice. It reformats code into a consistent style instantly, reducing the need for manual adjustments or debates about spacing and layout.
# Automatically reformat code
black .
Other tools, such as isort, organise import statements, and pylint provides more detailed analysis and configurable style checks. Many developers combine several tools for best results, often running them automatically in their text editor or as part of a version control workflow.
Adopting style and linting tools early is a simple way to make your codebase cleaner, more professional, and easier to maintain in the long run.
Packaging and Publishing Basics
Once your Python code becomes reusable or you want to share it with others, it is time to package it properly. A package lets you distribute your project so that others can install it easily using standard tools such as pip. The most common way to create and publish packages is with setuptools.
A minimal package has a directory structure like this:
mytool/
├── mytool/
│ ├── __init__.py
│ └── core.py
├── README.md
├── pyproject.toml
└── setup.cfg
The pyproject.toml file defines your build system and metadata, while setup.cfg holds configuration details for setuptools.
# pyproject.toml
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
# setup.cfg
[metadata]
name = mytool
version = 0.1.0
author = Robin Example
description = A simple example Python package
license = MIT
[options]
packages = find:
python_requires = >=3.8
With this setup, you can build and test your package locally:
python -m build
This creates distribution files in a dist/ directory. You can install them directly for testing:
pip install dist/mytool-0.1.0-py3-none-any.whl
To share your package publicly, upload it to the Python Package Index (PyPI) using twine:
twine upload dist/*
A clear README.md helps users understand what your package does and how to install it.
Packaging is an important step toward professionalism in Python development. Even for personal projects, packaging your code helps with versioning, reproducibility, and reuse. Once a project is packaged, it can be installed in virtual environments or shared with collaborators just like any other Python library.
Tools such as poetry and hatch provide their own build backends and simplify dependency management and publishing workflows. They are worth exploring once you begin managing multiple projects.
Roadmap to Advanced Python Topics
With the foundations now complete, you are ready to explore more advanced areas of Python. The language supports a wide range of domains, from web development and data science to automation, artificial intelligence, and system programming. The best way to progress is to build real projects while gradually deepening your understanding of the language’s internals and libraries.
Here are some directions to consider as your skills grow:
- Testing and quality assurance: Learn about test coverage, mocking, and continuous integration workflows that automatically run your tests when code changes.
- Advanced object-oriented features: Explore mixins, abstract base classes, and descriptors to gain more control over how classes behave.
- Functional and asynchronous programming: Study iterators, generators, and coroutines, and learn how asyncio handles concurrent operations in modern Python.
- Type checking and static analysis: Tools such as mypy and pyright help catch type errors before code runs, improving reliability in large projects.
- Data handling and analysis: Libraries like pandas and numpy power modern data workflows. They are essential for scientific computing and data science.
- Web and API development: Frameworks such as Flask and FastAPI make it easy to create web applications and services using Python.
- Packaging and environments: Learn about virtual environments, dependency pinning with pip-tools, and reproducible builds using tools like poetry.
- Performance and optimisation: Profiling, caching, and parallel processing help make Python code faster and more efficient.
Beyond these topics lie specialised fields: machine learning with scikit-learn or tensorflow, web scraping with requests and beautifulsoup, and automation using asyncio or selenium. Each of these areas builds naturally on what you now know.
Continue experimenting, refining your skills, and sharing your work. Python rewards curiosity and clarity. The best way to master it is to keep building — each new project will teach you something fresh about the language and about programming itself.
With that, your journey through This is Python reaches its first milestone, but Python itself continues to grow with every release, every project, and every idea you bring to it.
© 2025 Robin Nixon. All rights reserved
No content may be re-used, sold, given away, or used for training AI without express permission
Questions? Feedback? Get in touch