Chapter 1: Introduction to C
C is a small, efficient, and portable systems language. It grew alongside UNIX and went on to shape how we think about programs, memory, and operating systems. This chapter sets the scene, explains why C still matters, and gets you to your first working program.
The origins of C
C’s roots trace to Cambridge in the 1960s with BCPL (Basic Combined Programming Language). BCPL influenced B at Bell Labs. Dennis Ritchie then evolved B into C while he and colleagues developed UNIX. The language and the operating system informed each other; C provided performance and low-level control; UNIX provided a portable systems context that rewarded a simple, consistent language.
BCPL, B, and the C lineage
BCPL used one word-sized type and emphasized portability via an intermediate representation. B carried forward that simplicity. C introduced a richer type system, pointer arithmetic, and a standard library. Together with UNIX, this created a portable toolchain that spread to diverse hardware.
A language designed for systems
C was designed to write operating systems and tools. That goal shaped the language: direct memory access with pointers; well-defined operators; a minimal runtime; and compilation to fast native code. The result is a language close to the machine yet high-level enough for portable software.
Where an example shows {…} or comments such as /* … */, read them as “content omitted here” (not as syntax you need to type verbatim).
Why C endures
C remains relevant because it balances small surface area with expressiveness. The core language is compact; the standard library is focused; compilers are mature and fast. That combination makes C suitable for kernels, drivers, embedded systems, high-performance libraries, and language runtimes.
A compact core you can master
You can learn the syntax and main patterns of C quickly. Mastery then comes from understanding memory, data layout, and the build process. This guide follows that path: language first; memory and pointers next; then files, networking, and cross-platform builds.
Portability by design
Well-written C compiles on Linux, macOS, and Windows with only minor conditional code. The standard library covers essential tasks (I/O, memory, strings, time, math) while leaving platform specifics to opt-in headers.
The structure of a C program
Every C program consists of declarations and definitions, one of which must be a function named main. You include headers with #include to use the standard library. You compile translation units (.c files) and link them into an executable.
A minimal skeleton
Here is a minimal skeleton for a basic C program:
/* file: hello.c */
#include <stdio.h>
int main(void)
{
printf("Hello, world!\n");
return 0;
}
Header directives begin with # and are handled by the preprocessor. The function main returns an integer to the host environment (zero on success). Braces mark blocks. Statements end with semicolons.
Declarations, definitions, and headers
A declaration tells the compiler about a name and its type. A definition allocates storage or provides function code. You put reusable declarations in .h headers and implementations in .c files. For example: int add(int a, int b); declares a function; its definition provides the body with {…}.
Keep implementation details in the .c file; export only what others need in the paired .h. This structure scales from single-file examples to real projects.
Compilers and toolchains
The compiler turns source into machine code; the linker combines compiled objects into an executable; tools like archivers and debuggers complete the toolchain. On Unix-like systems you will most often use gcc or clang. On Windows you can use cl from MSVC; you can also use GCC or Clang via MinGW or WSL.
Common toolchain roles
Here is a list of common toolchain roles:
| Stage | Typical tool | What it does |
| Preprocess | cpp (built into gcc/clang) | Expands #include and macros |
| Compile | gcc, clang, cl | Turns C into machine code objects |
| Link | ld, link (via compiler driver) | Combines objects and libraries to create an executable |
| Archive | ar | Bundles objects into static libraries |
| Debug | gdb, lldb | Runs and inspects your program with symbols |
Compile commands
Here are compile commands for different systems:
# GCC (Linux, macOS, MinGW)
gcc -std=c23 -Wall -Wextra -O2 hello.c -o hello
# Clang (Linux, macOS)
clang -std=c23 -Wall -Wextra -O2 hello.c -o hello
# MSVC (Developer Command Prompt on Windows)
cl /std:c17 /W4 /O2 hello.c
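Use warnings aggressively (-Wall -Wextra or /W4). Specify a standard level so builds are predictable: -std=c23 for GCC and Clang (older releases spell it -std=c2x); MSVC does not offer /std:c23, so use /std:c17 or /std:clatest there.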
A first “Hello, world” program
Let’s compile and run the minimal program from earlier. Save it as hello.c, then run one of the commands below for your platform.
Using Linux and macOS with GCC or Clang
Use this with Linux and macOS:
gcc -std=c23 -Wall -Wextra hello.c -o hello
./hello
If you prefer Clang, replace gcc with clang. The output should be a single line saying Hello, world!.
Using Windows with MSVC or MinGW
On Windows use this:
:: MSVC (in a Developer Command Prompt)
cl /std:c17 /W4 hello.c
hello.exe
:: MinGW (GCC on Windows)
gcc -std=c23 -Wall -Wextra hello.c -o hello.exe
hello.exe
If the compiler cannot find headers such as <stdio.h>, check your installation path or use a shell environment that provides the toolchain.
Keep a scratch directory of small test programs (for example hello.c, io.c, math.c). You can verify your toolchain and experiment quickly without affecting project code.
Chapter 2: Setting Up Your Environment
This chapter gets a working C toolchain on your machine, shows how to compile from the command line, introduces small Makefile builds, and points to editor and IDE choices that fit a portable workflow.
Installing GCC on Linux, macOS, and Windows
GCC is available on all major platforms. Install it with the native package manager on Linux, with Apple’s command line tools or Homebrew on macOS, and with MSYS2 or MinGW on Windows. After installation, verify with gcc --version.
Installing under Linux
Here is how to install under some top flavors of Linux:
| Distro | Install |
| Debian, Ubuntu, Mint | sudo apt update && sudo apt install build-essential |
| Fedora | sudo dnf groupinstall "Development Tools" |
| RHEL, AlmaLinux, Rocky | sudo dnf groupinstall "Development Tools" |
| Arch, Manjaro | sudo pacman -S base-devel |
| openSUSE | sudo zypper install -t pattern devel_C_C++ |
Install the usual companion tools alongside gcc (for example make, gdb). This saves time later.
Installing under macOS
On macOS, install like this:
# Apple Command Line Tools (provides Clang; often enough)
xcode-select --install
# Optional: Homebrew GCC alongside Clang
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install gcc
After installing, run gcc --version. On macOS the default gcc is usually Clang’s driver for compatibility. Installing Homebrew gcc gives you a GNU build as well (named gcc-… such as gcc-14).
Installing under Windows
Following are the various Windows installation options:
| Option | Purpose | Notes |
| MSYS2 (MinGW-w64) | GCC toolchains for native Windows | pacman -S --needed base-devel mingw-w64-ucrt-x86_64-gcc; use the MSYS2 UCRT64 shell |
| MinGW-w64 standalone | Lightweight GCC for Windows | Add bin to PATH; verify with gcc --version |
| WSL (Ubuntu on Windows) | Linux userland on Windows | Install Ubuntu from Store; then sudo apt install build-essential |
If you use MSVC, you compile with cl rather than gcc. This guide shows GCC/Clang commands first, with MSVC equivalents when needed.
Using the command line to compile and run programs
The compiler driver handles preprocessing, compiling, and linking. Start with a single-file program, then add warnings and standard selection. Keep commands simple and repeatable.
A single-file build
A single-file build looks like this:
# GCC
gcc -std=c23 -Wall -Wextra -O2 hello.c -o hello
# Clang
clang -std=c23 -Wall -Wextra -O2 hello.c -o hello
# Run (Unix-like)
./hello
# Run (Windows)
hello.exe
Use -std=c23 for a modern baseline. Enable warnings with -Wall -Wextra. Add -g for debug symbols when needed.
Separating compiling and linking
This is how you separate compiling and linking:
# Compile to objects
gcc -std=c23 -Wall -Wextra -O2 -c util.c -o util.o
gcc -std=c23 -Wall -Wextra -O2 -c main.c -o main.o
# Link objects into an executable
gcc util.o main.o -o app
Compiling to objects avoids rebuilding every file. The linker step produces the final program.
Add -pedantic when you want extra standard conformance checks. Add platform libraries at link time with -l<name> (for example -lm for math).
Using Makefiles for simple builds
make automates rebuilds based on file timestamps. A small Makefile turns multi-file commands into named targets. Variables keep flags in one place.
A minimal portable Makefile
An example Makefile might look like this:
# file: Makefile
CC := gcc
CFLAGS := -std=c23 -Wall -Wextra -O2
LDFLAGS :=
TARGET := app
SRCS := main.c util.c
OBJS := $(SRCS:.c=.o)
$(TARGET): $(OBJS)
	$(CC) $(OBJS) $(LDFLAGS) -o $(TARGET)
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
.PHONY: run clean
run: $(TARGET)
	./$(TARGET)
clean:
	$(RM) $(OBJS) $(TARGET)
Tabs are required before recipe lines. The pattern rule builds any .o from a matching .c. The run target executes the program on Unix-like systems.
Using Windows
With MSYS2 or Git Bash, the above Makefile works as written. On native Windows shells you may use mingw32-make instead of make, and replace the run recipe with a line that invokes $(TARGET).exe.
For larger cross-platform projects, consider CMake. You keep a single project description and produce platform-native build systems.
Working with IDEs and editors
Choose an editor that respects your command-line workflow. You can keep gcc, clang, and make at the core, while the editor provides IntelliSense, code navigation, and debugging.
Using VS Code
Install the C/C++ extension. Use a simple tasks.json to call your Makefile or compiler directly. For cross-platform projects, pair VS Code with CMake Tools to configure and build out of source.
Using CLion
CLion uses CMake as the project model. You write a CMakeLists.txt, then build and debug with the integrated toolchains. CLion can target WSL, MinGW, or remote toolchains.
Other solid choices
A list of other popular options:
| Editor | Notes |
| Visual Studio (MSVC) | Great Windows debugger; projects use cl and MSBuild |
| Code::Blocks | Lightweight IDE; supports GCC and Clang |
| Vim, Neovim, Emacs | Pair with clangd or ccls for completion; keep builds in make or cmake |
Chapter 3: The C Language Basics
This chapter introduces the essential syntax of C: what makes up a program, how functions are defined and called, how to use comments and names, how input and output work, and how expressions form the flow of computation. From here onward, you will start writing code that actually does things.
The structure of a C program
A C program is a set of declarations and function definitions, with one function named main acting as the entry point. Before functions, you may include headers and declare global variables. The general form looks like this:
#include <stdio.h>
int main(void)
{
printf("Program skeleton\n");
return 0;
}
Programs are usually composed of several translation units (.c files) compiled separately and linked together. Each file can include one or more header files (.h) that declare shared interfaces.
Typical source layout
Here's how you might typically lay out C source code:
| File | Purpose |
| main.c | Contains main() and program logic |
| util.c | Contains helper functions |
| util.h | Declares helper function prototypes |
| Makefile | Describes how to build the program |
Functions and the main() entry point
Functions are the building blocks of a C program. Each function has a return type, a name, and (optionally) parameters. The function main() is special because execution begins there. It can take zero or two arguments and must return an int value.
Defining and calling functions
Functions are defined and called like this:
#include <stdio.h>
int add(int a, int b)
{
return a + b;
}
int main(void)
{
int sum = add(3, 4);
printf("Sum is %d\n", sum);
return 0;
}
The function add() receives two integers, computes a result, and returns it. main() then prints the result using printf(). Every function definition must specify a return type. If there is no meaningful value to return, use void.
Return codes and conventions
By convention, returning 0 from main() indicates success. Any nonzero value signals an error. This pattern lets shell scripts and other programs test whether your program ran successfully.
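If control reaches the end of a non-void function other than main without a return statement and the caller uses the result, the behavior is undefined. Always return explicitly.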
Comments, identifiers, and keywords
Comments document your code for readers (including future you). Identifiers name variables, functions, and other symbols. Keywords are reserved by the language and cannot be used as identifiers.
Comment styles
You have a choice of commenting styles:
/* This is a block comment
that can span multiple lines */
int main(void)
{
// This is a single-line comment
printf("Comments ignored by compiler\n");
return 0;
}
Use comments sparingly but clearly. They should explain *why* code exists, not restate what it does.
Valid identifiers
An identifier can contain letters, digits, and underscores, but it must not start with a digit. Identifiers are case-sensitive. Examples: count, file_index, MAX_SIZE.
Common keywords
Here are some of the more common C keywords:
| Category | Examples |
| Types | int, char, float, double, void |
| Control | if, else, for, while, switch, break, continue |
| Storage classes and qualifiers | static, extern, register, auto, const, volatile |
| Other | return, sizeof, typedef, struct, enum, union |
integer instead of int) to reduce confusion and improve readability.
Basic I/O and formatted output
C’s standard input and output library, <stdio.h>, provides printf() for writing text and scanf() for reading it. Both use format strings containing placeholders that match variable types.
Printing values
This is how you print values to the screen:
#include <stdio.h>
int main(void)
{
int age = 30;
double pi = 3.14159;
printf("Age: %d\n", age);
printf("Pi: %.2f\n", pi);
return 0;
}
Each % marker is replaced by the corresponding argument. The \n creates a newline. Format specifiers control how values are shown: %d for integers, %f for floating-point numbers, %s for strings, and so on.
Reading input
You can read input like this:
#include <stdio.h>
int main(void)
{
    int number;
    printf("Enter a number: ");
    if (scanf("%d", &number) != 1) {   // scanf returns the count of items assigned
        puts("Invalid input");
        return 1;
    }
    printf("You entered %d\n", number);
    return 0;
}
scanf() requires the address of the variable (hence the & operator). It returns the number of items it successfully assigned; if the input does not match the format, the variable is left unchanged, so check the return value before using the result.
scanf() can be unsafe if not used cautiously. For robust programs, prefer line-based input (fgets() and sscanf()) when processing strings.
Common format specifiers
Here are some of the more common format specifiers:
| Specifier | Meaning |
| %d | int |
| %ld | long |
| %f | double in printf() (float arguments are promoted); float in scanf(), where double needs %lf |
| %.2f | floating-point with 2 decimal places |
| %s | null-terminated string |
| %c | single character |
End output with \n or call fflush(stdout) to flush printf() output when printing to a terminal; many systems buffer text until a newline is written.
Statements, expressions, and blocks
Statements are the executable steps of a program. Expressions compute values. A block (a group of statements in braces) counts as a single compound statement. Understanding this hierarchy is essential to writing readable and structured C code.
Simple and compound statements
Examples of simple and compound statements:
int x = 10; // simple statement
x = x + 5; // expression statement
if (x > 12)
{
printf("x is greater than 12\n"); // part of a compound block
}
Braces define a block scope. Variables declared inside a block are visible only within it. Indentation and consistent style make nested structures easier to follow.
Expressions and side effects
An expression has a value. An expression statement uses that value or triggers a side effect. For example, x++ increments x as a side effect. Control statements such as if or while depend on expressions that evaluate to true (nonzero) or false (zero).
Because assignment is itself an expression (as in if (x = 5)), it’s easy to introduce logic bugs. Use parentheses and clear comparisons (==) to avoid mistakes.
Program flow at a glance
Each statement executes in sequence unless control flow alters it. You’ll explore conditionals and loops in the next chapter, but the foundation (expressions forming statements within blocks) is already in place here.
Chapter 4: Data Types and Variables
C programs work with values of specific types. Each type controls how many bytes a value uses, how it is represented in memory, the operations that make sense, and how input and output behave. This chapter introduces the fundamental arithmetic types, the common modifiers and qualifiers, rules for scope and lifetime, and how conversions and promotions occur during expression evaluation.
Exact sizes and ranges vary by platform; check them with sizeof at compile time and the macros in <limits.h> and <float.h>.
Primitive types
C defines arithmetic types for integers and real numbers. The core set that appears in almost every program includes char, int, float, and double. There is also long double for higher precision where supported.
char and character representation
char is an integer type large enough to hold any member of the basic execution character set. Whether plain char is signed or unsigned is implementation defined. Character constants like 'A' and escape sequences like '\n' have type int in C (they have type char only in C++). Prefer unsigned char for raw byte data.
// Report whether plain char is signed on this implementation
#include <stdio.h>
int main(void) {
    if ((char)-1 < 0) {
        printf("char is signed\n");
    } else {
        printf("char is unsigned\n");
    }
    return 0;
}
For raw bytes, use unsigned char. This avoids negative values that can surprise bitwise code.
Integer types built around int
Integer types come in sizes related to short, int, long, and long long. Each can be signed or unsigned. The rank of these types controls promotions and conversions later in this chapter.
| Category | Examples | Typical width | Size check at compile time |
| Small | signed char, unsigned char, short | 8 to 16 bits | sizeof(signed char) == 1 by definition |
| Regular | int, unsigned int | 16 to 32 bits | sizeof(int) is implementation defined |
| Wide | long, unsigned long, long long | 32 to 64 bits | sizeof(long), sizeof(long long) |
On most modern platforms int is 32 bits. On some systems long is 32 bits and on others it is 64 bits. Use <stdint.h> types like int32_t and uint64_t when you need exact widths.
Floating types
Floating types usually follow IEEE 754 binary formats. The precision and range are available as macros such as FLT_DIG, DBL_MAX, and LDBL_EPSILON from <float.h>. Conversions between integers and floating values are covered later in this chapter.
#include <float.h>
#include <stdio.h>
int main(void) {
printf("float digits: %d\n", FLT_DIG);
printf("double max: %e\n", DBL_MAX);
return 0;
}
Inspecting sizes and limits
Use sizeof and the limits headers to discover properties at compile time and run time.
#include <stdio.h>
#include <limits.h>
#include <float.h>
int main(void) {
printf("sizeof(int) = %zu\n", sizeof(int));
printf("INT_MAX = %d\n", INT_MAX);
printf("DBL_MIN = %e\n", DBL_MIN);
return 0;
}
Type modifiers and qualifiers
Modifiers change the range or precision of a type. Qualifiers change how objects of that type may be accessed or updated. Modifiers bind to arithmetic types. Qualifiers bind to any type.
Using signed and unsigned integers
signed integers represent negative and positive values. unsigned integers represent non-negative values only. Conversions between the two preserve the value when it fits the destination. Otherwise, converting to an unsigned type wraps modulo two to the power of the width, while converting an out-of-range value to a signed type gives an implementation-defined result (or raises an implementation-defined signal).
#include <stdio.h>
int main(void) {
unsigned int u = 4000000000u;
int s = (int)u; // implementation defined for values that do not fit
printf("%u %d\n", u, s);
return 0;
}
Using const for read only intent
const marks an object as not modifiable through that name. It does not make the value a compile time constant by itself. Pointers to const and const pointers are different forms; the position of const matters.
int x = 10;
const int cx = 20; // cannot write through cx
int *p = &x; // pointer to int
const int *pc = &x; // pointer to const int (data is read only through pc)
int * const cp = &x; // const pointer to int (pointer is fixed)
const int * const cpc = &x; // const pointer to const int
Read pointer declarations from right to left: int * const is a const pointer to int. This simple rule reduces confusion with multiple qualifiers.
Using volatile for externally changed data
volatile tells the compiler that a value can change outside the normal program flow. The compiler then avoids caching the value and performs every read and write that the source specifies. This is used for memory-mapped device registers, flags set by signal handlers (typically declared volatile sig_atomic_t), and memory shared with other agents where synchronization is handled separately.
extern volatile unsigned int status_reg;
while ((status_reg & 1u) == 0u) {
/* wait for ready flag set by hardware … */
}
volatile does not provide atomic operations or mutual exclusion. Use appropriate synchronization primitives for concurrency needs.
Variable scope and lifetime
Scope tells where a name is visible. Storage duration tells when an object exists in memory. Linkage tells whether a name refers to the same object across translation units. All three affect how you design interfaces and manage memory.
Types of scope
Block scope applies to identifiers declared inside braces. Function scope applies to labels used with goto. File scope applies to identifiers declared at the top level of a translation unit.
int g = 1; // file scope, external linkage by default
void f(void) {
int x = 2; // block scope, automatic storage
{
int x = 3; // inner block hides outer x
printf("%d\n", x);
}
printf("%d\n", x);
}
Working with storage duration
Automatic storage duration objects are created on entry to the enclosing block and destroyed on exit. Static storage duration objects exist for the entire program run. Allocated storage is obtained by malloc and friends and must be released by free.
#include <stdlib.h>
void counter(void) {
static int calls = 0; // static storage, value persists
calls++;
/* use calls … */
}
int *make_array(size_t n) {
int *p = malloc(n * sizeof *p);
if (!p) { return NULL; }
return p;
}
Prefer sizeof *p over sizeof(type) in allocations. This stays correct if the pointed type changes.
Linkage types
Identifiers with external linkage refer to the same entity across translation units. Internal linkage restricts a name to the current translation unit. No linkage means each declaration is a distinct entity.
// file a.c
int shared = 42; // external linkage
static int hidden = 7; // internal linkage
// file b.c
extern int shared; // refers to the same object defined in a.c
Use static for internal linkage when a function is private to a file.
Type conversions and promotion rules
Expressions often combine different types. C applies a set of promotions and conversions to produce a common type. Understanding these rules prevents subtle bugs, especially with signed and unsigned mixes and with floating and integer combinations.
Integer promotions
Types with rank less than int promote to int if int can represent all values of the original type; otherwise they promote to unsigned int. This happens for char, signed char, unsigned char, short, and unsigned short when used in expressions.
unsigned char a = 200;
unsigned char b = 100;
printf("%d\n", a + b); // both promote, addition occurs as int
Usual arithmetic conversions
When binary operators combine different arithmetic types, both operands convert to a common real or integer type. If either operand is floating, the other converts to the widest floating type present. Otherwise both operands undergo integer promotions and then convert to the type with the higher rank and signedness rules.
| If either is | Then convert both to |
| long double | long double |
| double | double |
| float | float |
| Otherwise | apply integer promotions, then convert to the type with higher rank; if ranks are equal and exactly one is unsigned, convert to the unsigned type |
Signed and unsigned interactions
If an unsigned type has rank greater than or equal to the signed type, the signed value converts to unsigned. This can change negative numbers into large positive values.
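#include <stdio.h>
int main(void) {
    int s = -1;
    unsigned int u = 1u;
    // s converts to unsigned int and becomes UINT_MAX, so the test is false
    if (s < u) {
        printf("s < u\n");
    } else {
        printf("s < u is false: -1 converted to %u\n", (unsigned int)s);
    }
    printf("s + u = %u\n", s + u); // operands convert to unsigned int; prints 0
    return 0;
}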
Casts, truncation, and overflow
Explicit casts request a conversion. Converting from a wider integer to a narrower one can truncate. Converting a floating value to an integer rounds toward zero and is undefined if the value is outside the range of the destination type.
double d = 3.9;
int i = (int)d; // becomes 3
unsigned char c = (unsigned char)1025; // keeps the low 8 bits: 1025 % 256 == 1
Default argument promotions in variadic calls
Arguments to variadic functions such as printf undergo default promotions. float promotes to double and integer types smaller than int promote to int or unsigned int. Format specifiers must match the promoted types.
#include <stdio.h>
void demo(void) {
float f = 1.0f;
printf("%f\n", f); // ok, f promotes to double
char ch = 'A';
printf("%d\n", ch); // ok, ch promotes to int
}
Use PRIu64 and related macros from <inttypes.h> for portable printf with fixed width integers. Example: printf("%" PRIu64 "\n", value);
Balancing precision and performance
double often gives better numerical stability than float at a modest cost. When you must interoperate with graphics or signal processing code that expects float buffers, convert at the edges and keep computations in double internally.
Chapter 5: Operators and Expressions
C uses a compact set of operators that combine values into expressions. This chapter groups them into arithmetic; relational; logical; increment and decrement; assignment and compound assignment; and bitwise. You also learn how precedence and associativity affect the result; and why clear parentheses are worth using.
Arithmetic, relational, and logical operators
Arithmetic operators work on numeric types. Relational operators compare values. Logical operators combine Boolean results that are represented by integers in C; zero means false and nonzero means true.
The +, -, *, /, % arithmetic operators
As you would expect + adds; - subtracts; * multiplies; / divides; % gives the remainder for integers. With integers, / truncates toward zero. % is not defined for floating types; use fmod() from <math.h> there.
int a = 7, b = 3;
printf("%d %d %d %d %d\n",
a + b, // 10
a - b, // 4
a * b, // 21
a / b, // 2 (truncates)
a % b); // 1
double x = 7.0, y = 3.0;
printf("%.1f %.1f %.1f\n", x + y, x / y, x * y); // 10.0 2.3 21.0
Signed integer overflow is undefined; for example, INT_MAX + 1 has undefined behavior.
Integer division and % edge cases
For signed integers, C defines a / b to truncate toward zero; the remainder follows the rule (a / b) * b + (a % b) == a. If b is zero, the behavior is undefined.
printf("%d %d\n", 7 / -3, 7 % -3); // -2 1
printf("%d %d\n", -7 / 3, -7 % 3); // -2 -1
The <, <=, >, >=, ==, != relational operators
Relational operators compare two operands and produce int results; zero or one. Beware of accidental assignment; write if (a == b) rather than if (a = b).
int a = 5, b = 9;
printf("%d %d %d\n", a < b, a == b, a != b); // 1 0 1
Placing the constant on the left, as in if (0 == x), turns an accidental = into a compile error.
The &&, ||, ! logical operators
Logical operators work with truthy or falsy integer values. && and || use short circuit evaluation; the right operand may not be evaluated.
| Expression | Result |
| !0 | 1 |
| !5 | 0 |
| 0 && any | 0 |
| nonzero || any | 1 |
#include <stdio.h>
int calls = 0;
int f(void) { calls++; return 0; }
int main(void) {
    int x = 1 || f(); // f() not called, short circuit
    int y = 0 || f(); // f() called
    printf("calls=%d x=%d y=%d\n", calls, x, y); // calls=1 x=1 y=0
    return 0;
}
Increment and decrement, assignment, and compound assignment
These operators update variables. Pre and post forms of increment and decrement have different values in expressions. Assignment stores a value; compound forms apply an operation and store the result.
Pre- and post- incrementing and decrementing
++i increments then yields the new value. i++ yields the old value then increments. The same idea applies to --.
int i = 3;
int a = ++i; // i becomes 4; a is 4
int b = i++; // b is 4; i becomes 5
printf("%d %d\n", a, b); // 4 4
printf("%d\n", i); // 5
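Never modify the same variable twice, or modify and read it, within one expression without a sequence point; i = i++ + ++i has undefined behavior.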
Using = for assignment and chaining
Assignment returns the assigned value; this allows chaining and use in conditions. Use it sparingly for clarity.
int a, b, c;
a = b = c = 42;
if ((a = getchar()) != EOF) { /* use a */ }
The +=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>= compound assignment operators
These combine an operation with assignment; the left operand is evaluated only once.
int x = 5;
x += 3; // 8
x *= 2; // 16
x <<= 1; // 32
Lvalues and rvalues in assignment
The left operand of = must be a modifiable lvalue. You cannot assign to an rvalue or to something declared with const.
int x;
const int y = 3;
/* 3 = x; // not allowed */
/* y = 4; // not allowed */
x = y; // allowed
Bitwise operations
Bitwise operators work on the bit patterns of integer types. Use unsigned types when you care about precise shifts and masks.
The &, |, ^, ~, <<, >> operators
& computes bitwise and; | computes bitwise or; ^ computes bitwise exclusive or; ~ flips all bits; << shifts left; >> shifts right.
unsigned u = 0b0001'0110; // 0x16
printf("and=%02X or=%02X xor=%02X not=%02X\n",
(u & 0x0F), (u | 0x01), (u ^ 0xFF), (unsigned)~u);
printf("shl=%02X shr=%02X\n", u << 1, u >> 1);
Prefer unsigned for shifts.
Working with masks and flags
Masks select or update specific bits. Define named masks to keep intent clear.
enum {
FLAG_READ = 1u << 0,
FLAG_WRITE = 1u << 1,
FLAG_EXEC = 1u << 2
};
unsigned perms = 0;
perms |= FLAG_READ | FLAG_WRITE; // set bits
perms &= ~FLAG_WRITE; // clear bit
int can_exec = (perms & FLAG_EXEC) != 0; // test bit
Store flags in an unsigned and provide helper functions; this keeps call sites small and expressive.
Right shift behavior
Right shift on unsigned performs a logical shift; zeros fill from the left. Right shift on signed may perform an arithmetic shift that keeps the sign bit, or a logical shift; this is implementation defined.
unsigned uu = 0xF0u; // 11110000
int ss = -16; // representation is two's complement on most systems…
printf("%u\n", uu >> 2); // 00111100
/* printf("%d\n", ss >> 2); // arithmetic or logical; implementation defined */
Operator precedence and associativity
Precedence chooses which operators group more tightly; associativity chooses how operators of the same precedence group. Parentheses always make the intent explicit.
Summary table
| Higher → lower | Operators | Associativity |
| Postfix | () [] p->m p.m x++ x-- | left |
| Unary | ++x --x + - ! ~ (type) * & sizeof | right |
| Multiplicative | * / % | left |
| Additive | + - | left |
| Shifts | << >> | left |
| Relational | < <= > >= | left |
| Equality | == != | left |
| Bitwise and | & | left |
| Bitwise xor | ^ | left |
| Bitwise or | | | left |
| Logical and | && | left |
| Logical or | || | left |
| Conditional | ?: | right |
| Assignment | = += -= *= /= %= &= ^= |= <<= >>= | right |
| Comma | , | left |
Using parentheses for clarity
Parentheses remove ambiguity and document intent. This is especially helpful when mixing relational, logical, and bitwise operators.
int a = 2, b = 3, c = 4;
int r1 = a + b * c; // 14
int r2 = (a + b) * c; // 20
int r3 = (a & b) == 2 && c; // clear grouping
Precedence versus evaluation order
Precedence does not fix the order in which subexpressions are evaluated. In C, the order of evaluation of function arguments and of many operands is unspecified; do not rely on a particular order.
int f(const char *s){ puts(s); return 1; }
/* Do not assume the call order for these arguments */
int z = f("left") + f("right"); // either print order is permitted
The comma operator (,) is a real operator with the lowest precedence; it guarantees left-to-right evaluation of its two operands. It is different from the comma that separates function arguments.
Use small expressions with clear parentheses; test limits with unsigned when doing bit tricks; keep undefined or implementation defined corners out of production code.
Chapter 6: Control Flow
Control flow means guiding the path your program takes from statement to statement. In C, you do this with conditionals for choosing, switches for multiway branching, and loops for repeating until a condition changes. This chapter moves carefully from simple decisions to structured repetition, adding clarity tips and small cautions as you go.
if, else, and nested conditionals
Using if and else lets a program choose among alternatives. Conditions in C are integer expressions; zero means false and any nonzero value means true. Keeping conditions small and readable reduces mistakes when you nest them.
Writing clear if tests and shaping branches
Start with short, positive conditions that read like a sentence. Prefer parentheses for clarity when mixing relational and logical operators.
int age = 17;
if (age >= 18) {
puts("adult");
} else {
puts("minor");
}
int x = 3, y = 5;
if ((x < y) && (y < 10)) {
puts("in range");
}
Prefer if (count > 0) rather than if (!(count == 0)) when either version would work.
Combining paths with else if ladders
Building a simple ladder often reads better than deeply nested conditionals. Place the most likely or simplest tests first to make fallthrough paths cheaper to read.
if (score >= 90) {
puts("A");
} else if (score >= 80) {
puts("B");
} else if (score >= 70) {
puts("C");
} else {
puts("D or below");
}
Nesting conditionals without losing the plot
Nesting is sometimes necessary, for example when a second decision depends on the first. Keep each block small and prefer early returns inside functions to reduce indentation.
int authorize(int role, int active) {
if (!active) {
return 0;
}
if (role >= 3) {
return 1;
}
return 0;
}
Without braces, it can be unclear which if an else belongs to; the compiler pairs it with the nearest unmatched if, whatever the indentation suggests. Always bracing blocks prevents the dangling-else pitfall.
Leveraging assignment in conditions carefully
Assignments yield a value, which can be useful, but accidental assignment in a test is a common bug. Use extra parentheses or compare against constants first if your codebase prefers it.
int c;
while ((c = getchar()) != EOF) {
/* use c */
}
switch and case
Using switch selects one of many paths based on an integral expression. Cases must be constant expressions and labels must be unique within a switch. A missing break causes execution to continue into the next case.
Selecting with switch and placing break wisely
Write a default case to handle the unexpected. Most cases end with break; when you intentionally share logic, comment the fallthrough.
int ch = getchar();
switch (ch) {
case 'a': /* falls through to 'A' on purpose */
case 'A':
puts("vowel a");
break;
case 'e': case 'E':
puts("vowel e");
break;
case '\n':
puts("newline");
break;
default:
puts("other");
break;
}
Mapping enums to behavior cleanly
Pairing enum values with a switch yields readable intent. Keeping a default that reports an unknown value helps catch future additions.
enum state { ST_INIT, ST_RUNNING, ST_PAUSED, ST_DONE };
void handle(enum state s) {
switch (s) {
case ST_INIT: puts("init"); break;
case ST_RUNNING: puts("running"); break;
case ST_PAUSED: puts("paused"); break;
case ST_DONE: puts("done"); break;
default: puts("unknown…"); break;
}
}
Declaring variables in cases safely
Before C23, a declaration cannot directly follow a case label, and jumping to a later label would skip any initializer. Wrap the case body in a block so the variable is declared, initialized, and scoped in one place.
switch (mode) {
case 1: {
int tmp = compute();
printf("%d\n", tmp);
break;
}
case 2:
/* other work */
break;
}
Loops: for, while, do-while
Looping repeats a block while a condition holds. Choosing the right loop shape communicates when the test happens and how the loop variable changes.
Counting with for and separating concerns
The for loop brings initialization, continuation test, and step together. Keeping those roles tidy makes off by one mistakes less likely.
for (int i = 0; i < n; i++) {
sum += a[i];
}
Using size_t for array indices avoids negative values and pairs well with sizeof based calculations.
Guarding with while when the count is not known
while tests first, then runs. It is a natural fit for reading streams, walking lists, and waiting for a condition to change.
int c;
while ((c = getchar()) != EOF) {
putchar(c);
}
Performing at least once with do-while
Use do-while when the body must run before the condition is checked, such as menu loops that prompt, then evaluate the choice.
int choice;
do {
printf("1) go 2) quit\n");
choice = getchar();
/* consume rest of line… */
} while (choice != '2');
Managing multiple expressions in loop headers
Comma expressions allow multiple updates. Use them sparingly and keep each subexpression simple.
/* assumes n > 0: with n == 0, n - 1 wraps around to SIZE_MAX */
for (size_t i = 0, j = n - 1; i < j; i++, j--) {
int tmp = a[i];
a[i] = a[j];
a[j] = tmp;
}
Avoiding pitfalls with evaluation order
The order of evaluation for subexpressions is often unspecified. Keep loop updates free of hidden side effects that depend on a particular order.
/* OK: independent updates */
for (int i = 0; i < n; i++) { /* work */ }
/* Risky: do not combine increments and array writes that assume order */
break, continue, and goto
Controlling loop flow means exiting early or skipping to the next iteration. Use break to leave a loop or switch; use continue to skip the rest of the body; reserve goto for rare structured cleanups.
Leaving loops and switches with break
break exits exactly one enclosing loop or the nearest switch. Exiting nested loops usually needs a flag, an outer test, or a goto to a labeled cleanup.
for (size_t i = 0; i < rows; i++) {
for (size_t j = 0; j < cols; j++) {
if (grid[i][j] == target) {
found = 1;
break; /* leaves inner loop */
}
}
if (found) break; /* leaves outer loop */
}
Skipping work with continue
continue jumps to the next iteration. In a for, it performs the step expression first; in a while or do-while, it rechecks the condition.
for (int i = 0; i <= 100; i++) {
if ((i % 2) != 0) continue; /* skip odds */
printf("%d\n", i);
}
Cleaning up with a single exit using goto
Although unstructured jumps hurt readability, a single forward goto to a cleanup label can simplify resource release on error paths in functions that open several resources.
FILE *f = NULL;
char *buf = NULL;
f = fopen("data.bin", "rb");
if (!f) goto cleanup;
buf = malloc(1024);
if (!buf) goto cleanup;
/* use f and buf… */
cleanup:
if (buf) free(buf);
if (f) fclose(f);
A label name such as cleanup or fail signals intent. Keep exactly one such label per function to avoid spaghetti flow.
Preferring structured control and reducing exits
Favor structured control: small functions, early returns for error checks, and clear loop conditions. When a loop becomes hard to follow, extract the body to a helper that returns a status and let the caller decide how to proceed.
| Intent | Prefer |
| Leaving one loop | break |
| Skipping to next iteration | continue |
| Unwinding on error in one function | single forward goto to cleanup |
| Exiting multiple levels | flags, early returns, or refactoring |
Choosing the right construct and shaping the code for readers keeps control flow simple; the program then communicates its intent without surprises.
Chapter 7: C Functions
Functions let you divide a program into small, reusable actions. Each function performs one clear task, making code easier to test, reuse, and understand. In C, you define functions by declaring their return type, name, and parameters, then calling them wherever their result or side effect is needed.
Defining and calling functions
Defining a function means specifying what it does and what kind of value it returns. Calling a function means executing it by name and supplying any arguments it requires. Functions can appear before or after main() in a file, but if they appear after, they must be declared beforehand.
Creating reusable functions
A function definition contains a return type, a name, a parameter list, and a body. You can place it before main(), or declare a prototype earlier if it appears below.
#include <stdio.h>
void greet(void) {
puts("Hello from a function!");
}
int main(void) {
greet();
return 0;
}
Adding void in a parameter list states explicitly that a function takes no arguments, which avoids confusion with an unprototyped form.
Name functions after what they do, such as compute_sum() or display_menu(). This makes code read like a set of clear instructions.
Passing values and receiving results
Functions can accept arguments and return values. Each parameter’s type determines how the argument is passed and interpreted. The return type defines the kind of result produced.
int add(int a, int b) {
return a + b;
}
int main(void) {
int total = add(3, 4);
printf("Result: %d\n", total);
return 0;
}
Organizing code with forward declarations
If you define main() before helper functions, declare them first using prototypes. This lets the compiler check argument and return types at each call, even though the definition appears later in the file.
#include <stdio.h>
int square(int n); /* prototype */
int main(void) {
printf("%d\n", square(5));
return 0;
}
int square(int n) {
return n * n;
}
Arguments and return values
Understanding how arguments and return values behave is central to C’s design. By default, arguments are passed by value, meaning the function receives a copy. To modify a caller’s variable, you must pass a pointer instead.
Passing by value
Each argument is evaluated, and a copy is passed into the function. The original variable remains unchanged.
void increment(int n) {
n++;
printf("Inside: %d\n", n);
}
int main(void) {
int value = 5;
increment(value);
printf("Outside: %d\n", value); /* still 5 */
}
Passing by reference with pointers
When a function must change a caller’s variable, pass its address and use a pointer parameter to modify it directly.
void increment(int *n) {
(*n)++;
}
int main(void) {
int value = 5;
increment(&value);
printf("%d\n", value); /* 6 */
}
Returning useful results
A function can return one value of its declared type. When no result is needed, declare the return type as void. The return statement also ends the function early.
double average(double a, double b) {
return (a + b) / 2.0;
}
Variable scope and static variables
Scope determines where a variable can be seen and used. Local variables exist only within their function. Global variables, defined outside any function, persist throughout the program. The static keyword alters lifetime and visibility rules in subtle but useful ways.
Understanding local and global scope
Local variables live within their block. Global variables live for the duration of the program. When both exist with the same name, the local one hides the global.
int counter = 0; /* global */
void bump(void) {
int counter = 10; /* local shadows global */
printf("%d\n", counter);
}
Preserving values with static locals
A static local variable retains its value between calls but remains visible only inside its function. It is initialized once, before the first call.
void track(void) {
static int count = 0;
count++;
printf("Called %d times\n", count);
}
Restricting visibility with static globals
Declaring a file-level variable or function as static makes it visible only within that translation unit. This helps avoid name clashes across multiple source files.
static int cache_size = 512;
static void reset_cache(void) {
cache_size = 0;
}
Header files and prototypes
Declaring functions in headers makes them accessible to multiple source files. Each source file includes the same header to ensure consistent declarations. Prototypes tell the compiler what arguments and return type to expect before it sees the definition.
Creating and including headers
Header files typically end in .h. They contain prototypes, type definitions, and constants. The corresponding .c file contains the actual implementations.
/* mathutils.h */
#ifndef MATHUTILS_H
#define MATHUTILS_H
int add(int a, int b);
int subtract(int a, int b);
#endif
/* mathutils.c */
#include "mathutils.h"
int add(int a, int b) { return a + b; }
int subtract(int a, int b) { return a - b; }
/* main.c */
#include <stdio.h>
#include "mathutils.h"
int main(void) {
printf("%d\n", add(2, 3));
return 0;
}
Wrap each header in an include guard built from #ifndef, #define, and #endif to prevent multiple inclusion errors when several files include the same header.
Using prototypes to catch mistakes
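Prototypes let the compiler verify that a function is called with the right number and types of arguments. Without a prototype in scope, the compiler applies only the default argument promotions and cannot check the call, which can cause subtle runtime bugs.
#include <stdio.h>
void report(int code); /* prototype */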
int main(void) {
report(5); /* type checked */
return 0;
}
void report(int code) {
printf("Code: %d\n", code);
}
Using recursion
Recursion means defining a function in terms of itself. It works by breaking a problem into smaller pieces that look like the original, each time moving closer to a base case that stops the process. In C, recursion is used for problems with natural hierarchical structure, such as traversing trees or computing factorials.
Defining simple recursive functions
Each recursive function must include a base case that ends the calls and a recursive case that moves toward it. Without a base case, recursion never terminates.
int factorial(int n) {
if (n <= 1) return 1;
return n * factorial(n - 1);
}
int main(void) {
printf("%d\n", factorial(5)); /* 120 */
return 0;
}
Tracing recursive calls and stack depth
Each recursive call pushes a new stack frame. Deep recursion can exhaust the stack if the problem size is large or the base case is too far away. Iteration is safer for very large datasets.
Using recursion for structured data
Recursive techniques shine when data has natural substructure, such as lists or trees. Each call handles one piece and lets recursion handle the rest.
struct node {
int value;
struct node *next;
};
void print_list(struct node *n) {
if (n == NULL) return;
printf("%d\n", n->value);
print_list(n->next);
}
Mastering functions and their relationships (arguments, scope, and recursion) forms the backbone of C programming. With these foundations in place, larger programs can grow naturally through small, well-defined building blocks.
Chapter 8: Handling Arrays and Strings
Arrays group values of the same type under one name; strings are byte arrays that end with a null character. This chapter shows how to declare and initialize arrays, how to work with multidimensional layouts, how C represents strings, and how to avoid the most frequent mistakes that cause crashes or silent data corruption.
Declaring and initializing arrays
Arrays are one of the cornerstones of C programming. They let you store groups of related values of the same type under a single name, with each element accessible by its index. Arrays make it possible to work efficiently with large datasets, sequences of numbers, and fixed collections of items that would otherwise require separate variables. Once defined, an array’s size remains fixed for its lifetime, and understanding how initialization and indexing work is essential for safe and efficient use.
This section explores how to declare arrays of various kinds, how initialization rules differ depending on scope and storage class, and how the compiler treats arrays as pointers in most expressions. Mastery of these concepts is critical before moving on to pointers, strings, or dynamic memory later in the language.
Fixed-size arrays and default initialization
You declare a fixed-size array by placing the length in brackets after the element type. Local (automatic) arrays hold indeterminate values until you assign them; file-scope or static arrays are zero-initialized.
int scores[5]; // automatic; contains indeterminate values
static double temps[3]; // static storage; initialized to 0.0
char flags[8]; // bytes; often used for small sets of markers
Initializing with brace lists
Use a brace list to provide initial values. If you give fewer elements than the length, the remainder becomes zero. If you omit the length, the compiler counts elements for you.
int primes[6] = {2, 3, 5, 7, 11, 13};
int odds[] = {1, 3, 5, 7}; // length is 4
double zeros[4] = {0}; // all four become 0.0
char letters[] = {'A', 'B', 'C', '\n'}; // character constants
Designated initializers (C99+)
Designators set specific indices and leave the rest as zero. This is clear and maintainable when only a few positions matter.
int table[10] = {[0] = 42, [5] = 99}; // others become 0
Use sizeof to compute element counts at compile time: size_t n = sizeof array / sizeof array[0]; This avoids hard-coded magic numbers.
Indexing, iteration, and bounds
Array indices start at zero and end at length minus one. Loop counters should use an unsigned index type that matches the standard library, usually size_t.
#include <stdio.h>
int main(void) {
int a[] = {10, 20, 30, 40};
size_t n = sizeof a / sizeof a[0];
for (size_t i = 0; i < n; i++) {
printf("%zu: %d\n", i, a[i]);
}
return 0;
}
Arrays decay to pointers
In most expressions an array converts to a pointer to its first element; the main exceptions are when used with sizeof, with unary &, or as a string literal initializing a char array.
int b[3] = {1,2,3};
int *p = b; // decay to &b[0]
size_t sz_array = sizeof b; // size of all elements
size_t sz_ptr = sizeof p; // size of the pointer
Multidimensional arrays
When data naturally forms a grid, table, or higher-dimensional structure, C provides multidimensional arrays. They can represent anything from a 2D image to a 3D matrix of measurements, all stored in a contiguous block of memory. Each dimension adds a layer of indexing that helps you organize complex data logically while maintaining predictable layout and performance.
Although conceptually simple, multidimensional arrays in C require care with declarations and function parameters. Because C stores data in row-major order, the rightmost index changes fastest in memory. Understanding this detail helps you reason about performance and compatibility with other languages and libraries.
Declaring 2D and 3D arrays
C stores multidimensional arrays in row-major order (the rightmost index varies fastest). You specify each dimension in brackets.
int grid[3][4]; // 3 rows, 4 columns
double cube[2][3][4]; // 2 layers, 3 rows, 4 columns
Initializing and accessing
Use nested braces for clarity. Access with a pair (or more) of indices.
int grid[2][3] = {
{1, 2, 3},
{4, 5, 6}
};
int x = grid[1][2]; // 6
Passing multidimensional arrays to functions
For true multidimensional arrays, all sizes except the first must be known to the callee so the compiler can compute element addresses. From C99 onward you can use variable length arrays for this.
#include <stdio.h>
void print_mat(size_t rows, size_t cols, int m[rows][cols]) {
for (size_t r = 0; r < rows; r++) {
for (size_t c = 0; c < cols; c++) {
printf("%d ", m[r][c]);
}
printf("\n");
}
}
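When the dimensions are only known at run time, another option is a flat one-dimensional array indexed as index = r * cols + c. This works well with dynamic allocation and libraries.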
Pointer-to-array types
A pointer to the first row has type int (*)[COLS]. This differs from int **. Keep the parentheses to bind correctly.
void fill(size_t rows, size_t cols, int (*m)[cols]) {
for (size_t r = 0; r < rows; r++)
for (size_t c = 0; c < cols; c++)
m[r][c] = (int)(r + c);
}
String basics and the standard library
In C, strings are not first-class objects but rather arrays of characters terminated by the special null character '\0'. This convention, inherited from the earliest days of the language, keeps string handling lightweight and compatible with low-level memory operations. However, it also places the burden of safety and correctness on the programmer: every operation must respect the boundaries of the allocated buffer and the presence of the terminator.
The C standard library provides a powerful yet risky set of functions for copying, concatenating, searching, and formatting strings. Used properly, they make text handling straightforward. Misused, they are a leading source of program errors. This section covers how strings are represented, how to use the most important library functions, and how to apply defensive habits to prevent data corruption.
What a string is in C
A string is a sequence of bytes ending with the null character '\0'. The length that strlen reports counts characters before the terminator; the capacity of a buffer must include space for the terminator.
char hello[] = "Hello"; // 6 bytes: 'H' 'e' 'l' 'l' 'o' '\0'
size_t len = strlen(hello); // 5
Array vs pointer string declarations
char s[] = "Hi"; creates a writable array initialized from the literal. char *p = "Hi"; points at a string literal in read-only storage; writing through p is undefined behavior.
char s[] = "Hi";
char *p = "Hi";
// s[0] = 'h'; // OK
// p[0] = 'h'; // undefined behavior
Safe input and output of strings
Use fgets for input (it limits by buffer size) and printf with %s for output. Avoid gets (it was removed in C11) and avoid unbounded scanf("%s", ...).
#include <stdio.h>
int main(void) {
char buf[32];
if (fgets(buf, sizeof buf, stdin)) {
printf("You typed: %s", buf); // may include a newline
}
return 0;
}
Essential string and memory functions
These functions are declared in <string.h>, except snprintf, which is declared in <stdio.h>. Prefer the size-aware variants and always keep track of buffer capacities.
| Function | Purpose |
| strlen(s) | Count characters before '\0'. |
| strcpy(d, s) | Copy until '\0' (requires enough space in d). |
| strncpy(d, s, n) | Copy at most n bytes (may not append '\0'). |
| strcat(d, s) | Append s to end of d (requires spare capacity). |
| strncmp(a, b, n) | Compare at most n bytes. |
| strchr(s, c) | Find first occurrence of character. |
| strstr(h, n) | Find substring n in h. |
| memcpy(d, s, n) | Copy n bytes (non-overlapping regions). |
| memmove(d, s, n) | Copy n bytes (handles overlap). |
| memcmp(a, b, n) | Compare n bytes. |
| snprintf(d, n, "...") | Format into d with a byte limit. |
snprintf returns the number of bytes it wanted to write (not counting the terminator). If the return value is greater than or equal to the buffer size then truncation occurred.
Building strings correctly
Accumulate text with size checks. Keep one variable for capacity and one for the current length. Reserve one byte for '\0'.
#include <stdio.h>
#include <string.h>
void join_three(const char *a, const char *b, const char *c,
char *out, size_t cap) {
if (cap == 0) return; /* no room for anything, not even '\0' */
size_t len = 0;
int w = snprintf(out + len, cap - len, "%s", a);
if (w < 0) return;
if ((size_t)w >= cap - len) { out[cap - 1] = '\0'; return; }
len += (size_t)w;
w = snprintf(out + len, cap - len, "%s", b);
if (w < 0) return;
if ((size_t)w >= cap - len) { out[cap - 1] = '\0'; return; }
len += (size_t)w;
(void)snprintf(out + len, cap - len, "%s", c);
}
Common pitfalls with '\0' and buffer overflow
The simplicity of C’s memory model is both its strength and its danger. Since strings and arrays have no built-in bounds checking, forgetting to reserve space for the null terminator or writing past the end of a buffer can lead to undefined behavior, security vulnerabilities, or silent data loss. These mistakes are notoriously easy to make and difficult to detect after the fact.
This section explains the classic errors that occur when handling arrays and strings, why they happen, and how to avoid them through disciplined coding practices. Understanding the difference between array size, string length, and buffer capacity is essential for writing robust C programs that behave predictably in all environments.
Forgetting space for the terminator
The capacity must be at least length plus one. If you allocate exactly the visible characters you will write past the end when you add '\0'.
size_t len = 5;
char *bad = malloc(len); // too small for "Hello"
char *good = malloc(len + 1); // space for '\0'
sizeof vs strlen
sizeof array reports the total storage in bytes for arrays with known size at compile time; strlen walks memory until it finds '\0'. After decay to a pointer, sizeof gives the pointer size, not the array size.
char s[] = "Hi";
size_t a = sizeof s; // 3
size_t b = strlen(s); // 2
char *p = s;
size_t c = sizeof p; // pointer size, often 8
Using strncpy without checking for termination
strncpy does not guarantee a terminator when it truncates. Append one yourself if you rely on C strings.
char dest[8];
strncpy(dest, "Longish text", sizeof dest); // source is longer than dest: no '\0' is written
dest[sizeof dest - 1] = '\0'; // ensure termination; dest now holds "Longish"
Unsafe input functions
gets was removed in C11 because it cannot limit input. Plain scanf("%s", buf) is unsafe for the same reason. Use a width with scanf or prefer fgets.
char name[16];
// safer scanf usage with width: leaves one for '\0'
if (scanf("%15s", name) != 1) {
/* no input was read; handle the error */
}
Off-by-one in loops and concatenation
When appending, the usable space is capacity minus current length minus one. strncat expects a count of available space minus one because it also adds '\0'.
char buf[10] = "Hi";
size_t cap = sizeof buf;
size_t len = strlen(buf);
size_t avail = cap - len - 1;
strncat(buf, " there", avail); // safe append
Passing arrays to functions without sizes
Because arrays decay to pointers, the callee cannot know how many elements exist. Provide a length parameter for every array parameter.
int sum(const int *a, size_t n) {
int total = 0;
for (size_t i = 0; i < n; i++) total += a[i];
return total;
}
Mixing text and binary operations
String routines stop at '\0'. For arbitrary bytes (including zero) use memcpy, memmove, and memcmp, and track explicit lengths.
unsigned char data[4] = {1, 0, 2, 3};
/* strlen((char*)data) is meaningless here; use sizeof or tracked length */
One defensive habit is to carry a pointer and its length together, for example struct slice { char *ptr; size_t len; }; This pattern reduces many mistakes.
Chapter 9: Managing Pointers and Memory
Pointers are central to the power and flexibility of C. They allow direct access to memory, efficient manipulation of large data structures, and communication between functions through shared references. But they also introduce many of the language’s most difficult problems, including segmentation faults, memory leaks, and subtle logic errors that can be hard to trace. This chapter explores how pointers work, how to use them safely, and how to manage memory dynamically using the standard library.
Understanding pointers and addresses
A pointer is a variable that holds the memory address of another variable or object. Every object in memory occupies one or more bytes, and a pointer gives you a way to locate it. The & operator retrieves the address of a variable, and the * operator dereferences a pointer to access or modify the value stored at that address.
int x = 42;
int *p = &x; // p holds the address of x
printf("%d\n", *p); // prints 42
Pointers have types, and the compiler uses those types to interpret the bytes being accessed. A pointer to int is not interchangeable with a pointer to double because the two types differ in size and in how their bytes are interpreted. Correct typing ensures proper arithmetic, alignment, and interpretation of memory contents.
A null pointer (NULL or 0) points to nothing. Always initialize pointers to NULL when declaring them if you do not yet have a valid address.
Using pointer arithmetic
Pointers can participate in arithmetic, but operations are scaled by the size of the object type they point to. Incrementing an int * moves the pointer forward by sizeof(int) bytes, not by one raw byte. This allows natural iteration through arrays and other contiguous memory blocks.
int data[] = {10, 20, 30};
int *p = data; // same as &data[0]
p++; // now points to data[1]
printf("%d\n", *p); // prints 20
Pointer subtraction is also defined when both pointers refer to elements of the same array. The result is the number of elements between them. Arithmetic on unrelated pointers is undefined behavior and must be avoided.
Arrays and pointers compared
Arrays and pointers are closely related in C. When an array is used in an expression, it normally decays to a pointer to its first element. This is why function parameters that accept arrays are declared as pointer types; they receive the address of the first element, not a copy of the whole array.
void print_all(const int *a, size_t n) {
for (size_t i = 0; i < n; i++) {
printf("%d ", a[i]);
}
printf("\n");
}
int nums[] = {1, 2, 3, 4};
print_all(nums, 4); // passes a pointer to nums[0]
However, arrays are not pointers. Their size and storage duration are determined at declaration, while pointers are independent variables that can change what they point to. The distinction becomes important when using sizeof or when allocating memory dynamically.
sizeof array gives the total byte size of the array, while sizeof pointer gives only the size of the pointer variable itself.
Dynamic memory
Static arrays have fixed size, but sometimes you need memory that grows or shrinks at runtime. The C standard library provides four key functions for this purpose, declared in <stdlib.h>:
| Function | Purpose |
| malloc(n) | Allocates n bytes, returns a pointer to uninitialized memory. |
| calloc(c, n) | Allocates space for c objects of n bytes each, initializing all bits to zero. |
| realloc(p, n) | Changes the size of a previously allocated block, preserving existing data up to the new size. |
| free(p) | Releases a block of memory previously allocated. |
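#include <stdio.h>
#include <stdlib.h>
int *arr = malloc(5 * sizeof(int));
if (!arr) {
perror("malloc failed");
exit(EXIT_FAILURE);
}
for (int i = 0; i < 5; i++) arr[i] = i * 10;
int *tmp = realloc(arr, 10 * sizeof(int)); // keep arr until realloc succeeds
if (!tmp) {
free(arr); // on failure the original block is still valid
exit(EXIT_FAILURE);
}
arr = tmp;
free(arr);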
Dynamic allocation moves storage to the heap rather than the stack. It gives you flexibility at the cost of manual management. Forgetting to call free() results in memory leaks, while freeing memory twice leads to undefined behavior.
Pair every successful allocation with exactly one call to free(). Consider setting the pointer to NULL after freeing it to avoid accidental reuse.
Common pointer errors and debugging techniques
Most serious C bugs arise from pointer misuse. The most common include dereferencing a null or uninitialized pointer, using a pointer after freeing its memory, and writing beyond allocated bounds. Such mistakes often cause segmentation faults or data corruption that appears far from the source of the error.
To diagnose pointer problems, use debugging tools such as gdb, memory checkers like valgrind, and compiler sanitizers (-fsanitize=address with GCC or Clang). These tools detect invalid accesses, double frees, and leaks by instrumenting your program’s memory operations.
// Example of a bad pointer bug
int *p;
*p = 5; // undefined behavior: p is uninitialized
Compile with -Wall -Wextra -Werror to catch many potential pointer issues at compile time before they become runtime bugs.
Segmentation faults and alignment
Memory in a C program is divided into regions: the stack, the heap, and fixed areas for global data and code. The stack stores automatic variables and function call frames. The heap holds dynamically allocated memory. Stack memory is managed automatically, while heap memory must be explicitly allocated and freed.
A segmentation fault occurs when a program tries to access memory outside its allowed region, such as dereferencing an invalid pointer or writing to read-only space. These errors often trace back to missing checks or improper pointer arithmetic.
Memory alignment ensures that data types begin at addresses suited to their size (for example, a 4-byte integer aligned on a 4-byte boundary). Misaligned access can slow down performance or even crash on certain architectures. The compiler handles alignment automatically for normal variables, but when using raw pointers, you must be cautious to maintain correct alignment.
Chapter 10: Structures, Unions, and Enumerations
As programs grow, it becomes inefficient to manage data as isolated variables. C provides three related features to group, label, and interpret data efficiently: structures combine fields of different types, unions let multiple representations share the same memory, and enumerations create named integer constants. Together, these constructs make C suitable for modeling complex real-world entities, exchanging binary data, and building readable, maintainable code.
Defining and using structs
A struct (structure) groups variables of different types under a single name. Each element inside a structure is called a member. Structures are essential for representing compound data such as points, employees, files, or records. You define a structure type with the struct keyword and access its members using the dot operator (.).
#include <stdio.h>
struct Point {
int x;
int y;
};
int main(void) {
struct Point p1 = {10, 20};
printf("(%d, %d)\n", p1.x, p1.y);
return 0;
}
You can declare variables immediately after defining a structure, or separately later. Although struct tags are distinct from typedefs, they often appear together for brevity.
typedef struct {
char name[32];
int age;
} Person;
Person alice = {"Alice", 28};
Nested structures and arrays of structs
Structures can contain other structures or arrays, enabling layered, hierarchical data. This is common in representing entities with sub-parts, such as a rectangle made of points, or an array of people forming a team. Access nested members using chained dots, or arrows (->) if you work with pointers.
struct Point {
int x;
int y;
};
struct Rectangle {
struct Point top_left;
struct Point bottom_right;
};
struct Rectangle r = {{0, 0}, {10, 10}};
printf("Width: %d\n", r.bottom_right.x - r.top_left.x);
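When a structure is reached through a pointer, the arrow operator replaces the first dot. The sketch below assumes the same Point and Rectangle layout as above; rect_area is a hypothetical helper added only for illustration.

```c
struct Point { int x; int y; };
struct Rectangle { struct Point top_left; struct Point bottom_right; };

/* Hypothetical helper: takes a pointer so the rectangle is not copied. */
int rect_area(const struct Rectangle *r) {
    int w = r->bottom_right.x - r->top_left.x;   /* arrow first, then dot */
    int h = r->bottom_right.y - r->top_left.y;
    return w * h;
}
```

A caller would pass the address of an existing rectangle, for example rect_area(&r).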
Arrays of structures allow compact data tables that you can loop over easily.
struct Player {
char name[16];
int score;
};
struct Player team[3] = {
{"Alice", 40},
{"Bob", 25},
{"Charlie", 50}
};
for (int i = 0; i < 3; i++) {
printf("%s scored %d\n", team[i].name, team[i].score);
}
Working with unions
A union lets different data types occupy the same memory space. Only one member can hold a valid value at a time. This feature is often used in low-level programming when data may be interpreted in multiple ways, such as converting between integers and byte arrays or implementing tagged variants.
union Number {
int i;
float f;
};
union Number n;
n.i = 1065353216;
printf("%f\n", n.f); // prints 1.000000 (bit reinterpretation)
Because all members share memory, the size of a union is at least the size of its largest member (padding may be added for alignment). Reading a member other than the one last written reinterprets the stored bytes, which can yield meaningless or invalid values, so always track which member is active, typically by keeping a separate indicator variable.
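The integer and byte-array view mentioned above can be sketched as follows. The union Word type and both helpers are hypothetical names; the result of the byte inspection depends on the machine's byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical union: view a 32-bit value as its individual bytes. */
union Word {
    uint32_t      value;
    unsigned char bytes[4];
};

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void) {
    union Word w;
    w.value = 1u;
    return w.bytes[0] == 1;   /* lowest address holds the least significant byte */
}

/* Size of the union: at least as large as its largest member. */
size_t word_size(void) {
    return sizeof(union Word);
}
```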
Anonymous and nested unions
Since C11, a union may be declared anonymously inside a structure, allowing its members to be accessed directly without an extra name. Nested unions and structs together form compact, flexible representations of mixed data.
struct Value {
enum {INT, FLOAT} type;
union {
int i;
float f;
};
};
struct Value v = {.type = INT, .i = 42};
printf("%d\n", v.i);
Enumerations for symbolic constants
An enum (enumeration) defines a set of named integer constants that make code clearer and easier to maintain. Enumerations are often used to represent categories, states, or modes. By default, the first name starts at zero and each following name increases by one unless given an explicit value.
enum Direction {
NORTH,
EAST,
SOUTH,
WEST
};
enum Direction dir = EAST;
printf("%d\n", dir); // prints 1
Use typedef enum to avoid writing enum repeatedly. Combined with meaningful names, enumerations make control logic self-documenting.
You can assign explicit values when needed. Enumerations improve readability and make debugging output more meaningful. Since they are integers at the binary level, you can still use them freely in arithmetic or switch statements.
enum Status {
OK = 0,
WARNING = 1,
ERROR = 2
};
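One common use of such an enumeration is a switch that maps each value to readable text for logs or debugging. This is a minimal sketch; status_name is a hypothetical helper, not a library function.

```c
enum Status { OK = 0, WARNING = 1, ERROR = 2 };

/* Hypothetical helper: map each enumerator to readable text. */
const char *status_name(enum Status s) {
    switch (s) {
    case OK:      return "OK";
    case WARNING: return "WARNING";
    case ERROR:   return "ERROR";
    }
    return "UNKNOWN";   /* value outside the declared enumerators */
}
```

Leaving out a default label lets many compilers warn when a new enumerator is added but not handled in the switch.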
Chapter 11: Files and Input/Output
Working with files in C means operating through the standard I/O library. You obtain a FILE* handle with fopen(), read or write using the formatted and unformatted routines, then release resources with fclose(). This chapter focuses on opening files safely, choosing the right reading and writing functions, handling binary data, and detecting errors reliably so programs behave predictably.
Have helper functions that open files return NULL on failure with a clear message. This keeps the happy path readable and the error path consistent.
Using file pointers with fopen() and fclose()
The type FILE represents a stream. You work with a pointer to this structure, which is returned by fopen() when a file is opened successfully. On failure fopen() returns NULL. Always check this return before continuing; then close the stream with fclose() when finished.
#include <stdio.h>
int main(void) {
const char *path = "log.txt";
FILE *fp = fopen(path, "w"); /* write text; truncates if exists */
if (!fp) {
perror("fopen");
return 1;
}
fputs("Hello file\n", fp);
if (fclose(fp) == EOF) {
perror("fclose");
return 1;
}
return 0;
}
File modes determine how the stream behaves. Common modes include "r" (read), "w" (write; truncate), "a" (append). Add "+" for update, and add "b" for binary where the platform distinguishes text from binary.
| Mode | Meaning |
| "r" | Open for reading |
| "w" | Open for writing; create or truncate |
| "a" | Open for appending; writes go to end |
| "r+" | Open for reading and writing |
| "w+" | Read and write; create or truncate |
| "a+" | Read and append; create if missing |
| "rb", "wb", "ab" | Binary variants |
On Windows, text mode translates '\n' to "\r\n" and may treat Ctrl+Z as end of file. Use a "b" mode for portable binary I/O.
Understanding FILE* buffering and performance
Streams are buffered by default. stdin and stdout may be line buffered for terminals; other files are usually fully buffered. You can flush output explicitly with fflush(fp). For special needs you can adjust buffering with setvbuf(), although the defaults are suitable for most programs.
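A minimal sketch of adjusting buffering follows. It assumes the stream was just opened, because setvbuf() must be called before any other operation on it; both helper names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: switch a freshly opened stream to line buffering.
   Must run before any read or write on the stream. */
int use_line_buffering(FILE *fp) {
    /* NULL lets the library allocate the buffer; BUFSIZ is the requested size. */
    return setvbuf(fp, NULL, _IOLBF, BUFSIZ) == 0 ? 0 : -1;
}

/* Hypothetical helper: write a message and push it out of the stdio buffer. */
int write_now(FILE *fp, const char *msg) {
    if (fputs(msg, fp) == EOF) return -1;
    return fflush(fp) == EOF ? -1 : 0;
}
```

Note that fflush() hands the data to the operating system; it does not by itself guarantee the data has reached the disk.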
Choosing file modes for safer operations
Pick the narrowest mode that fits your intent. For example, prefer "rb" and "wb" when working with binary formats; prefer "a" if another process may be writing and you want to keep existing content intact. Update modes like "r+" allow reads and writes on the same stream, but the standard requires an fflush() or a positioning call such as fseek() between a write and a following read, and a positioning call between a read and a following write.
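The seeking requirement for update streams can be sketched like this. The helper name write_then_read is hypothetical, and the stream is assumed to have been opened in an update mode such as "w+" or "r+".

```c
#include <stdio.h>

/* Hypothetical helper: write one character at the start of an update
   stream, then read it back. Returns the character read, or EOF on failure. */
int write_then_read(FILE *fp, int ch) {
    if (fseek(fp, 0L, SEEK_SET) != 0) return EOF;
    if (fputc(ch, fp) == EOF) return EOF;
    /* A positioning call (or fflush) is needed between the write and the read. */
    if (fseek(fp, 0L, SEEK_SET) != 0) return EOF;
    return fgetc(fp);
}
```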
Reading and writing files
Text I/O comes in two flavors. Unformatted functions like fgets() and fputs() move strings reliably. Formatted functions like fprintf() and fscanf() parse and produce structured text. Favor fgets() for robust line input; then parse with strtol() or sscanf() where helpful.
#include <stdio.h>
#include <string.h>
int main(void) {
char line[128];
FILE *in = fopen("input.txt", "r");
if (!in) { perror("input"); return 1; }
FILE *out = fopen("output.txt", "w");
if (!out) { perror("output"); fclose(in); return 1; }
while (fgets(line, sizeof line, in)) {
size_t n = strlen(line);
if (n > 0 && line[n - 1] == '\n') line[n - 1] = '\0'; /* trim newline */
fprintf(out, "line: %s\n", line);
}
if (ferror(in)) { perror("read error"); }
fclose(in);
fclose(out);
return 0;
}
fscanf() can fail partially and leave the stream position mid token. Check its return count and clear errors carefully. A safer pattern is reading a line with fgets() then parsing fields.
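The read-a-line-then-parse pattern might look like the sketch below for a single integer field. parse_long is a hypothetical helper; it treats anything other than trailing whitespace after the number as an error.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a whole line as a base-10 long.
   Returns 0 on success, -1 if the text is empty, malformed, or out of range. */
int parse_long(const char *line, long *out) {
    char *end = NULL;
    errno = 0;
    long v = strtol(line, &end, 10);
    if (end == line) return -1;               /* no digits found */
    if (errno == ERANGE) return -1;           /* overflow or underflow */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r') ++end;
    if (*end != '\0') return -1;              /* trailing garbage */
    *out = v;
    return 0;
}
```

Because the line is already in memory, a failed parse leaves the stream position untouched, which is the main advantage over fscanf().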
Handling long lines by incrementally reading
If a line can exceed your buffer, read repeatedly and accumulate. Continue until you see a newline or end of file.
#include <stdio.h>
#include <string.h>
int read_line(FILE *fp, char *buf, size_t cap) {
size_t used = 0;
for (;;) {
if (!fgets(buf + used, (int)(cap - used), fp)) {
return used > 0 ? (int)used : -1; /* -1 means no data read */
}
used += strlen(buf + used);
if (used > 0 && buf[used - 1] == '\n') {
buf[used - 1] = '\0';
return (int)(used - 1);
}
if (used == cap - 1) return (int)used; /* truncated line */
}
}
Formatting output predictably with fprintf
Use width and precision specifiers to align columns and constrain output. For example %10s right aligns a string in ten characters; %.2f prints two digits after the decimal.
fprintf(stdout, "%-10s %8d %10.2f\n", "item", 42, 3.14159);
/* result: left aligned name; integer column; fixed two decimal places */
Working with binary files
Binary I/O moves raw bytes without any text translation. You pass a pointer to memory, the size of each object, and the number of objects to transfer. Always verify the number of items read or written; then handle short transfers accordingly.
#include <stdio.h>
#include <stdint.h>
int main(void) {
uint32_t values[3] = { 10u, 20u, 30u };
FILE *fp = fopen("vals.bin", "wb");
if (!fp) { perror("wb"); return 1; }
size_t wrote = fwrite(values, sizeof values[0], 3, fp);
if (wrote != 3) { perror("fwrite"); fclose(fp); return 1; }
fclose(fp);
fp = fopen("vals.bin", "rb");
if (!fp) { perror("rb"); return 1; }
uint32_t readback[3] = {0};
size_t got = fread(readback, sizeof readback[0], 3, fp);
if (got != 3) {
if (ferror(fp)) perror("fread");
fclose(fp);
return 1;
}
fclose(fp);
return 0;
}
Seeking within a file using fseek and ftell
Random access requires moving the file position indicator. Use fseek() with an origin of SEEK_SET, SEEK_CUR, or SEEK_END. Query the position in bytes with ftell().
#include <stdio.h>
long size_of_file(FILE *fp) {
long pos = ftell(fp);
if (pos < 0) return -1;
if (fseek(fp, 0, SEEK_END) != 0) return -1;
long end = ftell(fp);
if (end < 0) return -1;
(void)fseek(fp, pos, SEEK_SET);
return end;
}
ftell() returns a long. Very large files may not fit in long on some platforms. The C standard does not provide a portable 64-bit variant; POSIX offers fseeko() and ftello() with off_t, and Microsoft's C runtime offers _fseeki64() and _ftelli64(), so consult your platform when handling very large files.
Addressing endianness and structure layout
Writing raw structures with fwrite() is simple; however it can break between compilers due to padding, alignment, and byte order. A portable approach serializes fields individually into a byte buffer using fixed width types like uint32_t then writes that buffer.
#include <stdint.h>
void put_u32be(uint8_t *b, uint32_t x) {
b[0] = (uint8_t)((x >> 24) & 0xFF);
b[1] = (uint8_t)((x >> 16) & 0xFF);
b[2] = (uint8_t)((x >> 8) & 0xFF);
b[3] = (uint8_t)( x & 0xFF);
}
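The reading side of this approach is the mirror image: rebuild the value from bytes in a fixed order, regardless of the host's byte order. get_u32be is a hypothetical counterpart written for illustration.

```c
#include <stdint.h>

/* Hypothetical counterpart to put_u32be(): rebuild the value from four
   bytes stored most significant byte first. */
uint32_t get_u32be(const uint8_t *b) {
    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |
            (uint32_t)b[3];
}
```

The casts matter: without them each byte would be promoted to int, and shifting a value of 128 or more left by 24 bits would overflow a 32-bit signed int.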
Performing error checking and handling EOF correctly
Every stdio function communicates success or failure. Check these results consistently. For character oriented input fgetc() returns int; a value of EOF indicates end of file or error. Distinguish the two using feof() and ferror(). For block I/O compare the count from fread() or fwrite() with the requested count.
#include <stdio.h>
int copy_file(const char *src, const char *dst) {
FILE *in = fopen(src, "rb");
if (!in) { perror("open src"); return 1; }
FILE *out = fopen(dst, "wb");
if (!out) { perror("open dst"); fclose(in); return 1; }
unsigned char buf[4096];
for (;;) {
size_t n = fread(buf, 1, sizeof buf, in);
if (n > 0) {
size_t m = fwrite(buf, 1, n, out);
if (m != n) { perror("write"); fclose(in); fclose(out); return 1; }
}
if (n < sizeof buf) {
if (feof(in)) break; /* clean end of file */
if (ferror(in)) { perror("read"); fclose(in); fclose(out); return 1; }
}
}
int rc = 0;
if (fclose(in) == EOF) { perror("close in"); rc = 1; }
if (fclose(out) == EOF) { perror("close out"); rc = 1; } /* buffered writes may have failed */
return rc;
}
Avoid while (!feof(fp)). The EOF flag becomes set only after a read attempt goes past the end, so this pattern often processes an extra stale iteration. Drive the loop by successful reads; then check feof() or ferror() when the read returns short.
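For character input, a loop driven by the read itself could be sketched as follows. count_newlines is a hypothetical helper; the important details are the int variable and the ferror() check after the loop.

```c
#include <stdio.h>

/* Hypothetical helper: count newline characters in a stream.
   Returns the count, or -1 if a read error occurred. */
long count_newlines(FILE *fp) {
    long lines = 0;
    int c;                          /* int, not char, so EOF stays distinguishable */
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') ++lines;
    }
    return ferror(fp) ? -1 : lines; /* EOF alone does not say why the loop ended */
}
```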
Reporting errors
When a library call fails it may set errno. Use perror() for a simple message that includes the corresponding text, or call strerror(errno) to format your own messages with context.
#include <stdio.h>
#include <errno.h>
#include <string.h>
void open_or_report(const char *path) {
FILE *fp = fopen(path, "r");
if (!fp) {
fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
return;
}
fclose(fp);
}
Practising error handling and defensive programming
Defensive file I/O means validating inputs, checking every return value, and cleaning up reliably even when something fails. Favor a single cleanup block controlled by a status variable; keep ownership clear for each resource you allocate, and release everything once.
#include <stdio.h>
#include <errno.h>
int write_report(const char *path, const char *msg) {
int rc = 1; /* assume failure until success */
FILE *fp = NULL;
if (!path || !msg) return 1; /* validate arguments */
fp = fopen(path, "w");
if (!fp) { perror("fopen"); goto cleanup; }
if (fprintf(fp, "Report: %s\n", msg) < 0) {
perror("fprintf");
goto cleanup;
}
if (fflush(fp) == EOF) { perror("fflush"); goto cleanup; }
rc = 0; /* success */
cleanup:
if (fp && fclose(fp) == EOF) {
perror("fclose");
rc = 1;
}
return rc;
}
For reliability in the presence of partial writes or crashes, write to a temporary file in the same directory, flush its buffers with fflush(), then replace the destination using a platform safe rename. The C standard defines rename(); consult your platform for atomic replacement details. If you must read sensitive input do not echo it; handle buffers carefully; clear sensitive buffers after use where appropriate.
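The temporary-file approach might be sketched as below. save_via_temp and its parameters are hypothetical, and the sketch relies only on standard rename(); whether rename() replaces an existing destination, and whether it does so atomically, is platform dependent.

```c
#include <stdio.h>

/* Hypothetical helper: write text to tmp_path, then rename it over final_path.
   Returns 0 on success, -1 on failure. Both paths should be in the same directory. */
int save_via_temp(const char *tmp_path, const char *final_path, const char *text) {
    FILE *fp = fopen(tmp_path, "w");
    if (!fp) return -1;
    int ok = fputs(text, fp) != EOF;
    if (fflush(fp) == EOF) ok = 0;
    if (fclose(fp) == EOF) ok = 0;
    if (!ok) { remove(tmp_path); return -1; }     /* do not leave a partial file */
    if (rename(tmp_path, final_path) != 0) {      /* replacement rules vary by platform */
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Readers of final_path then see either the old content or the complete new content, never a half-written file, on platforms where the rename is atomic.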
Chapter 12: Preprocessor and Compilation
C code goes through several transformations before it becomes a program you can run. The preprocessor expands macros and includes headers, the compiler turns the resulting translation unit into object code, and the linker resolves external references to build an executable or a library. This chapter focuses on guiding that pipeline by using macros, controlling what gets compiled, organizing headers safely, and understanding how each stage works so build problems are easier to diagnose. We finish by situating the C Standard Library within this pipeline, since its headers and symbols are resolved through the same steps.
Using macros and #define effectively
The directive #define introduces a macro that the preprocessor replaces before compilation. Macros can be simple constants or parameterized templates. Prefer const variables and functions for most logic; reach for macros where you need compile time switches, small inlined expressions, or conditional platform adaptations.
#include <stdio.h>
/* object-like macro */
#define PI 3.14159265358979323846
/* function-like macro with parentheses to avoid precedence pitfalls */
#define SQR(x) ((x) * (x))
/* stringizing and token pasting */
#define STR(x) #x
#define CAT(a,b) a##b
int main(void) {
int CAT(val, 1) = 7; /* becomes int val1 = 7; */
printf("%s = %d\n", STR(val1), val1);
printf("Area scale: %f\n", PI * SQR(2));
return 0;
}
Parenthesizing and scoping carefully
Always wrap macro parameters and the whole expansion in parentheses to preserve intent when the macro is used in a larger expression. Avoid side effects in arguments because a macro may evaluate them more than once. Use do { … } while (0) to package multi statement macros safely.
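/* requires <stdio.h>; passing the format through __VA_ARGS__ lets LOG("text") work with no extra arguments */
#define LOG(...) do { \
fprintf(stderr, "[log] "); \
fprintf(stderr, __VA_ARGS__); \
fputc('\n', stderr); \
} while (0)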
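The warning about side effects can be made concrete with a function that counts its own calls. This is a small sketch; next_value, sqr_int, and the two measuring functions are hypothetical names.

```c
#define SQR(x) ((x) * (x))

static int calls = 0;

/* Hypothetical function with a side effect: counts how often it runs. */
static int next_value(void) {
    return ++calls;
}

/* Typed alternative: the argument is evaluated exactly once. */
static inline int sqr_int(int x) {
    return x * x;
}

/* Returns how many times next_value() ran inside one SQR() use. */
int calls_made_by_macro(void) {
    calls = 0;
    (void)SQR(next_value());    /* expands to ((next_value()) * (next_value())) */
    return calls;
}

/* Returns how many times next_value() ran inside one sqr_int() call. */
int calls_made_by_function(void) {
    calls = 0;
    (void)sqr_int(next_value());
    return calls;
}
```

The macro version runs the argument twice, while the inline function runs it once. An argument such as i++ would be worse still, because modifying the same variable twice in one expression is undefined behavior.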
Leveraging predefined macros
Compilers define helpful macros such as __FILE__, __LINE__, and feature test macros for platforms and compilers. Use them sparingly to tag messages or select code paths when necessary.
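/* STR(__LINE__) alone yields the text "__LINE__"; the extra level expands it to the number first */
#define XSTR(x) STR(x)
#define HERE __FILE__ ":" XSTR(__LINE__)
#define STATIC_ASSERT(cond, msg) typedef char static_assert_##msg[(cond) ? 1 : -1]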
Prefer inline functions for typed behavior where performance matters and keep macros small and predictable.
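Two details of these helper macros are easy to get wrong, so here is a hedged sketch. The # operator applies to a macro argument before that argument is expanded, which is why turning a macro such as __LINE__ into text needs a second macro level. Separately, C11 provides _Static_assert, which can replace the negative-array-size trick where a C11 compiler is available. All macro and function names below are hypothetical.

```c
#include <limits.h>

/* Two levels are needed: the outer macro expands its argument before
   the inner one applies the # operator. */
#define TO_TEXT_RAW(x) #x
#define TO_TEXT(x)     TO_TEXT_RAW(x)

#define ANSWER 42

/* C11 compile-time check; the build stops with this message if it fails. */
_Static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Expanded first, then stringized: yields the text of the macro's value. */
const char *answer_text(void) {
    return TO_TEXT(ANSWER);
}

/* Stringized directly: yields the macro's name, not its value. */
const char *unexpanded_text(void) {
    return TO_TEXT_RAW(ANSWER);
}
```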
Conditional compilation
Conditional compilation lets you include or exclude code at preprocess time. This is useful for platform differences, debug builds, and feature flags. Keep conditions centralized and readable; avoid scattering many tiny #if blocks through logic that could instead vary at runtime.
#include <stdio.h>
/* feature switches coming from the compiler command line, for example -DENABLE_VERBOSE=1 */
#ifndef ENABLE_VERBOSE
#define ENABLE_VERBOSE 0
#endif
int main(void) {
#if ENABLE_VERBOSE
printf("Verbose mode active\n");
#endif
#ifdef _WIN32
printf("Windows specific setup\n");
#elif defined(__unix__)
printf("POSIX specific setup\n");
#else
printf("Generic setup\n");
#endif
return 0;
}
Defining symbols at compile time
Pass symbols from your build system to avoid hard coding. With gcc you can use -DNAME=value. This keeps source clean and lets you toggle behavior per build target.
/* compile: gcc -DENABLE_VERBOSE=1 -o app app.c */
Prefer positive feature tests such as #if HAVE_CLOCK_GETTIME over negative ones such as #ifndef NO_TIMERS. Positive tests document what you need rather than what you lack.
Protecting headers and improving modularity
Header files declare interfaces that multiple translation units include. To avoid multiple definition problems and recursive inclusion loops, wrap each header with a unique guard macro. Place declarations in headers and definitions in .c files to keep compile times and dependencies under control.
/* file: mathx.h */
#ifndef MATHX_H_INCLUDED
#define MATHX_H_INCLUDED
#include <stddef.h>
double mean(const double *xs, size_t n);
#endif /* MATHX_H_INCLUDED */
/* file: mathx.c */
#include "mathx.h"
double mean(const double *xs, size_t n) {
double s = 0.0;
for (size_t i = 0; i < n; ++i) s += xs[i];
return n ? s / (double)n : 0.0;
}
/* file: main.c */
#include <stdio.h>
#include "mathx.h"
int main(void) {
double xs[] = {1,2,3};
printf("%.2f\n", mean(xs, 3));
return 0;
}
Avoiding relative tangles
Keep public headers in an include directory and compile with an include path, for example -Iinclude. Use quotes "file.h" for project headers and angle brackets <...> for system headers to document intent.
Do not define functions in headers unless you mark them static inline and understand the implications. Otherwise you will create multiple external definitions at link time.
Understanding the compilation stages
Build tools drive the same stages even when wrapped by an IDE. Knowing how to invoke them directly makes diagnosing errors much easier. The following workflow uses gcc as an example; the same ideas apply to other toolchains.
| Stage | What happens | Example command |
| Preprocess | Expanding macros, removing comments, inserting headers | gcc -E file.c -Iinclude -DMODE=1 > file.i |
| Compile | Translating preprocessed C into assembly | gcc -S file.i -o file.s |
| Assemble | Converting assembly to machine code in an object file (gcc -c on a .c or .i file compiles and assembles in one step) | gcc -c file.s -o file.o |
| Link | Resolving external symbols and producing an executable or library | gcc file.o util.o -lm -o app |
# Build a small program step by step
gcc -E main.c > main.i
gcc -c main.i -o main.o
gcc -c mathx.c -o mathx.o
gcc main.o mathx.o -o app
Diagnosing failures
If errors mention macros or included lines, inspect the preprocessed .i file. If the compiler reports a type mismatch, examine declarations across headers. If the linker reports an undefined reference, check that you compiled every needed .c file and linked the right libraries in the correct order.
Place libraries after the object files that use them, for example gcc main.o -lm. Reversing this may fail to resolve symbols.
The C Standard Library
The Standard Library is a collection of headers and linked implementations that the compiler and linker know how to find. You include its interfaces with directives such as #include <stdio.h>, which the preprocessor expands into declarations. Later, the linker resolves the corresponding symbols by linking in the system libraries, either by default or when you add options such as -lm for math.
| Header | Purpose | Notes |
| <stdio.h> | File and stream I/O | Functions like printf(), fopen() |
| <stdlib.h> | Memory, conversions, utilities | malloc(), strtol(), qsort() |
| <string.h> | Byte and string utilities | memcpy(), strncpy() |
| <errno.h> | Error reporting | errno and error codes |
| <math.h> | Math functions | May require -lm when linking |
| <stdint.h> | Fixed width integer types | uint32_t, int64_t |
| <assert.h> | Assertions | Disabled when NDEBUG is defined |
# compile and link; math sometimes requires -lm
gcc main.c -o main -lm
Including headers and enabling feature macros
Some library features require defining feature test macros before including headers. This lets you opt into newer interfaces while preserving compatibility. Place the macro at the top of a translation unit or define it in your build system.
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
/* now functions like getline() may be available on your platform */
If the linker reports an undefined reference to a library function such as sqrt, check your link line and add the appropriate library switch.
By recognizing that headers participate in preprocessing and that library symbols are resolved during linking, you can place the Standard Library naturally within the same flow that governs your own modules. Good preprocessing habits and a clear build pipeline make library use predictable and portable.
Chapter 13: Modular Programming
As programs grow, maintaining all code in a single file becomes impractical. Modular programming divides functionality into logical units that can be developed, compiled, and reused independently. In C, this modularity is achieved by combining header files for declarations, source files for definitions, and linkage specifications that control visibility between translation units. Understanding these principles makes projects cleaner, more maintainable, and easier to scale.
Splitting programs into multiple source files
Breaking a program into several .c files allows you to organize related functions and data together. Each file can be compiled separately into an object file, then linked to form the final program. This separation shortens build times and encourages clear boundaries between components.
/* file: util.c */
#include <stdio.h>
#include "util.h"
void greet(const char *name) {
printf("Hello, %s!\n", name);
}
/* file: main.c */
#include "util.h"
int main(void) {
greet("world");
return 0;
}
/* file: util.h */
#ifndef UTIL_H_INCLUDED
#define UTIL_H_INCLUDED
void greet(const char *name);
#endif
Each source file includes its own header to ensure declarations stay synchronized with definitions. Compilation then proceeds independently, producing object files that are later linked.
gcc -c util.c -o util.o
gcc -c main.c -o main.o
gcc main.o util.o -o app
Include each module's header in its own .c file. This ensures missing prototypes or mismatched declarations cause compiler errors early.
Managing dependencies with Makefiles
Once a project has several modules, a Makefile simplifies builds by tracking dependencies. Each target defines how to produce an object file or executable from its sources. When you change a file, only affected parts rebuild.
# Makefile (each recipe line must start with a tab character)
app: main.o util.o
	gcc main.o util.o -o app
main.o: main.c util.h
	gcc -c main.c
util.o: util.c util.h
	gcc -c util.c
clean:
	rm -f *.o app
Designing clear and reliable header files
Headers describe the interface a module presents to other files. They contain type definitions, macros, and function declarations but never define variables or allocate storage. This separation allows multiple translation units to include the same header safely.
/* file: vector.h */
#ifndef VECTOR_H_INCLUDED
#define VECTOR_H_INCLUDED
#include <stddef.h>
typedef struct {
double *data;
size_t length;
} Vector;
void vector_init(Vector *v, size_t n);
void vector_free(Vector *v);
double vector_dot(const Vector *a, const Vector *b);
#endif
Keeping headers minimal and self-contained
Each header should include everything it needs to compile independently. Use forward declarations rather than full includes where possible, and limit exposure of internal structures. If only a pointer to a type is required, forward declare it instead of including its full definition.
/* file: connection.h */
#ifndef CONNECTION_H_INCLUDED
#define CONNECTION_H_INCLUDED
struct Server; /* forward declaration */
int connect_to(struct Server *srv);
#endif
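Forward declarations also make opaque types possible: callers hold a pointer but never see the fields. The sketch below shows both halves in one listing for brevity; in a real project the first part would live in a header and the second in the matching .c file. Every name in it is hypothetical.

```c
#include <stdlib.h>

/* What a header would expose: an incomplete type plus functions. */
struct Counter;
struct Counter *counter_new(void);
int  counter_next(struct Counter *c);
void counter_free(struct Counter *c);

/* What the matching .c file would contain: the full definition stays private. */
struct Counter {
    int value;
};

struct Counter *counter_new(void) {
    return calloc(1, sizeof(struct Counter));   /* NULL on allocation failure */
}

int counter_next(struct Counter *c) {
    return ++c->value;
}

void counter_free(struct Counter *c) {
    free(c);
}
```

Because other files only ever see the incomplete type, the fields of struct Counter can change without forcing those files to be edited.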
Controlling visibility with static and extern linkage
Linkage determines whether identifiers in one translation unit are visible to others. By default, functions and variables at file scope have external linkage, meaning they can be used across source files. Marking them static restricts them to the current file. Use extern for declarations that refer to definitions elsewhere.
/* file: counter.c */
#include <stdio.h>
static int count = 0; /* visible only in this file */
void increment(void) {
++count;
printf("Count = %d\n", count);
}
/* file: main.c */
void increment(void); /* or include a header declaring it */
int main(void) {
increment();
increment();
return 0;
}
In this example, count remains private to counter.c while increment() is accessible externally. To share global variables across modules, you declare them with extern in a header and define them once in a single source file.
/* file: globals.h */
#ifndef GLOBALS_H_INCLUDED
#define GLOBALS_H_INCLUDED
extern int global_flag;
#endif
/* file: globals.c */
int global_flag = 1;
Using static for internal helper functions
Static functions are ideal for internal utilities that should not be visible outside their source file. They also enable certain compiler optimizations because the compiler knows the function cannot be called externally.
static void log_message(const char *msg) {
fprintf(stderr, "log: %s\n", msg);
}
Building and linking libraries
Libraries package multiple object files so that you can reuse them without recompiling the source each time. There are two common types: static libraries (.a or .lib) and shared libraries (.so or .dll).
Creating and using a static library
Compile your modules into object files, then archive them into a library with ar. Link against it like any other object file.
# build a static library
gcc -c util.c mathx.c
ar rcs libmylib.a util.o mathx.o
# use the library
gcc main.c -L. -lmylib -o app
Static libraries are copied into the final binary during linking, which makes distribution simple but increases file size.
Building and using a shared library
Shared libraries are loaded dynamically at runtime. They reduce duplication across programs and can be updated independently of the executable. On Unix-like systems they use the .so extension, while Windows uses .dll.
# build a shared library
gcc -fPIC -c util.c mathx.c
gcc -shared -o libmylib.so util.o mathx.o
# link program dynamically
gcc main.c -L. -lmylib -o app
export LD_LIBRARY_PATH=.
./app
Understanding symbol visibility in shared libraries
When exporting functions from shared libraries, you can control which symbols remain public using compiler attributes or linker scripts. Reducing exported symbols minimizes conflicts and keeps your library interface clean.
#ifdef _WIN32
#define API __declspec(dllexport)
#else
#define API __attribute__((visibility("default")))
#endif
API void greet(const char *name);
When the library is built with hidden default visibility (for example -fvisibility=hidden with GCC or Clang), only the functions marked API are available to other programs, helping maintain a clear separation between internal helpers and the official API surface.
Chapter 14: System Programming
System programming sits closer to the operating system than typical application development. It deals with process control, files, environment variables, and direct use of system calls. In C, this area is powerful but requires precision and care because you interact directly with kernel-managed resources. Understanding how programs communicate with the OS is key to writing efficient and reliable tools.
Interacting with the operating system
C provides several layers for interacting with the OS. At the top are the standard library functions defined by ISO C, which offer portable access to files, memory, and processes. Below that are platform-specific APIs such as POSIX for Unix-like systems or the Win32 API on Windows. System calls handle tasks such as creating files, reading input, changing directories, and obtaining system information.
#include <stdio.h>
#include <stdlib.h>
int main(void) {
/* running an external command */
int status = system("echo Hello from shell");
if (status == -1) {
perror("system");
return 1;
}
/* reading PWD, which the shell usually sets to the working directory; POSIX getcwd() is more reliable */
char *cwd = getenv("PWD");
if (cwd)
printf("Current directory: %s\n", cwd);
else
printf("PWD not set\n");
return 0;
}
The system() function invokes a shell to execute a command string. It is simple but limited and potentially unsafe with untrusted input. For precise control, use platform APIs like fork(), exec(), or CreateProcess() instead of invoking a shell.
Be careful with system(). Never concatenate untrusted user input directly into command strings, as this can lead to code execution vulnerabilities.
Accessing system information
The standard library offers simple access to environment details such as user names, temporary paths, and limits through functions like getenv() and constants in limits.h. On POSIX systems, additional details can be retrieved using uname() or sysconf().
#include <stdio.h>
#include <unistd.h>
int main(void) {
long cpus = sysconf(_SC_NPROCESSORS_ONLN);
printf("Online CPUs: %ld\n", cpus);
return 0;
}
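A comparable sketch for uname() is shown below. It assumes a POSIX system; describe_system is a hypothetical helper that formats three of the fields uname() fills in.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: format "sysname release (machine)" into buf.
   Returns 0 on success, -1 if uname() fails or buf is too small. */
int describe_system(char *buf, size_t cap) {
    struct utsname u;
    if (uname(&u) == -1) return -1;
    int n = snprintf(buf, cap, "%s %s (%s)", u.sysname, u.release, u.machine);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;   /* reject truncated output */
}
```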
Handling environment variables and command-line arguments
Programs can receive information from the outside world in two main ways: through environment variables and through command-line arguments. The main() function can declare parameters int argc and char *argv[], which provide the argument count and list. The environment is readable via the standard getenv() and, on POSIX systems, writable via setenv().
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
printf("Program name: %s\n", argv[0]);
for (int i = 1; i < argc; ++i)
printf("Arg %d: %s\n", i, argv[i]);
const char *home = getenv("HOME");
if (home)
printf("HOME = %s\n", home);
else
printf("HOME not set\n");
return 0;
}
Modifying the environment safely
POSIX provides setenv() and unsetenv() for updating environment variables, which affect the current process and any children it spawns. Use them with caution, as environment size and lifetime depend on system implementation.
#include <stdlib.h>
int main(void) {
setenv("DEBUG", "1", 1); /* overwrite existing value */
unsetenv("OLD_PATH");
return 0;
}
Working with processes and signals (POSIX overview)
In Unix-like environments, processes are fundamental units of execution. C exposes system calls such as fork(), exec(), and wait() to create and manage them. Signals are asynchronous notifications that inform a process of events like interrupts or termination requests.
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void) {
pid_t pid = fork();
if (pid < 0) {
perror("fork");
return 1;
} else if (pid == 0) {
/* child process */
execlp("echo", "echo", "child process running", NULL);
perror("execlp");
return 1;
} else {
/* parent process */
int status = 0;
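if (waitpid(pid, &status, 0) == -1) { perror("waitpid"); return 1; }
if (WIFEXITED(status)) /* decode the raw status before printing it */
printf("Child exited with status %d\n", WEXITSTATUS(status));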
}
return 0;
}
Catching and handling signals
You can catch signals using the signal() or sigaction() functions. This lets your program respond to events such as a user pressing Ctrl+C (SIGINT), or ensure cleanup before termination.
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
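static volatile sig_atomic_t got_signal = 0;
void on_signal(int sig) {
got_signal = sig; /* only record the event; printf() is not safe here */
}
int main(void) {
signal(SIGINT, on_signal);
printf("Press Ctrl+C to trigger SIGINT\n");
while (!got_signal) sleep(1);
printf("Caught signal %d\n", (int)got_signal);
return 0;
}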
Avoid calling functions that are not async-signal-safe, such as printf(), inside a handler. Use flags or simple writes instead.
Using system calls safely
System calls expose the kernel directly, providing fine control but limited safety nets. Each call must be checked for errors, as nearly all can fail due to resource exhaustion or permission issues. They usually return -1 on failure and set errno to indicate the reason.
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
int main(void) {
int fd = open("test.txt", O_RDONLY);
if (fd == -1) {
perror("open");
return 1;
}
char buf[64];
ssize_t n = read(fd, buf, sizeof buf);
if (n == -1) {
perror("read");
close(fd);
return 1;
}
if (write(STDOUT_FILENO, buf, (size_t)n) == -1) perror("write");
close(fd);
return 0;
}
Distinguishing between library functions and direct system calls
Functions such as fopen() or fprintf() are part of the C standard library and may call multiple system calls under the hood. Lower-level calls like open(), read(), and write() map directly to kernel operations, providing more control over flags, permissions, and file descriptors.
| Standard Library | System Call Layer | Purpose |
| fopen() | open() | Open file stream vs file descriptor |
| fread() | read() | Buffered I/O vs raw read |
| fprintf() | write() | Formatted vs raw output |
Prefer standard library functions such as fread() and fprintf() for most tasks. Drop to system calls only when you need exact control over file descriptors, performance, or asynchronous I/O.
Practising defensive error handling with system calls
Always check return values and handle interruptions gracefully. Some system calls can return early if interrupted by a signal (EINTR). In those cases, retry the operation. Looping until success or a terminal error ensures robustness.
ssize_t safe_read(int fd, void *buf, size_t count) {
ssize_t n;
do {
n = read(fd, buf, count);
} while (n == -1 && errno == EINTR);
return n;
}
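The same retry idea applies to writing. The sketch below is a hypothetical companion named write_all (not a standard function) that loops until every byte is written, retrying on EINTR and continuing after short writes. It assumes a blocking file descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly count bytes, or return -1 with errno set by write(). */
ssize_t write_all(int fd, const void *buf, size_t count) {
    const char *p = buf;
    size_t left = count;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n == -1) {
            if (errno == EINTR) continue; /* interrupted: try again */
            return -1;                    /* terminal error */
        }
        p += n;                           /* short write: advance and loop */
        left -= (size_t)n;
    }
    return (ssize_t)count;
}
```

Callers treat any return other than count as a failure and inspect errno.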
System programming reveals how C interacts with the OS at its most fundamental level. Mastering these techniques provides insight into how shells, servers, and utilities operate beneath the surface, and builds the foundation for writing performant and reliable software on any platform.
Chapter 15: Networking and HTTP
C programs can talk across machines by using sockets. A socket is an operating system handle that represents one endpoint of a network conversation. In this chapter you will learn how to open sockets, how to exchange bytes over TCP, how HTTP sits on top, and how to serve tiny REST-like endpoints. The focus stays on portable POSIX-style code that compiles on Linux and macOS, with short notes where Windows needs a change.
For reference, see the manual pages man 2 socket, man 2 connect, man 2 bind, man 2 listen, man 2 accept, and man 3 getaddrinfo.
Sockets and basic network communication
A socket is created with socket(). You specify a family, a type, and a protocol. For TCP over IPv4 or IPv6 you use AF_INET or AF_INET6, SOCK_STREAM, and protocol zero. Addresses are described with structures like struct sockaddr_in and struct sockaddr_in6, which are passed to system calls by casting to struct sockaddr *. Host and network byte order can differ, so you convert with htons() and htonl() when filling port and address fields.
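Filling an IPv4 address structure by hand can be sketched as below. The helper name make_ipv4_addr is hypothetical; inet_pton() is the POSIX call that parses the dotted text form, and htons() converts the port to network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *out for the given dotted-quad address and port.
   Returns 0 on success, -1 if the address text is not valid IPv4. */
int make_ipv4_addr(const char *ip, unsigned short port, struct sockaddr_in *out) {
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);            /* host to network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1) return -1;
    return 0;
}
```

The result is passed to calls such as connect() as (struct sockaddr *)&addr together with sizeof addr.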
Creating and configuring a socket
Clients usually call socket() then connect(). Servers call socket(), bind(), listen(), then loop on accept(). For portability and DNS resolution you use getaddrinfo() to obtain one or more sockaddr results that you can try in order.
// minimal socket creation snippet
int fd = socket(AF_INET, SOCK_STREAM, 0);
if (fd == -1) { perror("socket"); return 1; }
// optional: set SO_REUSEADDR for quick restarts
int yes = 1;
if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) == -1) {
perror("setsockopt");
}
Resolving names
getaddrinfo() turns a host name and service string into a linked list of usable addresses. You iterate that list and try to connect or bind until one succeeds.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int connect_to(const char *host, const char *service) {
struct addrinfo hints, *res, *p;
memset(&hints, 0, sizeof hints);
hints.ai_family = AF_UNSPEC; // IPv4 or IPv6
hints.ai_socktype = SOCK_STREAM; // TCP
int rc = getaddrinfo(host, service, &hints, &res);
if (rc != 0) { fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc)); return -1; }
int fd = -1;
for (p = res; p; p = p->ai_next) {
fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
if (fd == -1) continue;
if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
close(fd);
fd = -1;
}
freeaddrinfo(res);
return fd; // connected or -1
}
On Windows, call WSAStartup() before using socket functions, use closesocket() instead of close(), and link with Ws2_32.lib. Function names are similar but not identical.
Sending and receiving bytes
Use send() and recv() for TCP streams. Both can return fewer bytes than requested, so you loop until all data is processed or an error occurs. A return of zero from recv() means the peer closed the connection cleanly.
ssize_t send_all(int fd, const void *buf, size_t len) {
const char *p = buf;
size_t left = len;
while (left > 0) {
ssize_t n = send(fd, p, left, 0);
if (n <= 0) return n; // error or closed
p += n;
left -= n;
}
return len;
}
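The receiving side needs the same kind of loop. Here is a minimal sketch of a hypothetical recv_exact helper that keeps calling recv() until the requested number of bytes has arrived, the peer closes the connection, or an error occurs. It assumes a blocking socket.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Read up to len bytes. Returns the number of bytes read (less than len
   only if the peer closed the connection early), or -1 on error. */
ssize_t recv_exact(int fd, void *buf, size_t len) {
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n == 0) break;                 /* peer closed cleanly */
        if (n == -1) {
            if (errno == EINTR) continue;  /* interrupted: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A return value smaller than len tells the caller that the stream ended before the full message arrived.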
Creating a TCP client and server
This section builds a tiny echo server and client. The server accepts a connection, reads some bytes, writes them back, then closes. The client connects, sends a line, reads the response, and prints it.
Writing a simple echo server
The server runs on a port, for example 8080. It uses getaddrinfo() with AI_PASSIVE to obtain a suitable local address for bind(), then listens and accepts. This example handles one client at a time to keep the flow clear.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(void) {
struct addrinfo hints, *res, *p;
memset(&hints, 0, sizeof hints);
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags = AI_PASSIVE;
int rc = getaddrinfo(NULL, "8080", &hints, &res);
if (rc != 0) { fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc)); return 1; }
int fd = -1;
for (p = res; p; p = p->ai_next) {
fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
if (fd == -1) continue;
int yes = 1;
setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
if (bind(fd, p->ai_addr, p->ai_addrlen) == 0) break;
close(fd);
fd = -1;
}
freeaddrinfo(res);
if (fd == -1) { fprintf(stderr, "bind failed\n"); return 1; }
if (listen(fd, 16) == -1) { perror("listen"); return 1; }
printf("Echo server listening on port 8080\n");
for (;;) {
int cfd = accept(fd, NULL, NULL);
if (cfd == -1) { perror("accept"); continue; }
char buf[1024];
ssize_t n = recv(cfd, buf, sizeof buf, 0);
if (n > 0) {
send(cfd, buf, (size_t)n, 0);
}
close(cfd);
}
}
To serve several clients at once, use threads or multiplex connections with select() or poll(). Start with the simple loop, then evolve as requirements grow.
Writing a simple echo client
The client connects to localhost:8080, sends a short message, waits for a reply, and prints it to standard output.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(void) {
struct addrinfo hints, *res;
memset(&hints, 0, sizeof hints);
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;
int rc = getaddrinfo("127.0.0.1", "8080", &hints, &res);
if (rc != 0) {
fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc)); return 1;
}
int fd = -1;
for (struct addrinfo *p = res; p; p = p->ai_next) {
fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
if (fd == -1) continue;
if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) { break; }
close(fd);
fd = -1;
}
freeaddrinfo(res);
if (fd == -1) { fprintf(stderr, "connect failed\n"); return 1; }
const char *msg = "hello\n";
send(fd, msg, strlen(msg), 0);
char buf[1024];
ssize_t n = recv(fd, buf, sizeof buf - 1, 0);
if (n > 0) { buf[n] = '\0'; printf("got: %s", buf); }
close(fd);
return 0;
}
HTTP request-response basics in pure C
HTTP rides on TCP. The client opens a TCP connection, usually to port 80 (HTTPS uses port 443 and adds TLS on top), then sends a text request. The server reads the request line and headers, then writes back a status line, headers, and a body. You can experiment with a plain TCP socket to see the wire format clearly.
Composing an HTTP GET by hand
This example connects to an origin server and sends a minimal HTTP/1.1 request. The host name must appear in a Host header. HTTP/1.1 connections stay open by default, so either close the socket yourself after reading or send Connection: close so that the server closes it once the response is complete.
// show the literal request that will be sent
// lines end with CRLF in the protocol
GET / HTTP/1.1\r\n
Host: example.com\r\n
User-Agent: this-is-c/1.0\r\n
Accept: */*\r\n
Connection: close\r\n
\r\n
// tiny HTTP GET client
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int main(void) {
struct addrinfo hints, *res;
memset(&hints, 0, sizeof hints);
hints.ai_socktype = SOCK_STREAM;
int rc = getaddrinfo("example.com", "80", &hints, &res);
if (rc != 0) {
fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc)); return 1;
}
int fd = -1;
for (struct addrinfo *p = res; p; p = p->ai_next) {
fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
if (fd == -1) continue;
if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
close(fd);
fd = -1;
}
freeaddrinfo(res);
if (fd == -1) { fprintf(stderr, "connect failed\n"); return 1; }
const char *req =
"GET / HTTP/1.1\r\n"
"Host: example.com\r\n"
"User-Agent: this-is-c/1.0\r\n"
"Accept: */*\r\n"
"Connection: close\r\n"
"\r\n";
send(fd, req, strlen(req), 0);
char buf[4096];
ssize_t n;
while ((n = recv(fd, buf, sizeof buf, 0)) > 0) {
fwrite(buf, 1, (size_t)n, stdout);
}
close(fd);
return 0;
}
Parsing a minimal HTTP request on the server
To serve HTTP you need to read until you reach the blank line that ends the headers. For a first cut you can split on \r\n\r\n then parse the request line. Production servers need robust parsers and careful limits.
#include <stddef.h> // size_t and NULL
// return pointer just after CRLFCRLF or NULL if not found
static const char *find_header_end(const char *s, size_t n) {
for (size_t i = 0; i + 3 < n; ++i) {
if (s[i] == '\r' && s[i+1] == '\n' && s[i+2] == '\r' && s[i+3] == '\n')
return s + i + 4;
}
return NULL;
}
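Once the end of the headers is known, the request line can be split into its three parts. The sketch below uses sscanf() with explicit field widths so that no buffer can overflow; struct request_line and parse_request_line are hypothetical names, and the field sizes are arbitrary limits chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

struct request_line {
    char method[8];
    char path[256];
    char version[16];
};

/* Parse "METHOD SP PATH SP VERSION" from the start of req.
   Returns 0 on success, -1 if the line is malformed. */
int parse_request_line(const char *req, struct request_line *out) {
    /* widths are one less than each buffer to leave room for '\0' */
    if (sscanf(req, "%7s %255s %15s", out->method, out->path, out->version) != 3)
        return -1;
    if (strncmp(out->version, "HTTP/", 5) != 0)
        return -1;
    return 0;
}
```

Tokens longer than the field widths are silently split by sscanf(), so a stricter parser would also reject over-long request lines.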
Building minimal REST-like endpoints
A REST-like endpoint returns a representation of a resource for a path. You can build a tiny server that maps paths such as /time or /echo?x=... to handler functions. The example below handles two endpoints and returns JSON. It avoids a full query string parser for clarity, and it keeps responses small.
Serving simple JSON from path-based handlers
Here is a compact server that accepts a connection, reads at most one request, picks a handler by prefix, and returns a JSON body with a correct Content-Type header. Error handling is trimmed to essentials to focus on the control flow.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <time.h>
static void send_response(int cfd, int status, const char *status_text, const char *body) {
char hdr[512];
int blen = (int)strlen(body);
int n = snprintf(hdr, sizeof hdr,
"HTTP/1.1 %d %s\r\n"
"Content-Type: application/json; charset=utf-8\r\n"
"Content-Length: %d\r\n"
"Connection: close\r\n"
"\r\n", status, status_text, blen);
send(cfd, hdr, (size_t)n, 0);
send(cfd, body, (size_t)blen, 0);
}
int main(void) {
struct addrinfo hints, *res, *p;
memset(&hints, 0, sizeof hints);
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags = AI_PASSIVE;
int rc = getaddrinfo(NULL, "8081", &hints, &res);
if (rc != 0) {
fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc)); return 1;
}
int fd = -1;
for (p = res; p; p = p->ai_next) {
fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
if (fd == -1) continue;
int yes = 1;
setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
if (bind(fd, p->ai_addr, p->ai_addrlen) == 0) break;
close(fd);
fd = -1;
}
freeaddrinfo(res);
if (fd == -1) { fprintf(stderr, "bind failed\n"); return 1; }
if (listen(fd, 16) == -1) { perror("listen"); return 1; }
printf("REST like server on http://127.0.0.1:8081/\n");
for (;;) {
int cfd = accept(fd, NULL, NULL);
if (cfd == -1) { perror("accept"); continue; }
char req[4096];
ssize_t n = recv(cfd, req, sizeof req - 1, 0);
if (n <= 0) { close(cfd); continue; }
req[n] = '\0';
// parse the request line method and path
char method[8], path[1024];
method[0] = path[0] = '\0';
sscanf(req, "%7s %1023s", method, path);
if (strcmp(method, "GET") != 0) {
send_response(cfd, 405, "Method Not Allowed", "{ \"error\": \"only GET\" }");
close(cfd);
continue;
}
if (strcmp(path, "/time") == 0) {
char body[256];
time_t t = time(NULL);
struct tm tmv;
gmtime_r(&t, &tmv);
char iso[64];
strftime(iso, sizeof iso, "%Y-%m-%dT%H:%M:%SZ", &tmv);
snprintf(body, sizeof body, "{ \"utc\": \"%s\" }", iso);
send_response(cfd, 200, "OK", body);
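} else if (strncmp(path, "/echo", 5) == 0) {
// naive echo of the path for demonstration; a real server must
// escape quotes and backslashes before embedding text in JSON
char body[512];
// %.480s caps the path so the closing quote and brace always fit
snprintf(body, sizeof body, "{ \"path\": \"%.480s\" }", path);
send_response(cfd, 200, "OK", body);
} else {
send_response(cfd, 404, "Not Found", "{ \"error\": \"not found\" }");
}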
close(cfd);
}
return 0;
}
A query string is a list of name=value pairs separated by &. Percent decoding turns %20 into a space. Keep buffers bounded and reject lines longer than your limit.
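Percent decoding with a bounded output buffer can be sketched as follows. The function name percent_decode is hypothetical. The sketch decodes %XX escapes only and rejects malformed escapes and output that would not fit; it does not translate + into a space.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode src into dst, which holds cap bytes including the terminator.
   Returns the decoded length, or -1 on a bad escape or lack of space. */
int percent_decode(const char *src, char *dst, size_t cap) {
    size_t o = 0;
    if (cap == 0) return -1;
    for (size_t i = 0; src[i] != '\0'; ++i) {
        int ch = (unsigned char)src[i];
        if (ch == '%') {
            int hi = hex_value((unsigned char)src[i + 1]);
            if (hi < 0) return -1;          /* also catches end of string */
            int lo = hex_value((unsigned char)src[i + 2]);
            if (lo < 0) return -1;
            ch = hi * 16 + lo;
            i += 2;                         /* skip the two hex digits */
        }
        if (o + 1 >= cap) return -1;        /* keep room for '\0' */
        dst[o++] = (char)ch;
    }
    dst[o] = '\0';
    return (int)o;
}
```

Checking the first hex digit before reading the second is what keeps a trailing % from reading past the end of the input.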
Adding minimal concurrency
To handle more than one client, you can accept in a loop and create a thread per connection. The function below sketches a simple thread wrapper. On POSIX you can use pthread_create(). On systems without pthreads you can use a small worker loop with select() that watches many sockets.
#include <pthread.h>
#include <stdlib.h> // malloc, free
#include <unistd.h> // close
struct client_arg { int fd; };
static void *serve(void *argp) {
int cfd = ((struct client_arg*)argp)->fd;
free(argp);
// handle one request then close (reuse the handler shown above)
// … copy the read and route logic here …
close(cfd);
return NULL;
}
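// inside the accept loop
int cfd = accept(fd, NULL, NULL);
if (cfd != -1) {
struct client_arg *a = malloc(sizeof *a);
if (!a) { close(cfd); continue; }
a->fd = cfd;
pthread_t tid;
if (pthread_create(&tid, NULL, serve, a) != 0) {
// the thread never started, so clean up here
free(a);
close(cfd);
} else {
pthread_detach(tid);
}
}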
Consider poll() or epoll for high concurrency once the design is proven.
Chapter 16: Cross-Platform Development
Writing code that compiles and runs on Linux, macOS, and Windows requires careful choice of APIs and disciplined use of conditional compilation. This chapter shows practical patterns for selecting the right headers and functions, for performing common file and directory tasks in a portable way, for understanding compiler and library differences, and for linking static or dynamic libraries on each platform without surprises.
#if blocks.
Conditional compilation for Linux, macOS, and Windows
Compilers define macros that identify the target operating system and toolchain. You can use these macros inside #if blocks to select the correct headers and functions. Prefer feature detection when possible; use platform detection when a feature is not present everywhere.
Recognizing platforms with predefined macros
Common predefined macros include _WIN32 for Windows (both 32 and 64 bit), __linux__ for Linux, and __APPLE__ together with __MACH__ for macOS. Toolchain macros include _MSC_VER for MSVC, __GNUC__ for GCC, and __clang__ for Clang.
#if defined(_WIN32)
/* Windows specific includes */
#include <windows.h>
#elif defined(__APPLE__)
/* macOS specific includes */
#include <TargetConditionals.h>
/* use POSIX headers as well */
#include <unistd.h>
#elif defined(__linux__)
/* Linux specific includes */
#include <unistd.h>
#else
#error "Unsupported platform"
#endif
Selecting headers and functions safely
When a function is only available on POSIX systems you can provide a Windows alternative. Keep the public function name the same and isolate differences behind the scenes.
int sleep_ms(unsigned ms) {
#if defined(_WIN32)
Sleep((DWORD)ms);
return 0;
#else
struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
return nanosleep(&ts, NULL);
#endif
}
Using feature test macros rather than guessing
Some POSIX functions require enabling symbols before including headers. For example you can set _POSIX_C_SOURCE to request specific interfaces. This avoids relying on nonstandard extensions.
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>
#include <time.h>
/* now POSIX interfaces such as clock_gettime may be declared */
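As a small illustration of the feature test macro in use, the sketch below reads a monotonic clock with clock_gettime(). The helper name monotonic_ms is hypothetical, and the macro must be defined before the first system header is included for it to take effect.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Milliseconds from a monotonic clock, or -1 if the clock is unavailable.
   Suitable for measuring intervals, not for wall-clock time. */
long long monotonic_ms(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) return -1;
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}
```

Subtracting two readings gives an elapsed time that is unaffected by changes to the system clock.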
Windows paths with non-ASCII characters need the W variants such as CreateFileW and _wfopen, plus conversion between UTF-8 and UTF-16.
Portable file and directory handling
Most file operations are portable with the C standard library. Directory traversal and some metadata differ by platform. The goal is to use standard I/O where possible and provide small shims for the parts that vary.
Opening, reading, and writing files portably
Use fopen(), fread(), fwrite(), fclose(), remove(), and rename(). Open binary files with the "b" flag: it prevents newline translation on Windows and has no effect on POSIX systems, so the same mode string works everywhere.
FILE *fp = fopen("data.bin", "wb");
if (!fp) {
perror("fopen");
/* handle error */
} else {
/* write bytes... */
fclose(fp);
}
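A binary round trip using only standard C can be sketched as below. The helper names write_bytes and read_bytes are hypothetical. Both use the "b" mode flag so that bytes such as 0x0D and 0x0A pass through unchanged on every platform.

```c
#include <stddef.h>
#include <stdio.h>

/* Write len bytes to path. Returns 0 on success, -1 on any failure. */
int write_bytes(const char *path, const void *data, size_t len) {
    FILE *fp = fopen(path, "wb");
    if (!fp) return -1;
    size_t put = fwrite(data, 1, len, fp);
    int rc = fclose(fp);               /* fclose can report a late write error */
    return (put == len && rc == 0) ? 0 : -1;
}

/* Read up to cap bytes from path. Returns the byte count, or -1 on error. */
long read_bytes(const char *path, void *buf, size_t cap) {
    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;
    size_t got = fread(buf, 1, cap, fp);
    int err = ferror(fp);
    fclose(fp);
    return err ? -1 : (long)got;
}
```

Checking the result of fclose() matters for writes because buffered data may only reach the file when the stream is closed.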
Querying file information with stat
stat() reports file size and type on POSIX systems. Windows has _stat and _wstat. You can wrap these behind a single helper that fills a common structure.
#if defined(_WIN32)
#include <sys/types.h>
#include <sys/stat.h>
#define STAT _stat
#else
#include <sys/stat.h>
#define STAT stat
#endif
long long file_size(const char *path) {
struct STAT st;
if (STAT(path, &st) != 0) return -1;
return (long long)st.st_size;
}
Walking directories on POSIX and Windows
POSIX provides opendir(), readdir(), and closedir(). Windows offers FindFirstFile, FindNextFile, and FindClose. A thin adapter lets your code iterate entries with the same callback signature.
#include <stdio.h>
#include <string.h>
#if !defined(_WIN32)
/* POSIX version */
#include <dirent.h>
int list_dir(const char *path) {
DIR *d = opendir(path);
if (!d) return -1;
struct dirent *e;
while ((e = readdir(d)) != NULL) {
if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0) continue;
printf("%s\n", e->d_name);
}
closedir(d);
return 0;
}
#else
/* Windows version */
#include <windows.h>
int list_dir(const char *path) {
char pattern[MAX_PATH];
snprintf(pattern, sizeof pattern, "%s\\*", path);
WIN32_FIND_DATAA ffd;
HANDLE h = FindFirstFileA(pattern, &ffd);
if (h == INVALID_HANDLE_VALUE) return -1;
do {
const char *name = ffd.cFileName;
if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0) continue;
printf("%s\n", name);
} while (FindNextFileA(h, &ffd));
FindClose(h);
return 0;
}
#endif
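The callback-style adapter mentioned above can be sketched for the POSIX side as follows. The names dir_entry_fn, for_each_entry, and count_entry are hypothetical; a Windows build would supply a function with the same signature built on FindFirstFile and FindNextFile.

```c
#include <dirent.h>
#include <string.h>

/* Callback invoked once per entry; a non-zero return stops the walk. */
typedef int (*dir_entry_fn)(const char *name, void *user);

/* Call fn for every entry in path except "." and "..".
   Returns -1 if the directory cannot be opened, otherwise the last
   callback result (0 when the walk ran to completion). */
int for_each_entry(const char *path, dir_entry_fn fn, void *user) {
    DIR *d = opendir(path);
    if (!d) return -1;
    struct dirent *e;
    int rc = 0;
    while (rc == 0 && (e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0) continue;
        rc = fn(e->d_name, user);
    }
    closedir(d);
    return rc;
}

/* Example callback: counts entries through the user pointer. */
int count_entry(const char *name, void *user) {
    (void)name;
    ++*(int *)user;
    return 0;
}
```

Calling for_each_entry(".", count_entry, &n) with an int n set to zero leaves the number of entries in n.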
Differences in compilers and standard libraries
The C language is standardized, yet compilers and libraries vary in extensions, warning behavior, and default modes. Understanding these differences helps you choose flags that keep builds clean and reproducible.
Recognizing compiler dialects and choosing warning levels
GCC and Clang accept many of the same flags; MSVC uses different names. Treat warnings as guidance and aim for zero warnings across compilers. Use a strict language mode such as C17 without extensions unless an extension is required.
# GCC or Clang
cc -std=c17 -Wall -Wextra -Wpedantic -O2 -o app app.c
# MSVC (Developer Command Prompt)
cl /std:c17 /W4 /O2 app.c
Library availability and POSIX functions
glibc and musl on Linux, and libSystem on macOS, include POSIX functions such as fork(), poll(), and getline(). MSVCRT on Windows does not implement many POSIX calls. Prefer portable C functions; where you need POSIX, provide Windows alternatives.
| Topic | GCC/Clang (Linux/macOS) | MSVC (Windows) |
| Language mode | -std=c17 or -std=c23 | /std:c17 or newer |
| Warnings | -Wall -Wextra -Wpedantic | /W4 or /Wall |
| POSIX I/O | Available; needs headers | Not available; use Win32 APIs |
| Threads | -pthread links pthreads | <threads.h> or Win32 threads |
| Sockets | BSD sockets in libc | Winsock; call WSAStartup |
Detecting compiler versions and working around bugs
You can branch on compiler version macros when a specific fix is required. Keep such workarounds isolated and remove them once the minimum supported version includes the fix.
#if defined(__GNUC__) && !defined(__clang__)
#if (__GNUC__ < 10)
/* apply workaround for GCC < 10 … */
#endif
#endif
Build with -fsanitize=address,undefined on GCC and Clang during testing to catch issues early.
Static and dynamic linking on each platform
Linking produces an executable by combining object files and libraries. Static libraries are archives that become part of the binary. Dynamic libraries are loaded at run time by the loader. Each platform uses different file extensions and search rules.
Library file types and naming conventions
Linux uses .a for static libraries and .so for shared libraries. macOS uses .a and .dylib (plus frameworks). Windows uses .lib for static libraries and import libraries, and .dll for dynamic libraries.
| Platform | Static | Dynamic | Example link flag |
| Linux | libfoo.a | libfoo.so | -Lpath -lfoo |
| macOS | libfoo.a | libfoo.dylib | -Lpath -lfoo or framework flags |
| Windows | foo.lib | foo.dll + foo.lib | foo.lib on the link line |
Linking on the command line
Use the compiler driver to link so that the correct runtime libraries are chosen. Order of objects and libraries matters on Unix-like linkers; place libraries after the objects that reference them.
# Linux or macOS shared link
cc -o app main.o util.o -Lthird_party/lib -lfoo
# Linux static link (may increase size)
cc -static -o app main.o -Lthird_party/lib -lfoo
# macOS with a framework
cc -o app main.o -framework CoreFoundation
# Windows MSVC link
cl /Fe:app.exe main.obj util.obj foo.lib
Loading dynamic libraries at run time
You can load a plugin at run time. POSIX systems use dlopen() and dlsym(). Windows uses LoadLibrary and GetProcAddress. Always check for errors and unload when finished.
#if defined(_WIN32)
#include <windows.h>
HMODULE h = LoadLibraryA("foo.dll");
if (!h) { /* handle error */ }
FARPROC sym = GetProcAddress(h, "foo_init");
/* call through a typed function pointer */
FreeLibrary(h);
#else
#include <dlfcn.h>
void *h = dlopen("libfoo.so", RTLD_NOW);
if (!h) { /* handle error */ }
void *sym = dlsym(h, "foo_init");
dlclose(h);
#endif
To help the loader find a shared library at run time: on Linux set LD_LIBRARY_PATH or link with -Wl,-rpath,<path>; on macOS use DYLD_LIBRARY_PATH or -Wl,-rpath; on Windows ensure the directory containing the .dll is on PATH or next to the executable.
Chapter 17: Debugging and Profiling
Even experienced C programmers spend much of their time debugging and tuning performance. Tools such as gdb, lldb, Valgrind, and gprof make this process systematic. This chapter explains how to inspect a running program, how to recognize and fix common runtime errors, how to find memory leaks, and how to profile function level performance to guide optimization.
Compile with -g while developing. This includes symbol names and line numbers that make your tools far more useful.
Using gdb and lldb
gdb (GNU Debugger) and lldb (LLVM Debugger) let you pause a program, inspect variables, step through code, and trace crashes. They read symbols from the compiled binary when you build with -g. Both share many concepts but differ slightly in command syntax.
Starting a program under the debugger
Compile with debug info first, then run the program inside the debugger shell. For gdb:
cc -g -O0 -o demo demo.c
gdb ./demo
(gdb) run arg1 arg2
For lldb (the default on macOS):
clang -g -O0 -o demo demo.c
lldb ./demo
(lldb) run
Setting breakpoints and stepping through code
A breakpoint stops execution at a chosen line or function so you can examine the program state.
(gdb) break main
(gdb) run
(gdb) next # step over
(gdb) step # step into
(gdb) continue # resume until next breakpoint
(gdb) print x # display variable value
lldb uses similar commands:
(lldb) breakpoint set --name main
(lldb) run
(lldb) next
(lldb) step
(lldb) frame variable x
Inspecting stack traces after a crash
When a program crashes, run it inside the debugger and type bt to show a backtrace. Each frame reveals the function chain leading to the failure.
(gdb) run
Program received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0 crash_here () at demo.c:10
#1 main () at demo.c:15
Builds compiled without -g remove symbol names, making debugging difficult. Keep separate debug builds even when releasing optimized code.
Common runtime errors and how to diagnose them
C gives you direct memory access but minimal runtime protection. This power means mistakes cause crashes or silent corruption. Learning to interpret error messages and patterns quickly is an essential skill.
Segmentation faults and invalid memory access
A segmentation fault occurs when the program touches an invalid address such as NULL or memory that has been freed. Use a debugger to find the offending line and print the pointer values involved.
(gdb) run
Program received signal SIGSEGV, Segmentation fault.
(gdb) bt
(gdb) print ptr
Run ulimit -c unlimited in the shell to allow a core dump on crash. You can then inspect it later with gdb ./program core.
Buffer overflows and string errors
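Writing past the end of an array corrupts nearby data. Always check array bounds and prefer bounded functions such as snprintf(). strncpy() also limits the copy, but it does not null-terminate the destination when the source is too long, so you must add the terminator yourself. Enable AddressSanitizer during compilation to detect these errors.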
cc -fsanitize=address -g -O1 -o app app.c
./app
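A bounded copy that reports truncation can be sketched with snprintf(). The helper name copy_string is hypothetical. snprintf() returns the length the full result would have had, so comparing that value against the buffer size reveals whether the text was cut short.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy src into dst, which holds cap bytes. The result is always
   null-terminated when cap > 0. Returns 0 if the whole string fit,
   -1 if it was truncated or an error occurred. */
int copy_string(char *dst, size_t cap, const char *src) {
    int n = snprintf(dst, cap, "%s", src);
    if (n < 0 || (size_t)n >= cap) return -1;
    return 0;
}
```

Treating truncation as an error, rather than silently continuing with a shortened string, avoids a class of subtle bugs in paths and protocol fields.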
Uninitialized variables and undefined values
Reading from uninitialized variables leads to unpredictable behavior. Modern compilers can warn about many cases with -Wall -Wextra, and tools such as Valgrind or Clang's MemorySanitizer (-fsanitize=memory) can detect it at runtime. Initialize every variable explicitly.
Mismatched malloc/free and resource leaks
Allocating memory without freeing it causes leaks, while freeing memory twice corrupts the heap. Tools like Valgrind (explained next) can reveal these problems automatically.
Memory leak detection with Valgrind
Valgrind is a dynamic analysis tool that simulates a CPU and monitors every memory access. It detects invalid reads, writes, and leaks, making it invaluable for debugging memory issues. It runs on Linux and macOS (Intel), though not on Windows natively.
Running a program under Valgrind
Compile with debugging symbols and minimal optimization, then run your program through Valgrind:
cc -g -O0 -o app app.c
valgrind --leak-check=full ./app
A typical leak report looks like this:
==12345== 20 bytes in 1 blocks are definitely lost in loss record 1 of 1
==12345== at 0x4846DEF: malloc (vg_replace_malloc.c:381)
==12345== by 0x109176: main (app.c:10)
The stack trace shows where memory was allocated but never freed. Fix it by calling free() on the same pointer before the function exits.
Detecting invalid memory access
Valgrind can also detect reads or writes beyond allocated blocks. These often appear as "Invalid write of size ..." messages. Each report includes the address, size, and stack trace of both the invalid access and the allocation site.
==54321== Invalid read of size 4
==54321== at 0x1091B2: print_item (list.c:42)
==54321== by 0x109246: main (app.c:18)
Profiling performance with gprof
Profiling identifies where a program spends its time so you can optimize the right functions. gprof instruments function calls and counts how often each is invoked and how much time they consume. The results guide targeted improvements rather than blind rewriting.
Compiling with profiling support
To generate profiling data, compile and link with the -pg option, then run the program normally to create a gmon.out file.
cc -pg -O2 -o sortdemo sortdemo.c
./sortdemo
This file records how often functions are called and their approximate runtime cost.
Generating and reading a gprof report
After running the program, execute gprof to produce a readable summary:
gprof ./sortdemo gmon.out > report.txt
less report.txt
The report includes a flat profile and a call graph. The flat profile lists functions sorted by total time percentage, helping you find bottlenecks quickly.
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 45.0      0.09     0.09    10000     0.00     0.00  quicksort
 35.0      0.16     0.07     9999     0.00     0.00  partition
 20.0      0.20     0.04        1     0.04     0.20  main
Interpreting profiling results and optimizing safely
Focus optimization on functions that dominate runtime. Often only a small part of the code causes most of the delay. Avoid premature optimization; make changes only after measuring. Rebuild and rerun the profiler after each major change to confirm improvement.
gprof's timing comes from periodic sampling, so it is approximate. For finer granularity you can use perf on Linux or Instruments on macOS to supplement traditional profiling.
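For a quick before-and-after check of a single function, a small timing harness built on the standard clock() function is often enough. The sketch below is illustrative only: cpu_seconds and busy_work are hypothetical names, and busy_work stands in for whatever code you are measuring.

```c
#include <time.h>

/* CPU seconds spent running fn() reps times, measured with clock(). */
double cpu_seconds(void (*fn)(void), int reps) {
    clock_t start = clock();
    for (int i = 0; i < reps; ++i) fn();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* The volatile sink keeps the compiler from removing the loop. */
static volatile unsigned long sink;

/* Placeholder workload; replace with the function under test. */
void busy_work(void) {
    unsigned long s = 0;
    for (unsigned long i = 0; i < 100000UL; ++i) s += i * i;
    sink = s;
}
```

Run the harness several times and compare typical values; a single measurement is easily distorted by other activity on the machine.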
Chapter 18: Interfacing with Other Languages
C is often used as a common denominator between programming languages. Many interpreters and runtimes are written in C or provide APIs to call C code directly. In this chapter you will see how to build shared libraries callable from Python with ctypes, how to integrate C with Java through the Java Native Interface (JNI), and how to embed C modules into scripting environments for customization or performance.
Creating shared libraries callable from Python (ctypes)
Python’s ctypes module lets you call C functions from shared libraries (.so, .dylib, or .dll) without writing a Python extension module. You simply define the functions in C, compile them into a shared library, and load that library at runtime in Python.
Defining a simple C library
This example defines a few functions to be called from Python. You must export them with the correct linkage attribute on Windows, and regular extern linkage on Unix-like systems.
#include <stdio.h>
#ifdef _WIN32
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif
EXPORT int add(int a, int b) {
return a + b;
}
EXPORT void hello(const char *name) {
printf("Hello, %s!\n", name);
}
Compile it as a shared library:
# Linux
cc -shared -fPIC -o libexample.so example.c
# macOS
cc -shared -fPIC -o libexample.dylib example.c
# Windows (MSVC)
cl /LD example.c
Loading and calling from Python
In Python you can load and use the compiled library easily:
import ctypes
# adjust path or name to match your system
lib = ctypes.CDLL("./libexample.so")
# declare argument and return types
lib.add.argtypes = (ctypes.c_int, ctypes.c_int)
lib.add.restype = ctypes.c_int
result = lib.add(2, 3)
print("2 + 3 =", result)
lib.hello(b"Python")
On Windows, place the .dll in the same directory as your script or add it to the system PATH. Also ensure calling conventions match; ctypes defaults to cdecl.
Passing arrays and structures
ctypes can map Python lists and structures to C arrays and structs. The layout must match exactly.
# Python side
class Point(ctypes.Structure):
_fields_ = [("x", ctypes.c_double),
("y", ctypes.c_double)]
p = Point(2.0, 3.5)
lib.process_point.argtypes = [ctypes.POINTER(Point)]
lib.process_point(ctypes.byref(p))
// C side
struct Point { double x; double y; };
EXPORT void process_point(struct Point *p) {
printf("x=%.2f y=%.2f\n", p->x, p->y);
}
Compare C’s sizeof() and Python’s ctypes.sizeof() to confirm alignment.
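On the C side, the layout check can be sketched with sizeof and offsetof. The helper functions below are hypothetical. The printed numbers can be compared by hand with ctypes.sizeof(Point) on the Python side. On mainstream ABIs a struct of two doubles has no padding, but that is an assumption worth verifying on your own target.

```c
#include <stddef.h>
#include <stdio.h>

struct Point { double x; double y; };

/* Total size of the struct, including any padding. */
size_t point_size(void) { return sizeof(struct Point); }

/* Byte offset of the y member from the start of the struct. */
size_t point_y_offset(void) { return offsetof(struct Point, y); }

/* Print the layout so it can be compared with the Python definition. */
void print_point_layout(void) {
    printf("sizeof=%zu offset_x=%zu offset_y=%zu\n",
           point_size(), offsetof(struct Point, x), point_y_offset());
}
```

If the sizes or offsets differ between the two sides, fix the field order or types before passing any pointers across the boundary.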
Calling C from Java (JNI)
The Java Native Interface (JNI) is Java’s bridge to native code. It lets Java call C or C++ functions and lets native code call back into the JVM. JNI requires generating a header from a Java class, implementing the C functions, and loading the resulting shared library.
Declaring native methods in Java
Start with a Java class that declares native methods and loads the library:
public class HelloJNI {
static {
System.loadLibrary("hello");
}
private native void sayHello(String name);
public static void main(String[] args) {
new HelloJNI().sayHello("Java");
}
}
Compile and generate a header file:
javac HelloJNI.java
javah -jni HelloJNI # JDK 9 and earlier; javah was removed in JDK 10
# or, on newer JDKs:
javac -h . HelloJNI.java
Implementing the native functions in C
The generated header defines function signatures matching the Java class and method names. You must include jni.h from the JDK and use the correct naming convention.
#include <jni.h>
#include <stdio.h>
JNIEXPORT void JNICALL Java_HelloJNI_sayHello(JNIEnv *env, jobject obj, jstring name) {
const char *cname = (*env)->GetStringUTFChars(env, name, NULL);
printf("Hello from C, %s!\n", cname);
(*env)->ReleaseStringUTFChars(env, name, cname);
}
Compile and link against the JNI headers and libraries:
# Linux (on macOS use include/darwin and name the output libhello.dylib)
cc -I"$JAVA_HOME/include" -I"$JAVA_HOME/include/linux" -fPIC -shared \
-o libhello.so HelloJNI.c
# Windows (MSVC)
cl /I "%JAVA_HOME%\include" /I "%JAVA_HOME%\include\win32" /LD HelloJNI.c
When you run the Java program, for example with java -Djava.library.path=. HelloJNI, it loads libhello.so (libhello.dylib on macOS, hello.dll on Windows) and calls the native method.
Returning values to Java
Native functions can return primitives or strings. For example, returning a new string:
JNIEXPORT jstring JNICALL Java_HelloJNI_greet(JNIEnv *env, jobject obj) {
return (*env)->NewStringUTF(env, "Greetings from C");
}
Then in Java:
public native String greet();
Always release strings obtained from GetStringUTFChars() or array pointers that you acquire from the JVM to avoid leaks.
Embedding C in scripting environments
Instead of calling C from another language, sometimes you embed a scripting engine inside a C program to make it extensible. Many engines provide clean C APIs for loading scripts, registering native functions, and executing code dynamically.
Embedding Python with its C API
You can include Python inside a C application to add scripting capabilities. Link against the Python library and use its API to initialize the interpreter and call Python code.
#include <Python.h>
int main(void) {
Py_Initialize();
PyRun_SimpleString("print('Hello from embedded Python')");
Py_Finalize();
return 0;
}
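Compile with the Python development headers and libraries. Paths and version numbers vary by system; python3-config --cflags and python3-config --ldflags --embed print suitable flags on most installations: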
cc embed.c -I/usr/include/python3.12 -lpython3.12 -o embed
Using Lua as a lightweight embedded language
Lua is designed for embedding. It has a small footprint and a simple C API. After linking with the Lua library, you can execute Lua scripts and register C functions that scripts can call back.
#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>
static int c_add(lua_State *L) {
double a = luaL_checknumber(L, 1);
double b = luaL_checknumber(L, 2);
lua_pushnumber(L, a + b);
return 1;
}
int main(void) {
lua_State *L = luaL_newstate();
luaL_openlibs(L);
lua_register(L, "c_add", c_add);
luaL_dostring(L, "print('Sum:', c_add(2, 3))");
lua_close(L);
return 0;
}
Choosing an embedding strategy
Embedding a scripting engine in a C program trades raw performance for flexibility. If you need high performance, keep heavy computation in C and expose only small, well-defined entry points to the script. If flexibility matters most, expose more functions and allow the script to orchestrate behavior while C handles speed-critical work.
Chapter 19: Advanced Topics
This chapter gathers a set of advanced techniques that push C closer to the hardware and to the operating system. You will be shaping memory at the bit level, invoking compiler features that produce faster code, coordinating threads with shared state, and interfacing with devices in constrained environments. Each section provides focused guidance, compact examples, and practical cautions for production work.
Bitfields and low-level data structures
Bitfields allow you to map individual bits or small ranges of bits inside a storage unit such as an unsigned int. This can mirror hardware registers or compact binary protocols. The layout is sensitive to implementation choices; you should treat bitfield order and packing as implementation-defined unless you control all build variables.
Defining and packing bitfields safely
A bitfield is declared inside a struct with a colon and a width in bits. Choose an explicit underlying type to communicate intent. Avoid mixing signed and unsigned fields inside the same unit unless you truly need signed interpretation.
typedef struct {
unsigned mode : 3; /* values 0..7 */
unsigned ready : 1; /* boolean flag */
unsigned reserved : 4; /* must be zero */
unsigned payload : 8; /* small value */
} Control;
Compilers may insert padding between units to satisfy alignment. If you need a fixed binary representation for I/O, prefer manual masking with integers rather than relying on compiler packing.
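As a minimal sketch of that manual approach, the same four fields can be encoded into a 16-bit value with explicit shifts. The bit positions, the ControlFields type, and the function names here are assumptions chosen for illustration; a real protocol or datasheet dictates its own layout.

```c
#include <stdint.h>

/* Hypothetical wire layout, chosen for illustration:
   bits 15..13 mode, bit 12 ready, bits 11..8 reserved (zero), bits 7..0 payload. */
typedef struct {
    unsigned mode;     /* 0..7 */
    unsigned ready;    /* 0 or 1 */
    unsigned payload;  /* 0..255 */
} ControlFields;

/* Encode the fields into a 16-bit value with an explicit layout. */
static uint16_t control_encode(ControlFields c) {
    return (uint16_t)(((c.mode & 0x7u) << 13) |
                      ((c.ready & 0x1u) << 12) |
                      (c.payload & 0xFFu));
}

/* Decode a 16-bit value; the reserved bits are ignored. */
static ControlFields control_decode(uint16_t w) {
    ControlFields c;
    c.mode = (w >> 13) & 0x7u;
    c.ready = (w >> 12) & 0x1u;
    c.payload = w & 0xFFu;
    return c;
}

/* Store the value in big-endian byte order so the bytes written to a file
   or socket do not depend on the host CPU. */
static void control_to_bytes(uint16_t w, unsigned char out[2]) {
    out[0] = (unsigned char)(w >> 8);
    out[1] = (unsigned char)(w & 0xFFu);
}
```

Because every shift and mask is spelled out, the encoded bytes stay the same whichever compiler builds the code.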
Reading and writing bits with masks
Masking with shifts is portable and predictable. It is ideal when you exchange bytes with files, sockets, or hardware ports.
/* pack fields into one byte */
unsigned char pack_fields(unsigned mode, unsigned ready) {
unsigned char b = 0;
b |= (mode & 0x7u) << 5; /* top 3 bits */
b |= (ready & 0x1u) << 4; /* next bit */
/* remaining 4 bits left as zero */
return b;
}
/* extract the same fields */
void unpack_fields(unsigned char b, unsigned *mode, unsigned *ready) {
*mode = (b >> 5) & 0x7u;
*ready = (b >> 4) & 0x1u;
}
Modeling protocols and registers clearly
For device registers, make the mapping explicit with integral types plus named constants. This improves clarity and cross-compiler stability.
#define CTRL_MODE_MASK 0xE0u
#define CTRL_READY_MASK 0x10u
static inline unsigned ctrl_get_mode(unsigned char b) {
return (b & CTRL_MODE_MASK) >> 5;
}
static inline int ctrl_is_ready(unsigned char b) {
return (b & CTRL_READY_MASK) != 0u;
}
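The getters above have natural counterparts for writing. The following sketch shows a read-modify-write that replaces one field and leaves the other bits untouched; ctrl_set_mode, ctrl_set_ready, and CTRL_MODE_SHIFT are illustrative names, not part of any real device header.

```c
/* Same masks as in the text, repeated so the sketch stands alone. */
#define CTRL_MODE_MASK  0xE0u
#define CTRL_MODE_SHIFT 5
#define CTRL_READY_MASK 0x10u

/* Return b with the mode field replaced; all other bits are preserved. */
static inline unsigned char ctrl_set_mode(unsigned char b, unsigned mode) {
    unsigned cleared = b & ~CTRL_MODE_MASK;
    unsigned field = (mode << CTRL_MODE_SHIFT) & CTRL_MODE_MASK;
    return (unsigned char)(cleared | field);
}

/* Return b with the ready flag set or cleared. */
static inline unsigned char ctrl_set_ready(unsigned char b, int ready) {
    return (unsigned char)(ready ? (b | CTRL_READY_MASK)
                                 : (b & ~CTRL_READY_MASK));
}
```

Returning the new value instead of writing through a pointer keeps the helpers easy to test on a host machine before they touch a real register.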
Inline assembly and compiler intrinsics
Inline assembly and intrinsics allow a program to use specific instructions or memory barriers without leaving C. Intrinsics expose selected instructions as ordinary functions; inline assembly grants full control at the cost of portability. Start with intrinsics; reach for inline assembly when there is no intrinsic and you can constrain the target set.
Using intrinsics for performance and clarity
Modern compilers provide built-ins for bit scans, population counts, byte swaps, and atomic operations. These map to single instructions on many targets and degrade gracefully where support is absent.
#include <stdint.h>
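/* GCC and Clang built-ins; MSVC offers differently named intrinsics */
void intrinsics_demo(void) {
int ones = __builtin_popcount(0xF0F0F0F0u); /* 16 */
int leadz = __builtin_clz(0x00100000u); /* 11 with 32-bit int; undefined for 0 */
uint32_t swapped = __builtin_bswap32(0x11223344u); /* 0x44332211 */
/* branch prediction hint: the condition is expected to be false */
if (__builtin_expect(ones > 16, 0)) {
/* unlikely path */
}
(void)leadz; (void)swapped; /* silence unused-variable warnings */
}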
Intrinsics are preferable when available because they participate in optimization and register allocation. They also communicate intent to readers.
Writing minimal inline assembly blocks
Inline assembly varies by compiler. The following shows a small example that reads the time-stamp counter on x86 using GNU C syntax. Constraints and clobbers describe how assembly interacts with C variables and registers.
static inline unsigned long long rdtsc(void) {
unsigned int lo, hi;
__asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi) : : "memory");
return ((unsigned long long)hi << 32) | lo;
}
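Mark a block as volatile when the compiler must not delete it or merge repeated executions of it. Add a "memory" clobber when memory accesses must not be moved across the block. Declare all clobbered registers and memory effects to avoid miscompilation.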
Guarding with feature checks and fallbacks
Use conditional compilation with predefined macros to select intrinsics or assembly according to the target. Provide a readable and correct fallback implementation.
#if defined(__has_builtin)
# if __has_builtin(__builtin_bswap32)
# define HAVE_BSWAP 1
# endif
#endif
static inline uint32_t my_bswap32(uint32_t x) {
#if defined(HAVE_BSWAP) || defined(__GNUC__)
return __builtin_bswap32(x);
#else
return ((x & 0x000000FFu) << 24) |
((x & 0x0000FF00u) << 8) |
((x & 0x00FF0000u) >> 8) |
((x & 0xFF000000u) >> 24);
#endif
}
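The same pattern extends to other built-ins. Here is a sketch for population count, with a portable loop as the fallback; the names HAVE_POPCOUNT, popcount32_portable, and my_popcount32 are made up for this example.

```c
#include <stdint.h>

#if defined(__has_builtin)
#  if __has_builtin(__builtin_popcount)
#    define HAVE_POPCOUNT 1
#  endif
#elif defined(__GNUC__)
#  define HAVE_POPCOUNT 1
#endif

/* Portable fallback: clear the lowest set bit until none remain. */
static inline int popcount32_portable(uint32_t x) {
    int n = 0;
    while (x != 0u) {
        x &= x - 1u;
        ++n;
    }
    return n;
}

/* Use the built-in where the compiler provides it, else the loop. */
static inline int my_popcount32(uint32_t x) {
#if defined(HAVE_POPCOUNT)
    return __builtin_popcount(x);
#else
    return popcount32_portable(x);
#endif
}
```

Keeping the fallback as a separate function lets you test it directly, even on a compiler where the built-in path is the one that gets selected.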
Threading and concurrency (POSIX threads overview)
Threads allow a program to perform multiple activities within one process. The POSIX threads API, known as pthreads, provides primitives for thread creation, mutual exclusion, condition waiting, and thread local storage. Careful design avoids data races and deadlocks while delivering scalability.
Creating threads and joining them
The basic lifecycle consists of creating a thread with a function pointer, passing a context pointer, and joining to collect completion. Return values travel back through pthread_join or through shared state guarded by a mutex.
#include <pthread.h>
#include <stdio.h>
void *worker(void *arg) {
int id = *(int *)arg;
printf("thread %d running\n", id);
return NULL;
}
int main(void) {
pthread_t t1, t2;
int a = 1, b = 2;
pthread_create(&t1, NULL, worker, &a);
pthread_create(&t2, NULL, worker, &b);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
return 0;
}
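The listing above discards the workers' results. As a sketch of how a result can travel back through pthread_join, each worker below fills in its own job structure and returns a pointer to it. SumJob, sum_worker, and parallel_sum are illustrative names; on most toolchains you would build with the -pthread flag.

```c
#include <pthread.h>

/* Hypothetical job description: sum the integers in [lo, hi). */
typedef struct {
    int lo, hi;
    long sum;   /* written by the worker, read after join */
} SumJob;

static void *sum_worker(void *arg) {
    SumJob *job = arg;
    long s = 0;
    for (int i = job->lo; i < job->hi; ++i) s += i;
    job->sum = s;
    return job;            /* pthread_join hands this pointer back */
}

/* Split [0, n) across two threads; returns 0 on success, -1 on error. */
static int parallel_sum(int n, long *out) {
    SumJob a = { 0, n / 2, 0 }, b = { n / 2, n, 0 };
    pthread_t ta, tb;
    if (pthread_create(&ta, NULL, sum_worker, &a) != 0) return -1;
    if (pthread_create(&tb, NULL, sum_worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    void *ra = NULL, *rb = NULL;
    pthread_join(ta, &ra);
    pthread_join(tb, &rb);
    *out = ((SumJob *)ra)->sum + ((SumJob *)rb)->sum;
    return 0;
}
```

Each thread writes only to its own SumJob, and the caller reads the sums only after joining, so no mutex is needed here.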
Sharing data with mutexes and condition variables
Use a mutex to guard shared data; use a condition variable to wait until a predicate becomes true. Always check the predicate inside a loop, since pthread_cond_wait can return spuriously, without any matching signal.
#include <pthread.h>
typedef struct {
pthread_mutex_t mu;
pthread_cond_t cv;
int ready;
} Gate;
void gate_init(Gate *g) {
pthread_mutex_init(&g->mu, NULL);
pthread_cond_init(&g->cv, NULL);
g->ready = 0;
}
void gate_wait(Gate *g) {
pthread_mutex_lock(&g->mu);
while (!g->ready) {
pthread_cond_wait(&g->cv, &g->mu);
}
pthread_mutex_unlock(&g->mu);
}
void gate_open(Gate *g) {
pthread_mutex_lock(&g->mu);
g->ready = 1;
pthread_cond_broadcast(&g->cv);
pthread_mutex_unlock(&g->mu);
}
Avoiding deadlocks and data races
Adopt a single global lock ordering for multi-mutex operations. Introduce fine-grained locks only when profiling shows contention. Use thread-local storage for per-thread data that is never shared, and atomic variables for small counters that many threads update.
#include <stdatomic.h>
_Atomic unsigned long tasks_done = 0;
void mark_done(void) {
atomic_fetch_add_explicit(&tasks_done, 1, memory_order_relaxed);
}
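One common way to get a global lock ordering is to sort the mutexes by address before locking. The sketch below applies that idea to a transfer between two accounts; Account, lock_pair, unlock_pair, and transfer are hypothetical names used only for this example.

```c
#include <pthread.h>
#include <stdint.h>

typedef struct {
    pthread_mutex_t mu;
    long balance;
} Account;

/* Lock both accounts in a fixed order (lowest address first) so that two
   concurrent transfers in opposite directions cannot deadlock. */
static void lock_pair(Account *x, Account *y) {
    if ((uintptr_t)x > (uintptr_t)y) { Account *t = x; x = y; y = t; }
    pthread_mutex_lock(&x->mu);
    pthread_mutex_lock(&y->mu);
}

static void unlock_pair(Account *x, Account *y) {
    pthread_mutex_unlock(&x->mu);
    pthread_mutex_unlock(&y->mu);
}

/* Move amount from src to dst; returns 0 on success, -1 if funds are short
   or the two accounts are the same object. */
static int transfer(Account *src, Account *dst, long amount) {
    if (src == dst) return -1;
    lock_pair(src, dst);
    int rc = -1;
    if (src->balance >= amount) {
        src->balance -= amount;
        dst->balance += amount;
        rc = 0;
    }
    unlock_pair(src, dst);
    return rc;
}
```

The order of unlocking does not matter for correctness; only the order of acquisition has to be consistent across all threads.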
Working with hardware and embedded systems
Embedded programming constrains memory, timing, and power. You write code that touches memory-mapped registers, controls interrupts, and cooperates with small real-time kernels or bare-metal loops. Determinism, clarity, and measured use of resources guide every design choice.
Accessing memory-mapped I/O registers
Hardware registers appear at fixed addresses. Use volatile to prevent the compiler from optimizing away required reads or writes. Prefer descriptive names and wrap addresses behind small inline helpers.
#include <stdint.h>
#define UART0_BASE ((uintptr_t)0x40000000u)
#define UART_DR (*(volatile uint32_t *)(UART0_BASE + 0x00u))
#define UART_SR (*(volatile uint32_t *)(UART0_BASE + 0x04u))
#define UART_TX_READY (1u << 5)
static inline void uart_putc(char c) {
while ((UART_SR & UART_TX_READY) == 0u) { /* wait */ }
UART_DR = (uint32_t)(unsigned char)c;
}
Only apply volatile to the object representing the device register. Keep regular variables non-volatile so the optimizer can still improve surrounding code.
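An alternative to one macro per register is a struct that overlays the register block. The sketch below assumes the same two offsets as the listing above; UartRegs and uart_try_putc are illustrative names, and the struct layout relies on the compiler adding no padding between two uint32_t members, which holds on common targets.

```c
#include <stdint.h>

/* Hypothetical register block matching the offsets used above:
   data register at +0x00, status register at +0x04. */
typedef struct {
    volatile uint32_t dr;
    volatile uint32_t sr;
} UartRegs;

#define UART_TX_READY (1u << 5)

/* Try to send one byte; returns 1 if written, 0 if the transmitter is busy.
   Taking the register block as a parameter lets host tests pass a plain
   struct instead of a fixed hardware address. */
static inline int uart_try_putc(UartRegs *u, char c) {
    if ((u->sr & UART_TX_READY) == 0u) return 0;
    u->dr = (uint32_t)(unsigned char)c;
    return 1;
}
```

On the target you would point the same function at the device, for example with a macro such as `#define UART0 ((UartRegs *)0x40000000u)`, reusing the base address from the listing above.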
Structuring interrupt-safe code paths
Interrupt handlers should be small and predictable. Move heavy work into a deferred context such as a task or a work queue. Share data using lock-free ring buffers or flags that the main loop polls, while guarding multi-byte updates with simple critical sections.
/* pseudo-code; platform glue elided … */
volatile unsigned char rx_buf[256];
volatile unsigned int rx_head = 0, rx_tail = 0;
void isr_uart_rx(void) {
unsigned char b = (unsigned char)UART_DR;
rx_buf[rx_head++ & 255u] = b; /* trivial ring buffer */
}
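The handler above never checks for a full buffer, and the consumer side is not shown. Here is a fuller sketch of a single-producer, single-consumer ring with both ends. It assumes a single-core target where the interrupt handler is the only producer and the main loop the only consumer; on multi-core systems volatile is not enough and you would need atomics or barriers. RxRing, rx_push, and rx_pop are illustrative names.

```c
#define RX_SIZE 256u                 /* must be a power of two */

typedef struct {
    volatile unsigned char buf[RX_SIZE];
    volatile unsigned int head;      /* written only by the producer (ISR) */
    volatile unsigned int tail;      /* written only by the consumer */
} RxRing;

/* Producer side, called from the interrupt handler.
   Returns 0 and drops the byte when the buffer is full. */
static inline int rx_push(RxRing *r, unsigned char b) {
    if (r->head - r->tail == RX_SIZE) return 0;
    r->buf[r->head & (RX_SIZE - 1u)] = b;
    r->head = r->head + 1u;
    return 1;
}

/* Consumer side, called from the main loop.
   Returns 1 and stores a byte in *out, or 0 when the buffer is empty. */
static inline int rx_pop(RxRing *r, unsigned char *out) {
    if (r->head == r->tail) return 0;
    *out = r->buf[r->tail & (RX_SIZE - 1u)];
    r->tail = r->tail + 1u;
    return 1;
}
```

The counters run freely and are masked only when indexing, so head minus tail always gives the number of stored bytes, even after the counters wrap.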
Managing resources in constrained builds
Constrained targets reward simple allocators and static storage. Prefer fixed-size pools, compile-time configuration, and careful logging that can be disabled. Validate all error returns from drivers; transient faults are normal in real hardware.
#ifndef LOG_LEVEL
#define LOG_LEVEL 1 /* 0=off, 1=errors, 2=info */
#endif
#if LOG_LEVEL >= 1
#define LOGE(msg) do { uart_putc('!'); /* write msg … */ } while (0)
#else
#define LOGE(msg) do { } while (0)
#endif
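The fixed-size pools mentioned above can be very small. The following sketch keeps a free list threaded through the unused blocks themselves; the block count, block size, and the pool_ names are assumptions for illustration, and the functions are not interrupt-safe as written.

```c
#include <stddef.h>

#define POOL_BLOCKS 8u
#define POOL_BLOCK_SIZE 32u

/* Each free block stores a pointer to the next free block in its own
   storage, so the pool needs no extra bookkeeping memory. */
typedef union PoolBlock {
    union PoolBlock *next;
    unsigned char bytes[POOL_BLOCK_SIZE];
} PoolBlock;

static PoolBlock pool_storage[POOL_BLOCKS];
static PoolBlock *pool_free_list;

/* Link every block into the free list. Call once at startup. */
static void pool_init(void) {
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCKS; ++i) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Return one block, or NULL when the pool is exhausted. */
static void *pool_alloc(void) {
    PoolBlock *b = pool_free_list;
    if (b != NULL) pool_free_list = b->next;
    return b;
}

/* Give back a block previously obtained from pool_alloc(). */
static void pool_free(void *p) {
    PoolBlock *b = p;
    if (b == NULL) return;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

Allocation and release are both constant time, and the worst-case memory use is known at compile time, which is what constrained targets usually need.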
These advanced techniques let you shape bits, guide the compiler, coordinate threads, and tame embedded devices. Use them with measured intent and with tests that prove behavior across compilers and targets.
Chapter 20: Final Project and Next Steps
This chapter brings the threads together by building a small cross-platform command-line program, then extending it with persistence or networking. You will package the result with a tidy Makefile and documentation, then consider testing and standards before choosing a path forward. The goal is to finish with a polished artifact that runs on Linux, macOS, and Windows.
Building a small cross-platform command-line app
The project is a minimal to-do list utility named xptodo. It stores one item per line in a text file under the user profile directory; it supports commands add, list, and done. The design favors portability; the program relies on the C standard library, uses platform guards only for locating the home directory, and avoids nonstandard console APIs.
Designing a portable structure
Keep one translation unit for the command and a small header for shared declarations. The only platform difference is the function that returns a writable path for the data file. Everything else is ordinary C that compiles everywhere.
/* file: xptodo.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static const char *data_path(void) {
#if defined(_WIN32)
const char *base = getenv("USERPROFILE");
if (!base) base = ".";
/* returns a pointer to static storage for simplicity */
static char buf[1024];
snprintf(buf, sizeof buf, "%s\\xptodo.txt", base);
return buf;
#else
const char *base = getenv("HOME");
if (!base) base = ".";
static char buf[1024];
snprintf(buf, sizeof buf, "%s/.xptodo", base);
return buf;
#endif
}
static int cmd_list(void) {
FILE *f = fopen(data_path(), "r");
if (!f) { puts("(empty)"); return 0; }
char line[1024];
int n = 1;
while (fgets(line, sizeof line, f)) {
size_t len = strlen(line);
if (len > 0 && line[len - 1] == '\n') line[len - 1] = '\0';
printf("%d: %s\n", n++, line);
}
fclose(f);
return 0;
}
static int cmd_add(const char *text) {
FILE *f = fopen(data_path(), "a");
if (!f) { perror("open"); return 1; }
fprintf(f, "%s\n", text);
fclose(f);
return 0;
}
static int cmd_done(int index) {
FILE *in = fopen(data_path(), "r");
if (!in) { perror("open"); return 1; }
FILE *out = tmpfile();
if (!out) { perror("tmpfile"); fclose(in); return 1; }
char line[1024];
int n = 1, removed = 0;
while (fgets(line, sizeof line, in)) {
if (n++ == index) { removed = 1; continue; }
fputs(line, out);
}
rewind(out);
in = freopen(data_path(), "w", in); /* reopen original for writing */
if (!in) { perror("reopen"); fclose(out); return 1; }
while (fgets(line, sizeof line, out)) fputs(line, in);
fclose(in);
fclose(out);
if (!removed) { fprintf(stderr, "no such item\n"); return 1; }
return 0;
}
int main(int argc, char **argv) {
if (argc < 2) {
fprintf(stderr, "usage: %s add <text> | list | done <n>\n", argv[0]);
return 1;
}
if (strcmp(argv[1], "list") == 0) {
return cmd_list();
} else if (strcmp(argv[1], "add") == 0) {
if (argc < 3) { fputs("missing text\n", stderr); return 1; }
/* join remaining args separated by spaces */
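char buf[2048] = {0};
size_t used = 0;
for (int i = 2; i < argc; ++i) {
/* snprintf never writes past the space that is left */
int n = snprintf(buf + used, sizeof buf - used, "%s%s", i > 2 ? " " : "", argv[i]);
if (n < 0 || (size_t)n >= sizeof buf - used) { fputs("text too long\n", stderr); return 1; }
used += (size_t)n;
}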
return cmd_add(buf);
} else if (strcmp(argv[1], "done") == 0) {
if (argc < 3) { fputs("missing index\n", stderr); return 1; }
return cmd_done(atoi(argv[2]));
}
fputs("unknown command\n", stderr);
return 1;
}
Confine platform checks to small helpers such as data_path(). This keeps the rest of the program clean and testable without preprocessor branches.
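One weak spot in the listing is the done command, which passes its argument through atoi(). atoi() cannot report errors, so input such as "2x" is silently read as 2. A stricter parser based on strtol() could look like the sketch below; parse_index is a hypothetical helper, not part of the program above.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a 1-based item index. Returns 0 and stores the value in *out on
   success; returns -1 for empty input, trailing junk, or out-of-range values. */
static int parse_index(const char *s, int *out) {
    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0') return -1;   /* no digits, or junk after them */
    if (errno == ERANGE || v < 1 || v > INT_MAX) return -1;
    *out = (int)v;
    return 0;
}
```

In main() you would call parse_index(argv[2], &n) and print a usage message when it fails, instead of handing an unchecked value to cmd_done().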
Compiling and running across platforms
On Linux or macOS you can compile with cc -std=c17 -Wall -Wextra -O2 xptodo.c -o xptodo. On Windows you can use MSVC with cl /std:c17 /W4 /O2 xptodo.c or MinGW with gcc -std=c17 -Wall -Wextra -O2 xptodo.c -o xptodo.exe. The program writes its data file to a sensible default location on each system.
Extending it with networking or file persistence
The first extension adds simple persistence with a lock file to prevent clobbering when multiple instances run. The optional alternative adds a tiny networking feature to fetch a remote list; networking needs separate code paths for POSIX sockets and Winsock.
Adding safe file persistence with a lock
Use an advisory lock mechanism when available; fall back to a rudimentary lock file. This keeps the design portable enough for everyday use.
/* naive lock file; suitable for a single user on one machine */
static int with_lock(int (*fn)(void *), void *arg) {
char lockpath[1024];
snprintf(lockpath, sizeof lockpath, "%s.lock", data_path());
FILE *lk = fopen(lockpath, "wx"); /* fail if exists; not on MSVC … */
if (!lk) { fputs("busy; try again\n", stderr); return 1; }
int rc = fn(arg);
fclose(lk);
remove(lockpath);
return rc;
}
For stronger guarantees, use flock on BSD-like systems or LockFile on Windows, or use a small single-process daemon that arbitrates access.
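As a sketch of the advisory-lock route on systems that provide flock() (Linux, macOS, and the BSDs), the helpers below take and drop an exclusive lock on the lock file. lock_acquire and lock_release are illustrative names; Windows would need a separate implementation.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Open (creating if needed) the lock file and take an exclusive advisory
   lock without blocking. Returns a descriptor to pass to lock_release(),
   or -1 if the file cannot be opened or another holder has the lock. */
static int lock_acquire(const char *lockpath) {
    int fd = open(lockpath, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Drop the lock and close the descriptor. */
static void lock_release(int fd) {
    flock(fd, LOCK_UN);
    close(fd);
}
```

Unlike the naive lock file, this lock disappears when the holding process exits or crashes, so a stale file left on disk does not block later runs.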
Adding a tiny networking fetch
Networking is optional. The idea is fetching a remote plaintext list and merging it into the local file. The socket setup differs between POSIX and Winsock; you gate the platform specific bits behind a helper named net_fetch().
/* signatures only; platform glue omitted … */
int net_fetch(const char *host, const char *port, const char *path,
char *buf, size_t buflen);
/* usage */
static int cmd_pull(const char *url) {
/* parse url "http://host:port/path" very simply … */
char page[8192];
if (net_fetch("example.com", "80", "/xptodo.txt", page, sizeof page) != 0) {
return 1;
}
/* append to local store line by line … */
return 0;
}
Place the platform-specific implementations in net_posix.c and net_win.c; compile the correct file by choosing a target in your Makefile. This keeps the main program unchanged while you extend capability.
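The URL parsing step that cmd_pull() leaves out needs no platform code at all. One possible sketch, handling only plain http URLs of the form http://host[:port]/path, is shown below. parse_http_url is a hypothetical helper; it does not validate that the port is numeric and does not handle IPv6 literals.

```c
#include <string.h>

/* Split "http://host[:port]/path" into its parts. Only plain http is
   handled. Returns 0 on success, -1 on malformed input or short buffers. */
static int parse_http_url(const char *url,
                          char *host, size_t hostlen,
                          char *port, size_t portlen,
                          char *path, size_t pathlen) {
    static const char scheme[] = "http://";
    size_t slen = sizeof scheme - 1;
    if (strncmp(url, scheme, slen) != 0) return -1;

    const char *h = url + slen;
    size_t authlen = strcspn(h, "/");          /* length of host[:port] */
    const char *colon = memchr(h, ':', authlen);
    size_t hlen = colon ? (size_t)(colon - h) : authlen;
    if (hlen == 0 || hlen >= hostlen) return -1;
    memcpy(host, h, hlen);
    host[hlen] = '\0';

    if (colon) {
        size_t plen = authlen - hlen - 1;
        if (plen == 0 || plen >= portlen) return -1;
        memcpy(port, colon + 1, plen);
        port[plen] = '\0';
    } else {
        if (portlen < 3) return -1;
        strcpy(port, "80");                    /* default http port */
    }

    const char *p = h + authlen;               /* "" or "/..." */
    if (*p == '\0') p = "/";
    if (strlen(p) >= pathlen) return -1;
    strcpy(path, p);
    return 0;
}
```

The three output strings can then be passed straight to net_fetch() in place of the hard-coded values.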
Packaging, documentation, and Makefile distribution
A small but careful Makefile plus a short README and a license allow others to build and use the tool. Declare variables for the compiler and flags; add standard targets such as all, clean, test, and install. Avoid shell features that are not portable.
Writing a portable Makefile
# file: Makefile
CC ?= cc
CFLAGS ?= -std=c17 -Wall -Wextra -O2
BIN ?= xptodo
SRC = xptodo.c
OBJ = $(SRC:.c=.o)
.PHONY: all clean test install uninstall dist
all: $(BIN)
$(BIN): $(OBJ)
	$(CC) $(CFLAGS) $(OBJ) -o $(BIN)
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
clean:
	rm -f $(OBJ) $(BIN)
test: $(BIN)
	./$(BIN) add "smoke test item"
	./$(BIN) list
install: $(BIN)
	mkdir -p $(DESTDIR)/usr/local/bin
	cp $(BIN) $(DESTDIR)/usr/local/bin/$(BIN)
uninstall:
	rm -f $(DESTDIR)/usr/local/bin/$(BIN)
dist:
	mkdir -p dist/xptodo-1.0
	cp xptodo.c Makefile README.md LICENSE dist/xptodo-1.0
	tar -czf dist/xptodo-1.0.tar.gz -C dist xptodo-1.0
On Windows, build with MinGW or clang; the same Makefile works when make is available. When using MSVC, supply a tiny build.bat that mirrors the commands, since nmake syntax differs.
Writing concise documentation
# file: README.md
xptodo: a minimal cross-platform to-do list
Build
make # uses CC and CFLAGS
CC=clang make
Usage
xptodo add "buy milk"
xptodo list
xptodo done 1
Data file
Linux, macOS: $HOME/.xptodo
Windows: %USERPROFILE%\xptodo.txt
License
MIT
Testing, Standards, and Beyond
Testing catches regressions and clarifies intent. Start with simple assertions and smoke tests; add static analysis and sanitizers; then pin a code style and a warning profile.
Adding quick tests and sanitizers
Write a shell script that runs the binary and checks outputs. On platforms without a shell, a small C test driver can do the same. When available, compile with sanitizers to catch memory issues at runtime.
# file: test_smoke.sh
set -eu
HOME=$(mktemp -d) # use a scratch home so your real ~/.xptodo is untouched
export HOME
./xptodo add "alpha"
./xptodo add "beta"
out=$(./xptodo list | wc -l)
[ "$out" -ge 2 ] && echo "ok"
# add to CFLAGS during local testing
CFLAGS += -fsanitize=address,undefined -fno-omit-frame-pointer
Adopting warnings and static analysis
Compile with -Wall -Wextra -Werror when developing, then run a static tool such as clang-tidy or cppcheck. Keep README.md honest by listing the exact profiles you use so others can reproduce the checks.
Following a clear coding standard
Pick a straightforward style: two-space indentation to match ebook layout, 80 to 100 columns, meaningful names, and short functions. Require that every function has a one-line comment that explains purpose and side effects. Enforce consistent error handling by returning integers for status and printing messages in one place.
Where to go next
After shipping a portable C utility you can deepen your stack in several directions. Each path opens new abstractions while keeping your grounding in systems thinking.
Choosing C++ for zero cost abstractions
C++ provides templates, RAII, and a rich standard library for containers and algorithms. You keep control of memory and layout while gaining expressive tools that help manage complexity. Translating xptodo to C++ would replace manual file handling with fstream and error handling with exceptions or std::expected equivalents.
Choosing Rust for memory safety guarantees
Rust offers ownership, borrowing, and a proven toolchain that eliminates entire classes of memory bugs. The language integrates cargo for builds and testing; translating xptodo to Rust demonstrates how lifetimes and pattern matching simplify persistence and parsing while preserving performance.
Specializing in systems topics
If you remain in C, consider deeper areas: writing libraries with stable ABIs, building event-driven servers with poll or kqueue, implementing cross-platform file watchers, creating FFI boundaries for Python or Java, or working on embedded firmware where timing and power shape every design choice. The skills you practiced in this project transfer directly.
Chapter 21: Miscellaneous Extras
This final section collects practical reference material that supports everyday C programming. It lists common compiler flags and optimization levels, reviews format specifiers for printf() and scanf(), provides a few ready to use Makefile templates, and recommends widely used libraries that extend C in safe and productive ways.
Common compiler flags and optimization levels
Compilers offer many switches to control warnings, debugging, and optimization. Knowing the most important ones helps you tune builds for development or release. The following tables summarize typical flags for GCC and Clang, followed by those for MSVC.
GCC and Clang essentials
| Flag | Purpose |
| --- | --- |
| -Wall -Wextra | Enable a broad set of warnings |
| -Werror | Treat warnings as errors |
| -g | Generate debugging symbols |
| -O0 | No optimization, easier debugging |
| -O2 | Standard release optimization |
| -O3 | Aggressive optimization, may increase size |
| -Os | Optimize for smaller binaries |
| -march=native | Use local CPU features for speed |
| -fsanitize=address,undefined | Runtime checking for memory and UB |
| -std=c17 | Specify language standard version |
Combine these in scripts or Makefiles to produce consistent builds. During development prefer -O0 -g for easier debugging; for release use -O2 or -O3 with tested sanitizer runs.
Keep warning flags (-Wall, -Wextra) separate from optimization flags. High optimization can make debugging harder because variables may be optimized away or reordered.
MSVC equivalents
| Flag | Purpose |
| --- | --- |
| /W4 | Enable level 4 warnings |
| /WX | Treat warnings as errors |
| /Zi | Include debugging information |
| /Od | Disable optimization |
| /O2 | Maximize speed |
| /Os | Favor small code size |
| /std:c17 | Use the C17 language mode |
MSVC uses a different syntax but similar goals. The combination /W4 /WX /O2 provides strong warnings and efficient release builds.
Quick reference for format specifiers
Formatting strings with printf() and reading input with scanf() are core C tasks. The tables below summarize the most common specifiers for values and their meanings.
printf() specifiers
| Specifier | Meaning |
| --- | --- |
| %d | Signed integer (int) |
| %u | Unsigned integer (unsigned int) |
| %ld | Signed long |
| %lu | Unsigned long |
| %f | Floating point (double) |
| %e or %E | Exponential notation |
| %c | Single character |
| %s | String (null terminated) |
| %p | Pointer address |
| %x or %X | Hexadecimal integer |
| %% | Literal percent sign |
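The sketch below exercises most of these specifiers in one snprintf() call. It also uses two that are not in the table but are standard since C99: %zu for size_t and the PRIu64 macro from inttypes.h for fixed-width integers. format_sample is just an illustrative name.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one value of several common types into buf.
   Returns the number of characters written, or -1 if buf is too small. */
static int format_sample(char *buf, size_t buflen) {
    int count = -42;
    unsigned flags = 255u;
    long big = 1234567890L;
    double ratio = 0.5;
    size_t len = 7;
    uint64_t id = UINT64_C(18446744073709551615);

    int n = snprintf(buf, buflen,
                     "%d %u %x %ld %.2f %e %c %s %zu %" PRIu64 " %%",
                     count, flags, flags, big, ratio, ratio,
                     'C', "str", len, id);
    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}
```

With a large enough buffer the result is the text `-42 255 ff 1234567890 0.50 5.000000e-01 C str 7 18446744073709551615 %`. Checking the return value against the buffer size is how you detect truncation.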
scanf() specifiers
| Specifier | Meaning |
| --- | --- |
| %d | Read integer into int * |
| %u | Read unsigned integer |
| %f | Read float |
| %lf | Read double |
| %c | Read single character |
| %s | Read string until whitespace |
| %p | Read pointer value (implementation defined) |
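Two habits make the scanf() family much safer: give %s a maximum field width, and compare the return value with the number of fields you expect. The sketch below parses a line with sscanf(); the Item type and parse_item are invented for this example.

```c
#include <stdio.h>

typedef struct {
    char name[16];
    int qty;
    double price;
} Item;

/* Parse a line such as "widget 3 9.99". Returns 0 on success, -1 if any
   field is missing or malformed. */
static int parse_item(const char *line, Item *out) {
    /* %15s leaves room for the terminator in name[16]; %lf matches double */
    int got = sscanf(line, "%15s %d %lf", out->name, &out->qty, &out->price);
    return got == 3 ? 0 : -1;
}
```

Reading a whole line with fgets() and then parsing it with sscanf() is usually easier to get right than calling scanf() directly, because a bad line cannot leave stray input in the stream.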
Sample Makefile templates for single-file and multi-file projects
Makefiles save time by automating builds. These small templates serve as a quick starting point for typical scenarios.
Single-file project
# file: Makefile
CC = cc
CFLAGS = -std=c17 -Wall -Wextra -O2
BIN = hello
$(BIN): hello.c
	$(CC) $(CFLAGS) $< -o $@
clean:
	rm -f $(BIN)
Multi-file project
# file: Makefile
CC = cc
CFLAGS = -std=c17 -Wall -Wextra -O2
OBJ = main.o util.o io.o
BIN = myapp
all: $(BIN)
$(BIN): $(OBJ)
	$(CC) $(CFLAGS) $(OBJ) -o $(BIN)
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
clean:
	rm -f $(OBJ) $(BIN)
You can override variables on the command line, for example make CFLAGS="-O3 -march=native". This makes builds flexible without changing the file.
Recommended external libraries (curl, sqlite, ncurses, etc.)
C’s standard library is intentionally small. Linking external libraries gives access to powerful functionality such as networking, databases, and user interfaces. The following list names portable, well supported options suitable for learning and real projects.
| Library | Purpose |
| --- | --- |
| libcurl | HTTP, FTP, and general URL transfer; supports SSL |
| sqlite3 | Lightweight embedded SQL database |
| ncurses | Terminal screen handling and color control |
| zlib | Compression and decompression of data streams |
| OpenSSL | Cryptography and secure communication |
| libpng | PNG image reading and writing |
| SDL2 | Cross-platform graphics, sound, and input |
| PThreads | POSIX threading library for concurrency |
Install these libraries through your platform’s package manager (for example apt install libcurl4-openssl-dev, brew install sqlite, or vcpkg install curl). Each library provides headers and linkable binaries usable with -l flags.
With these extras you have a concise toolkit for real-world C programming. Keep these references nearby as you move into more ambitious applications or deeper systems work.
© 2025 Robin Nixon. All rights reserved
No content may be re-used, sold, given away, or used for training AI without express permission