Processes, Forks, and the Unix Execution Model: What a Shell Forces You to Understand

Building a shell in C means working directly with fork(), execve(), pipe(), dup2(), and signal handling. This is the Unix process model, not abstracted, not wrapped, raw.

2025-03-05

When I started building minishell, a Bash-like shell in C, I thought I understood processes. I'd used fork() in exercises, called exec family functions, written simple pipe programs. I was wrong about how much I understood.

A shell is the program that makes every other program run. To implement one correctly, you have to implement the Unix process model correctly: not a subset of it, not a polite version of it, all of it. Every edge case in the model becomes a bug in your shell if you don't handle it.

What a Process Actually Is

Before fork() makes sense, you need a precise definition of what a process is.

A process is not a program. A program is a binary file on disk. A process is the kernel's representation of a running instance of that program. The kernel tracks each process as a data structure containing:

Address space: the virtual memory map (code segment, data segment, heap, stack)
File descriptor table: the per-process array of open FDs (0=stdin, 1=stdout, 2=stderr, plus any others)
Signal disposition table: for each signal, whether it's ignored, handled by default, or handled by a custom function
PID / PPID: the process's own ID and its parent's ID
Current working directory
Environment: the KEY=VALUE string array accessible via environ

When a process is created, it inherits most of these from its parent. When a process calls execve(), the address space is replaced but most of the rest survives. Understanding which things cross which boundaries (fork, exec, fork+exec) is the entire job.

`fork()`: Duplicating, Not Spawning

The Unix way to create a new process is not "spawn a new process with this program." It's "duplicate the current process, then optionally replace the duplicate's program."

pid_t pid = fork();
 
if (pid < 0) {
    // fork failed (too many processes, out of memory)
    perror("fork");
    exit(1);
} else if (pid == 0) {
    // This is the CHILD process
    // fork() returned 0, which is how the child knows it is the child
    printf("I am the child, my PID is %d\n", getpid());
} else {
    // This is the PARENT process
    // pid == child's PID
    printf("I am the parent, child PID is %d\n", pid);
}

After fork(), two processes run the same code from the same point. They differ in one thing: the return value of fork(). The child gets 0, the parent gets the child's PID.

Copy-on-write: the kernel doesn't immediately copy the parent's entire address space. Both processes initially share the same physical memory pages, marked read-only. The first time either process writes to a page, the kernel creates a private copy for that process. This makes fork() cheap even for large processes.
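The effect is easy to observe from user space. The following is a minimal sketch, not code from minishell; `child_write_is_private` is a hypothetical name chosen for this example. The child overwrites a variable, and the parent still sees the old value because the child's write went to its own copy.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent sees after the child has overwritten
 * its own copy of the variable, or -1 if fork() fails. */
int child_write_is_private(void)
{
    int value = 42;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 99;   /* this write lands on the child's private copy */
        _exit(0);     /* _exit: leave without running the parent's atexit handlers */
    }
    waitpid(pid, NULL, 0);
    return value;     /* the parent's copy was never touched */
}
```

Calling `child_write_is_private()` returns 42 in the parent, whatever the child did to its copy.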

What the child inherits:

A copy of the parent's file descriptor table (same open FDs; each one refers to the same open file description as the parent's, so file offsets are shared, not just equal)
A copy of the parent's address space (variables, heap, stack, via CoW)
A copy of the parent's signal disposition table

What the child does NOT inherit:

The parent's PID (obviously)
Pending signals
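Memory locks (mlock(), mlockall())
Record locks set with fcntl() (locks taken with flock() belong to the open file description and are inherited)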
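One consequence of the inherited descriptor table is worth seeing in code: parent and child share a file offset through an inherited FD. This is a small sketch under that assumption; `shared_offset_demo` and the temp-file path template are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child and parent each write through the same inherited descriptor.
 * Returns the resulting file size, or -1 on error. */
long shared_offset_demo(void)
{
    char path[] = "/tmp/fork_offset_XXXXXX";
    int fd = mkstemp(path);            /* temp file, opened read/write */
    if (fd < 0)
        return -1;
    unlink(path);                      /* file goes away once fd is closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Advances the offset that the parent shares. */
        _exit(write(fd, "child", 5) == 5 ? 0 : 1);
    }
    waitpid(pid, NULL, 0);

    /* Lands after the child's bytes instead of overwriting them. */
    if (write(fd, "parent", 6) != 6) {
        close(fd);
        return -1;
    }
    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    return (long)size;
}
```

The function returns 11 (5 bytes from the child plus 6 from the parent). If the offset were private to each process, the parent's write would start at 0 and the file would be 6 bytes long.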

`execve()`: Point of No Return

execve() replaces the calling process's program image with a new one:

char *argv[] = { "/bin/ls", "-la", NULL };
char *envp[] = { "PATH=/usr/bin:/bin", NULL };
 
execve("/bin/ls", argv, envp);
// If execve() returns, it failed
perror("execve");
exit(1);

If execve() succeeds, it does not return. The process's code segment, data segment, heap, and stack are all replaced with those of the new program. The new program starts executing from its entry point, which for a C program leads to main().

What survives execve():

PID (the process is the same, just running a different program)
Parent PID
File descriptor table, unless FDs have O_CLOEXEC set
Current working directory
Signal dispositions set to SIG_IGN are preserved; custom handlers are reset to default

The O_CLOEXEC flag is critical for shell implementation. When you open a file or create a pipe, you set O_CLOEXEC on FDs that should not be inherited by child programs. Otherwise, ls running inside your shell would have all your shell's internal FDs open, which is both a resource leak and a security issue.
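For descriptors you open yourself, passing O_CLOEXEC to open() sets the flag atomically. For descriptors that already exist (a pipe() result, for instance), the flag can be added afterwards with fcntl(). A small helper along these lines would do it; `set_cloexec` is a hypothetical name, not a libc function.

```c
#include <fcntl.h>
#include <unistd.h>

/* Marks fd close-on-exec while preserving its other descriptor flags.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

Note that dup2() clears the close-on-exec flag on the new descriptor, so a pipe end marked this way can still be wired onto stdin or stdout in the child and survive the exec there.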

Why the Two Steps?

Many other operating systems offer something like spawn(): create a new process and run a program in one call (CreateProcess() on Windows, for example). The classic Unix answer is fork() + execve(). Why?

The answer is in what happens between the two calls. After fork() but before execve(), you're in the child process with the parent's FDs, but the new program hasn't started yet. This gap is where redirections and pipe wiring happen:

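pid_t pid = fork();
if (pid == 0) {
    // Child: wire up I/O before exec
    int fd = open("output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        _exit(1);
    }
    dup2(fd, STDOUT_FILENO);  // stdout -> output.txt
    close(fd);                // close the original fd, keep the dup

    execve("/bin/ls", argv, envp);
    perror("execve");         // only reached if execve() failed
    _exit(127);               // never fall through into the parent's code
}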

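ls has no idea its stdout is a file. It just writes to stdout (FD 1). The kernel sends those writes to output.txt because you wired it up before exec. A spawn()-style call can only do this if its interface anticipates it: every kind of setup needs its own parameter, which is how posix_spawn() ends up with file actions and attribute objects. With fork+exec, the child runs ordinary code between the two calls, so any setup you can express as a syscall is available.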

The fork+exec split is the reason every redirection in every Unix shell works the way it does.
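Wrapped up as a function with error handling and a wait, the redirection pattern might look like the sketch below. `run_with_stdout_to` is a hypothetical helper; it assumes argv[0] is a full path and uses execv(), which keeps the current environment.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with stdout redirected to outfile.
 * Returns the child's exit code, or -1 if it could not be started
 * or did not exit normally. */
int run_with_stdout_to(const char *outfile, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            _exit(1);
        }
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0) {
                perror("dup2");
                _exit(1);
            }
            close(fd);              /* FD 1 now refers to the file */
        }
        execv(argv[0], argv);
        perror("execv");            /* only reached if exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing `{ "/bin/echo", "hi", NULL }` with an output path leaves a file containing `hi` and a newline, and the parent's own stdout is untouched because the dup2() happened only in the child.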

`pipe()`: A Kernel Ring Buffer

A pipe is two file descriptors, a read end and a write end, backed by a fixed-size ring buffer in the kernel (typically 64KB on Linux):

int pipefd[2];
pipe(pipefd);
// pipefd[0]: read end
// pipefd[1]: write end

Data written to pipefd[1] accumulates in the kernel buffer until read from pipefd[0]. If the buffer fills up, write() blocks until the reader consumes some data. If the write end is closed and the buffer is empty, read() on the read end returns 0 (EOF).

That EOF-on-close behavior is what makes pipelines terminate correctly. When ls finishes and exits, the kernel closes its copy of the write end. Once no other process holds that end open, grep sees EOF on its stdin and exits too.
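The EOF rule can be checked inside a single process, with no fork involved. A minimal sketch (`pipe_eof_demo` is a name invented for this example):

```c
#include <unistd.h>

/* Writes 5 bytes into a pipe, closes the write end, then reads twice.
 * Returns the result of the second read(), or -1 on error. */
int pipe_eof_demo(void)
{
    int pipefd[2];
    char buf[16];

    if (pipe(pipefd) < 0)
        return -1;
    if (write(pipefd[1], "hello", 5) != 5) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    close(pipefd[1]);   /* the only write end is now closed */

    ssize_t first = read(pipefd[0], buf, sizeof buf);   /* drains the buffered bytes */
    ssize_t second = read(pipefd[0], buf, sizeof buf);  /* buffer empty, no writers */
    close(pipefd[0]);

    return first == 5 ? (int)second : -1;
}
```

The first read() returns the 5 buffered bytes; the second returns 0. Closing the write end does not discard data already in the buffer, it only guarantees that EOF follows it.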

The FD Dance in a Pipeline

ls | grep .c means: run ls with its stdout connected to a pipe, and run grep with its stdin connected to the read end of that same pipe.

int pipefd[2];
pipe(pipefd);

// Fork first child (ls)
pid_t pid1 = fork();
if (pid1 == 0) {
    close(pipefd[0]);                       // child doesn't read from pipe
    dup2(pipefd[1], STDOUT_FILENO);         // stdout -> pipe write end
    close(pipefd[1]);                       // close original (dup2 keeps a copy)
    execve("/bin/ls", ls_argv, envp);
    perror("execve");                       // only reached if execve() failed
    _exit(127);                             // don't fall through and fork again
}

// Fork second child (grep)
pid_t pid2 = fork();
if (pid2 == 0) {
    close(pipefd[1]);                       // child doesn't write to pipe
    dup2(pipefd[0], STDIN_FILENO);          // stdin -> pipe read end
    close(pipefd[0]);                       // close original
    execve("/bin/grep", grep_argv, envp);
    perror("execve");
    _exit(127);
}

// Parent: close BOTH pipe ends
close(pipefd[0]);
close(pipefd[1]);

// Wait for both children
waitpid(pid1, NULL, 0);
waitpid(pid2, NULL, 0);

The closing pattern is non-negotiable. If the parent doesn't close pipefd[1], grep will never see EOF. The pipe's write end is still open, held by the parent, so the kernel won't signal EOF even after ls exits. grep hangs forever waiting for more input. This exact bug is what most people hit when they first implement pipelines.

Multiple Pipes

ls | grep .c | wc -l requires two pipes and three processes. The pattern generalizes: for N commands, create N-1 pipes before any forking, then wire each child correctly.

cmd0 -> pipe0_write
pipe0_read -> cmd1 -> pipe1_write
pipe1_read -> cmd2

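The tricky part: each child must close all pipe ends it doesn't use. With three commands and two pipes, the first child (cmd0) must close pipe0[0], pipe1[0], and pipe1[1]. Only pipe0[1] (its write end) is needed, and even that original descriptor gets closed once dup2() has copied it onto stdout. If a stray write end stays open in any process, the reader downstream of it never sees EOF and the pipeline hangs.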
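One way to keep the closing logic manageable is to have every child dup2() the one or two ends it needs and then close every pipe descriptor unconditionally. The sketch below takes that approach. `run_pipeline`, `close_all`, and `MAX_CMDS` are hypothetical names, and each command is assumed to be a NULL-terminated argv whose first element is a full path.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CMDS 16

static void close_all(int pipes[][2], int count)
{
    for (int i = 0; i < count; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
}

/* Runs cmds[0] | cmds[1] | ... | cmds[n-1].
 * Returns the last command's exit code (128 + signal if it was killed),
 * or -1 on setup failure. */
int run_pipeline(char **cmds[], int n)
{
    int pipes[MAX_CMDS - 1][2];
    pid_t pids[MAX_CMDS];
    int started = 0, last_status = 0, have_last = 0;

    if (n < 1 || n > MAX_CMDS)
        return -1;

    /* Create all N-1 pipes before any fork. */
    for (int i = 0; i < n - 1; i++) {
        if (pipe(pipes[i]) < 0) {
            perror("pipe");
            close_all(pipes, i);
            return -1;
        }
    }

    for (; started < n; started++) {
        int i = started;
        pids[i] = fork();
        if (pids[i] < 0) {
            perror("fork");
            break;
        }
        if (pids[i] == 0) {
            /* stdin from the previous pipe, stdout to the next one */
            if (i > 0 && dup2(pipes[i - 1][0], STDIN_FILENO) < 0)
                _exit(126);
            if (i < n - 1 && dup2(pipes[i][1], STDOUT_FILENO) < 0)
                _exit(126);
            close_all(pipes, n - 1);   /* the dup2 copies stay open */
            execv(cmds[i][0], cmds[i]);
            perror("execv");
            _exit(127);
        }
    }

    close_all(pipes, n - 1);           /* parent holds no pipe ends */

    for (int i = 0; i < started; i++) {
        int st;
        if (waitpid(pids[i], &st, 0) == pids[i] && i == n - 1) {
            last_status = st;
            have_last = 1;
        }
    }
    if (started < n || !have_last)
        return -1;
    if (WIFEXITED(last_status))
        return WEXITSTATUS(last_status);
    if (WIFSIGNALED(last_status))
        return 128 + WTERMSIG(last_status);
    return -1;
}
```

The design choice here is that no child needs to know which ends belong to whom: after the dup2() calls, closing everything is always correct. All children are started before any waitpid() call, so the commands run concurrently, which a pipeline requires once the data exceeds the pipe buffer.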

`waitpid()` and Zombie Processes

After forking, the parent must call waitpid() (or wait()) to collect the exit status of each child:

int status;
pid_t exited = waitpid(pid, &status, 0);
 
if (WIFEXITED(status)) {
    int code = WEXITSTATUS(status);   // 0 = success, non-zero = error
}
if (WIFSIGNALED(status)) {
    int sig = WTERMSIG(status);       // killed by this signal
}

If the parent doesn't call waitpid(), the child becomes a zombie when it exits: its process entry stays in the kernel's process table holding the exit code, consuming a PID slot, waiting to be reaped. The entry is freed only when the parent calls waitpid(), or, if the parent exits first, when init adopts and reaps the child.

For a shell, $? is set to the exit code of the last command: WEXITSTATUS(status) if the process exited normally, or 128 + WTERMSIG(status) if it was killed by a signal. That's the convention Bash follows.

Signal Handling Across Fork

Signals complicate everything because of how they interact with fork.

The child inherits the parent's signal disposition table. If the parent has set SIGINT to be ignored, the child starts with SIGINT ignored too. After execve(), ignored signals stay ignored; custom handlers are reset to default.

In a shell, pressing Ctrl-C makes the terminal driver send SIGINT to the foreground process group. In a simple shell without job control, that group contains the shell and all its children. The shell should not terminate from Ctrl-C; it should just interrupt the current command. The running child command should receive SIGINT and die.

This requires careful setup. The issue is SIG_IGN: if the shell ignores SIGINT while in certain states (like waitpid() inside a pipeline), that ignored disposition carries into the child if you forget to reset it before exec.

// In the child, before execve:
signal(SIGINT, SIG_DFL);   // reset to default, let the child die on Ctrl-C
signal(SIGQUIT, SIG_DFL);

The parent shell restores its interactive signal handler after waitpid() returns.
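Put together, the ignore, reset, and restore sequence could look like the following sketch. It uses sigaction() rather than signal() so the previous dispositions can be saved and restored exactly; `run_foreground` is a hypothetical name, and argv[0] is assumed to be a full path.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] as a foreground command.
 * Returns the raw waitpid() status, or -1 on error. */
int run_foreground(char *const argv[])
{
    struct sigaction ignore, dfl, old_int, old_quit;
    int status = -1;

    memset(&ignore, 0, sizeof ignore);
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);

    memset(&dfl, 0, sizeof dfl);
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);

    /* Parent: ignore Ctrl-C and Ctrl-\ while the command runs. */
    sigaction(SIGINT, &ignore, &old_int);
    sigaction(SIGQUIT, &ignore, &old_quit);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child inherited SIG_IGN, which would survive exec: undo it. */
        sigaction(SIGINT, &dfl, NULL);
        sigaction(SIGQUIT, &dfl, NULL);
        execv(argv[0], argv);
        perror("execv");
        _exit(127);
    }
    if (pid < 0)
        perror("fork");
    else if (waitpid(pid, &status, 0) < 0)
        status = -1;

    /* Parent: put back whatever was installed before the command. */
    sigaction(SIGINT, &old_int, NULL);
    sigaction(SIGQUIT, &old_quit, NULL);
    return status;
}
```

Saving the old dispositions means the function works regardless of what the interactive handler was; the caller does not need to know how to reinstall it.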

Heredoc as a Pipe

Heredoc (<< EOF) lets you embed multiline input directly in a command:

cat << EOF
line one
line two
EOF

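The implementation: the shell reads lines from the terminal until it sees the delimiter, writes them into a pipe's write end, then closes that end. The command's stdin is connected to the read end of that pipe. From cat's perspective, it's just reading from stdin. It has no idea the data came from the shell's input rather than a file. Writing everything before the reader exists only works while the content fits in the pipe buffer; past that, write() blocks because nothing is draining the pipe, so large heredocs need a temporary file or a writer that runs alongside the reader.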

int heredoc_pipe[2];
pipe(heredoc_pipe);
 
// Parent writes the heredoc content into the pipe
write(heredoc_pipe[1], content, strlen(content));
close(heredoc_pipe[1]);    // close write end, EOF for the reader
 
// Child: stdin = pipe read end
dup2(heredoc_pipe[0], STDIN_FILENO);
close(heredoc_pipe[0]);

Heredoc is a special case of redirection, and redirection is a special case of the general FD manipulation pattern that all pipes use.
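A fuller sketch of the heredoc path follows. It differs from the fragment above in one deliberate way: the child is forked before the parent writes, so content larger than the pipe buffer does not block forever. `run_with_heredoc` is a hypothetical name, argv[0] is assumed to be a full path, and the content is assumed to have been collected already.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with content as its stdin.
 * Returns the command's exit code, or -1 on error. */
int run_with_heredoc(char *const argv[], const char *content)
{
    int hd[2];
    int status;

    if (pipe(hd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(hd[0]);
        close(hd[1]);
        return -1;
    }
    if (pid == 0) {
        close(hd[1]);                 /* or the child never sees EOF */
        if (dup2(hd[0], STDIN_FILENO) < 0)
            _exit(126);
        close(hd[0]);
        execv(argv[0], argv);
        perror("execv");
        _exit(127);
    }

    close(hd[0]);

    /* Ignore SIGPIPE in the parent only (after the fork), so a command
     * that exits early makes write() fail instead of killing the shell. */
    struct sigaction ign, old;
    memset(&ign, 0, sizeof ign);
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    sigaction(SIGPIPE, &ign, &old);

    size_t left = strlen(content);
    while (left > 0) {
        ssize_t n = write(hd[1], content, left);
        if (n <= 0)
            break;                    /* reader is gone or write failed */
        content += n;
        left -= (size_t)n;
    }
    close(hd[1]);                     /* EOF for the reader */
    sigaction(SIGPIPE, &old, NULL);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The SIGPIPE handling is placed after fork() on purpose: an ignored disposition set before the fork would be inherited by the child and survive the exec.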

Built-ins That Cannot Fork

Most commands run in child processes. But some must run in the parent shell itself:

cd: changes the current working directory. If it ran in a child, the chdir() call would affect the child's CWD, the child would exit, and the parent's CWD would be unchanged. cd in a subshell is a no-op.
export / unset: modify the shell's environment variables. Must mutate the parent's environ.
exit: exits the shell itself. Running it in a child would only exit the child.
echo / pwd: technically could fork, but shells implement them as built-ins for performance, no need to find and exec an external binary.

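The shell checks if a command is a built-in before forking. If it is, it runs directly in the parent process. The exception is a built-in inside a pipeline: Bash runs each pipeline stage in a subshell, so cd / | cat leaves the shell's own directory unchanged.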
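The check itself can be as simple as a string comparison ahead of the fork. This sketch implements only cd; `try_builtin` and `builtin_cd` are hypothetical names, and the fallback to $HOME for a bare cd mirrors common shell behavior.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Changes the shell's own working directory. Returns 0 or 1 like a command. */
static int builtin_cd(char *const argv[])
{
    const char *target = argv[1] ? argv[1] : getenv("HOME");

    if (target == NULL) {
        fprintf(stderr, "cd: HOME not set\n");
        return 1;
    }
    if (chdir(target) < 0) {          /* runs in the parent: the change sticks */
        perror("cd");
        return 1;
    }
    return 0;
}

/* If argv[0] names a built-in, runs it in this process, stores its exit
 * code in *code, and returns 1. Returns 0 otherwise, so the caller can
 * fall back to fork + exec. */
int try_builtin(char *const argv[], int *code)
{
    if (argv[0] != NULL && strcmp(argv[0], "cd") == 0) {
        *code = builtin_cd(argv);
        return 1;
    }
    return 0;
}
```

A real dispatcher would use a table of names and function pointers once export, unset, exit, echo, and pwd are added, but the control flow stays the same: try the built-in first, fork only if nothing matched.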

Lexing the Command Line

Before any forking happens, the shell has to parse the input string into tokens. This is more involved than it looks.

Single quotes preserve everything literally: '$HOME' becomes the string $HOME, not the value of HOME. Double quotes allow variable expansion but suppress most other special characters. Outside quotes, characters like |, <, >, and ; have syntactic meaning.

Tokenization has to be state-aware:

typedef enum { NORMAL, IN_SINGLE_QUOTE, IN_DOUBLE_QUOTE } e_state;
 
// Scan character by character, maintaining state
// A '|' outside of quotes -> pipe token
// A '|' inside quotes -> literal character

Variable expansion ($VAR, $?) happens during or after tokenization, before the command is executed. $? expands to the exit code of the last foreground command, so you have to track that value across the entire shell lifetime.

The shell grammar, even for a simple subset of Bash, has a surprising number of edge cases: empty quotes (''), adjacent quoted and unquoted strings ("foo"bar), backslash escaping, and the interaction between expansion and word splitting.
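To make the state machine concrete, here is a deliberately reduced tokenizer sketch. It handles the two quote states, treats | as the only operator, and covers two of the edge cases above (empty quotes and adjacent quoted and unquoted text). The simplifications are assumptions of the example: no expansion, no backslash escapes, fixed-size token buffers. `tokenize`, `lex_state`, and `MAX_TOKEN_LEN` are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_TOKEN_LEN 64

typedef enum { NORMAL, IN_SINGLE_QUOTE, IN_DOUBLE_QUOTE } lex_state;

/* Splits line into at most max_tokens tokens.
 * Returns the token count, or -1 on an unterminated quote or overflow. */
int tokenize(const char *line, char tokens[][MAX_TOKEN_LEN], int max_tokens)
{
    lex_state state = NORMAL;
    int count = 0;       /* completed tokens */
    size_t len = 0;      /* length of the token being built */
    int in_token = 0;    /* lets '' produce an empty token */

    for (const char *p = line;; p++) {
        char c = *p;

        if (c == '\0' && state != NORMAL)
            return -1;                        /* unterminated quote */

        /* A matching quote ends the quoted run but not the token. */
        if (state == IN_SINGLE_QUOTE && c == '\'') { state = NORMAL; continue; }
        if (state == IN_DOUBLE_QUOTE && c == '"')  { state = NORMAL; continue; }

        if (state == NORMAL) {
            if (c == '\'' || c == '"') {
                if (!in_token && count >= max_tokens)
                    return -1;
                in_token = 1;
                state = (c == '\'') ? IN_SINGLE_QUOTE : IN_DOUBLE_QUOTE;
                continue;
            }
            if (c == '\0' || c == '|' || isspace((unsigned char)c)) {
                if (in_token) {               /* boundary: finish the token */
                    tokens[count++][len] = '\0';
                    len = 0;
                    in_token = 0;
                }
                if (c == '\0')
                    return count;
                if (c == '|') {               /* operator is its own token */
                    if (count >= max_tokens)
                        return -1;
                    tokens[count][0] = '|';
                    tokens[count][1] = '\0';
                    count++;
                }
                continue;
            }
        }

        /* Ordinary character, or any character inside quotes. */
        if (!in_token && count >= max_tokens)
            return -1;
        if (len + 1 >= MAX_TOKEN_LEN)
            return -1;
        in_token = 1;
        tokens[count][len++] = c;
    }
}
```

For the input `echo 'a|b' "foo"bar | wc` this produces five tokens: `echo`, `a|b`, `foobar`, `|`, and `wc`. One limitation to be aware of: the output does not record that `a|b` came from a quoted string, so a quoted `'|'` and a real pipe operator look identical. A real lexer would attach a type to each token.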

What It Left Me With

Building minishell is an exercise in reading the Unix process model as a specification and implementing it with no gaps. Every system call you skip becomes a resource leak or a behavioral difference from Bash that you'll eventually have to fix.

Once you've built it, you understand what happens every time you run any command in any terminal on any Unix system. There's no magic. It's the same four steps in the same order: fork(), optional FD manipulation, execve(), waitpid(). Everything else is built on top of that.
