EDIT:
- 2021-07-15: Added Appendix: Dockerfile for Kernel development and updated busybox + Kernel versions.
- 2023-11-23: Fix ramdisk vs ramfs (ref), use devtmpfs, and updated busybox + Kernel versions.
The other evening, while staring at some Linux kernel code, I thought: let me set up a minimal environment so I can easily step through the code and examine the state.
I ended up creating:
- a Linux kernel with minimal configuration
- a minimal initramfs to boot into which is based on busybox
In the remaining part of this article we will go through each step: first building the kernel, then building the initrd, and finally running the kernel in QEMU and debugging it with GDB.
$> make kernel
Before building the kernel we first need to generate a configuration. As a
starting point we generate a minimal config with the make tinyconfig make
target. Running this command will generate a .config file. After generating
the initial config file we customize the kernel using the merge fragment flow.
This allows us to merge a fragment file into the current configuration by
running the scripts/kconfig/merge_config.sh script.
Let's quickly go over some customizations we do. The following two lines enable support for gzipped initramfs:
CONFIG_BLK_DEV_INITRD=y
CONFIG_RD_GZIP=y
The next two configurations are important as they enable the binary loaders for ELF and script #! files.
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_SCRIPT=y
Note: In the curses-based configuration (make menuconfig) we can search for configurations using the / key and then select a match using the number keys. After selecting a match we can check the Help to get a description for the configuration parameter.
Building the kernel with the default make target will give us the following two files:
- vmlinux: statically linked kernel (ELF file) containing symbol information for debugging
- arch/x86_64/boot/bzImage: compressed kernel image for booting
Full configure & build script:
#!/bin/bash
set -e
LINUX=linux-6.6.2
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/$LINUX.tar.xz
unxz $LINUX.tar.xz && tar xf $LINUX.tar
cd $LINUX
cat <<EOF > kernel_fragment.config
# 64bit kernel
CONFIG_64BIT=y
# enable support for compressed initrd (gzip)
CONFIG_BLK_DEV_INITRD=y
CONFIG_RD_GZIP=y
# support for ELF and #! binary format
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_SCRIPT=y
# /dev
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# tty & console
CONFIG_TTY=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
# pseudo fs
CONFIG_PROC_FS=y
CONFIG_SYSFS=y
# debugging
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_INFO=y
## tinyconfig sets DEBUG_INFO_NONE, overwrite with toolchain default else
## DEBUG_INFO will not be enabled.
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_PRINTK=y
CONFIG_EARLY_PRINTK=y
EOF
make tinyconfig
./scripts/kconfig/merge_config.sh -n ./kernel_fragment.config
make -j$(nproc --ignore=2)
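To double check that the merge took effect, we can compare the fragment against the generated .config. The following is only a minimal sketch; check_fragment is a helper name made up for this article and is not part of the kernel tree.

```shell
# check_fragment (hypothetical helper): report every CONFIG_* line of a
# fragment file ($1) that does not appear verbatim in a config file ($2).
check_fragment() {
    local fragment config missing line
    fragment=$1
    config=$2
    missing=0
    while IFS= read -r line || [ -n "$line" ]; do
        # Only look at "CONFIG_FOO=..." lines, skip comments and blank lines.
        case $line in
            CONFIG_*=*) ;;
            *) continue ;;
        esac
        if ! grep -qxF -- "$line" "$config"; then
            echo "missing: $line"
            missing=1
        fi
    done < "$fragment"
    # Non-zero exit status if at least one option did not make it.
    return $missing
}
```

Inside the kernel source directory this could be called as check_fragment kernel_fragment.config .config once merge_config.sh has run. An option is typically missing when one of its dependencies is not enabled.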
$> make initrd
The next step is to build the initrd, which we base on busybox. We build the busybox project in its default configuration with one change: we enable the following option to build a static binary so it can be used stand-alone:
sed -i 's/# CONFIG_STATIC .*/CONFIG_STATIC=y/' .config
One important step before creating the final initrd is to create an init process. This will be the first process executed in userspace after the kernel has finished its initialization. We just create a script that drops us into a shell:
cat <<EOF > init
#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
mount -t devtmpfs none /dev
exec setsid cttyhack sh
EOF
By default the kernel looks for /sbin/init in the root file system, but the location can optionally be specified with the init= kernel parameter.
Full busybox & initrd build script:
#!/bin/bash
if test $(id -u) -ne 0; then
SUDO=sudo
fi
set -e
BUSYBOX=busybox-1.36.1
INITRD=$PWD/initramfs.cpio.gz
## Build busybox
echo "[+] configure & build $BUSYBOX ..."
[[ ! -d $BUSYBOX ]] && {
wget https://busybox.net/downloads/$BUSYBOX.tar.bz2
bunzip2 $BUSYBOX.tar.bz2 && tar xf $BUSYBOX.tar
}
cd $BUSYBOX
make defconfig
sed -i 's/# CONFIG_STATIC .*/CONFIG_STATIC=y/' .config
make -j$(nproc --ignore=2) busybox
make install
## Create initrd
echo "[+] create initrd $INITRD ..."
cd _install
# 1. create initrd folder structure
mkdir -p bin sbin etc proc sys usr/bin usr/sbin dev
# 2. create init process
cat <<EOF > init
#!/bin/sh
mount -t proc none /proc
mount -t sysfs none /sys
mount -t devtmpfs none /dev
exec setsid cttyhack sh
EOF
chmod +x init
# 3. create compressed initrd
find . -print0 \
| cpio --null -ov --format=newc \
| gzip -9 > $INITRD
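Since the init script is started through the #! loader, a missing execute bit or a missing interpreter inside the initrd prevents the kernel from starting it. Before packing the archive we can sanity check the staging directory. This is a minimal sketch; check_init is a hypothetical helper and the messages are made up.

```shell
# check_init (hypothetical helper): $1 is the initrd staging directory,
# for example busybox's _install folder.
check_init() {
    local root first interp
    root=$1
    [ -f "$root/init" ] || { echo "no init in $root"; return 1; }
    [ -x "$root/init" ] || { echo "init is not executable"; return 1; }
    # Read the first line and make sure it is a #! line.
    IFS= read -r first < "$root/init" || true
    case $first in
        '#!'*) ;;
        *) echo "init has no #! line"; return 1 ;;
    esac
    interp=${first#'#!'}
    interp=${interp#"${interp%%[! ]*}"}   # strip leading spaces
    interp=${interp%% *}                  # drop interpreter arguments
    # The interpreter path is resolved inside the initrd, not on the host.
    [ -e "$root$interp" ] || { echo "interpreter $interp missing in $root"; return 1; }
    echo "init looks ok"
}
```

From within the _install directory the call would be check_init . right before the find | cpio | gzip pipeline.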
Running QEMU && GDB
After finishing the previous steps we have all we need to run and debug the
kernel. We have arch/x86/boot/bzImage and initramfs.cpio.gz to boot the
kernel into a shell and we have vmlinux to feed the debugger with debug
symbols.
We start QEMU as follows. Thanks to the -S flag, the CPU stays frozen until we
have connected the debugger:
# -S freeze CPU until debugger connected
> qemu-system-x86_64 \
-kernel ./linux-6.6.2/arch/x86/boot/bzImage \
-nographic \
-append "earlyprintk=ttyS0 console=ttyS0 nokaslr init=/init debug" \
-initrd ./initramfs.cpio.gz \
-gdb tcp::1234 \
-S
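When restarting QEMU over and over it can be handy to wrap the command in a small function that first checks that both images exist. The following is a sketch under that assumption; run_qemu and the DRY_RUN switch are names invented for this article.

```shell
# run_qemu (hypothetical wrapper): $1 is the bzImage, $2 the initrd.
# With DRY_RUN=1 the command line is only printed for inspection.
run_qemu() {
    local kernel initrd f
    kernel=$1
    initrd=$2
    if [ "${DRY_RUN:-0}" != 1 ]; then
        for f in "$kernel" "$initrd"; do
            [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
        done
    fi
    # Same options as in the command above.
    set -- qemu-system-x86_64 \
        -kernel "$kernel" \
        -initrd "$initrd" \
        -nographic \
        -append "earlyprintk=ttyS0 console=ttyS0 nokaslr init=/init debug" \
        -gdb tcp::1234 \
        -S
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

A call could look like run_qemu ./arch/x86/boot/bzImage ./initramfs.cpio.gz, with the paths adjusted to where the build placed the files. As a side note, QEMU accepts -s as a shorthand for -gdb tcp::1234.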
Then we can start GDB and connect to the GDB server running in QEMU (configured
via -gdb tcp::1234). From now on we can debug our way through the kernel.
> gdb linux-6.6.2/vmlinux -ex 'target remote :1234'
(gdb) b do_execve
Breakpoint 1 at 0xffffffff810a1a60: file fs/exec.c, line 1885.
(gdb) c
Breakpoint 1, do_execve (filename=0xffff888000060000, __argv=0xffffffff8181e160 <argv_init>, __envp=0xffffffff8181e040 <envp_init>) at fs/exec.c:1885
1885 return do_execveat_common(AT_FDCWD, filename, argv, envp, 0);
(gdb) bt
#0 do_execve (filename=0xffff888000060000, __argv=0xffffffff8181e160 <argv_init>, __envp=0xffffffff8181e040 <envp_init>) at fs/exec.c:1885
#1 0xffffffff81000498 in run_init_process (init_filename=<optimized out>) at init/main.c:1048
#2 0xffffffff81116b75 in kernel_init (unused=<optimized out>) at init/main.c:1129
#3 0xffffffff8120014f in ret_from_fork () at arch/x86/entry/entry_64.S:352
#4 0x0000000000000000 in ?? ()
(gdb)
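The commands typed in the session above can also be stored in a GDB command file, so that each debug run starts from the same state. This is a minimal sketch; write_gdb_script is a hypothetical helper, and the do_execve breakpoint simply mirrors the session above, so it is an assumption that this symbol can be resolved on the kernel version being debugged.

```shell
# write_gdb_script (hypothetical helper): write a GDB command file to $1.
write_gdb_script() {
    cat > "$1" <<'EOF'
# Connect to the GDB server that QEMU opened with -gdb tcp::1234.
target remote :1234
# Same breakpoint as in the interactive session; pick another symbol if
# gdb cannot resolve this one for your kernel.
break do_execve
continue
backtrace
EOF
}
```

After write_gdb_script debug.gdb, the file can be passed to gdb with the -x option, for example gdb vmlinux -x debug.gdb.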
Appendix: Try to get around <optimized out>
When debugging the kernel we often face the following situation in gdb:
(gdb) frame
#0 do_execveat_common (fd=fd@entry=-100, filename=0xffff888000120000, argv=argv@entry=..., envp=envp@entry=..., flags=flags@entry=0) at fs/exec.c
(gdb) info args
fd = <optimized out>
filename = 0xffff888000060000
argv = <optimized out>
envp = <optimized out>
flags = <optimized out>
file = 0x0
The problem is that the Linux kernel requires certain code to be compiled with optimizations enabled.
In this situation we can try to reduce the optimization level for a single compilation unit or for a whole subtree ("try" because reducing the optimization could break the build). To do so we adapt the Makefile in the corresponding directory.
# fs/Makefile
# configure for single compilation unit
CFLAGS_exec.o := -Og
# configure for the whole subtree of where the Makefile resides
ccflags-y := -Og
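If we switch the flag on and off frequently, the per-object line can be appended from the shell instead of editing the Makefile by hand. This is a minimal sketch; add_og_flag is a hypothetical helper, and it only covers the single compilation unit variant.

```shell
# add_og_flag (hypothetical helper): $1 is a directory in the kernel tree,
# $2 an object file built by the Makefile in that directory (e.g. exec.o).
add_og_flag() {
    local makefile line
    makefile=$1/Makefile
    line="CFLAGS_$2 := -Og"
    [ -f "$makefile" ] || { echo "no Makefile in $1" >&2; return 1; }
    # Append only once, so repeated calls do not duplicate the line.
    grep -qxF -- "$line" "$makefile" || printf '%s\n' "$line" >> "$makefile"
}
```

For the example above the call would be add_og_flag fs exec.o from the top of the kernel tree, followed by a rebuild.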
After enabling the "optimize for debug experience" level -Og we can now see the
following in gdb:
(gdb) frame
#0 do_execveat_common (fd=fd@entry=-100, filename=0xffff888000120000, argv=argv@entry=..., envp=envp@entry=..., flags=flags@entry=0) at fs/exec.c
(gdb) info args
fd = -100
filename = 0xffff888000120000
argv = {ptr = {native = 0x10c5980}}
envp = {ptr = {native = 0x10c5990}}
flags = 0
(gdb) p *filename
$3 = {name = 0xffff888000120020 "/bin/ls", uptr = 0x10c59b8 "/bin/ls", refcnt = 1, aname = 0x0, iname = 0xffff888000120020 "/bin/ls"}
(gdb) ptype filename
type = struct filename {
const char *name;
const char *uptr;
int refcnt;
struct audit_names *aname;
const char iname[];
}
Appendix: Dockerfile for Kernel development
The following Dockerfile provides a development environment with all the
required tools and dependencies to reproduce all the steps of building and
debugging the Linux kernel.
FROM ubuntu:20.04
LABEL maintainer="Johannes Stoelp <johannes.stoelp@gmail.com>"
RUN apt update \
&& DEBIAN_FRONTEND=noninteractive \
apt install \
--yes \
--no-install-recommends \
# Download & unpack.
wget \
ca-certificates \
xz-utils \
# Build tools & deps (kernel).
make \
bc \
gcc g++ \
flex bison \
libncurses-dev \
libelf-dev \
# Build tools & deps (initrd).
cpio \
# Run & debug.
qemu-system-x86 \
gdb \
cgdb \
telnet \
# Convenience.
ripgrep \
fd-find \
neovim \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean
WORKDIR /develop
Save the listing above in a file called Dockerfile and build the docker image
as follows.
docker build -t kernel-dev .
Optionally set DOCKER_BUILDKIT=1 to use the newer image builder.
Once the image has been built, an interactive container can be launched as follows.
# Some options for convenience:
# -v <HOST>:<GUEST> Mount host path to guest path.
# --rm Remove the container after exiting.
docker run -it kernel-dev
Alternatively use podman.
Appendix: Screencast of an example debug session
The screencast shows an example session of debugging the Linux kernel using the above-mentioned Dockerfile.