linux-list

Linux Linked List

How to submit

To obtain the skeleton files that we have provided for you, you need to download the lab tarball, hw1.tgz, from Courseworks, under the Files tab. Inside your VM, you may unpack it with the following command:

tar xzf hw1.tgz

This will unpack into a directory named hw1, which is a git repo.

The TAs will use scripts to download, extract, build and display your code. It is essential that you DO NOT change the names of the skeleton files provided to you. If you deviate from the requirements, the grading scripts will fail and you will not receive credit for your work.

For students on ARM Mac computers (e.g. with M1 chip): if you want your submission to be built/tested for ARM, you must create and submit a file called .armpls in the top-level directory of your repo; feel free to use the following one-liner:

cd "$(git rev-parse --show-toplevel)" && touch .armpls && git add -f .armpls && git commit .armpls -m "ARM pls"

You should do this right after unpacking the tarball.

You need to have at least 5 git commits total, but we encourage you to have many more. All your work should be committed to the master branch. If you have not used git before, this tutorial can get you started.

At a minimum, your README.txt should contain the following info:

The description should indicate whether your solution for the part is working or not. You may also want to include anything else you would like to communicate to the TA grader, such as extra functionality you implemented or how you tried to fix your non-working code.

Once you have completed the assignment, you must create a tarball of your work. Everything should still be in a directory named hw1/. Create the tarball with the following:

tar czf hw1-<UNI>.tgz hw1

where <UNI> is replaced with your UNI. Upload hw1-<UNI>.tgz to Courseworks to submit your assignment.

Part 1: Reading and written assignment

Answers to written questions must be added to the part1.txt file provided. Please limit your explanations to around 40 words each.

OSTEP book: Q1 and Q2

Please read the following two chapters from OSTEP:

Then answer the following questions:

  1. What actions does a kernel perform to context switch between processes?

  2. What process state does a process enter immediately after it terminates? What is this state for, and when does it leave this state?

LKD book: Q3-Q7

Read chapter 6 of LKD (pp.85-96) on Linux’s linked list implementation.

If the LKD book’s explanation left you still confused, you can take a look at another book, Understanding the Linux Kernel, 3rd Edition, by Bovet & Cesati, (pp.87-90) for another explanation.

Read the container_of() macro code shown in LKD, page 89 carefully. Answer the following questions:

  1. What is typeof? Is it a C language keyword, function, macro, or something else? What does it do?

  2. What does offsetof do?

  3. What does the container_of() macro do?

  4. The container_of() macro, in the way it’s written in LKD, is not portable across C compilers because it uses GNU-specific language extensions. Can you identify what they are?

  5. The container_of() macro in LKD has two lines. Is the 1st line (the declaration of __mptr) really necessary? Is it possible to rewrite the macro without it? What is the purpose of that 1st line?

The answers to all these questions can be found by Googling, which you are allowed to do in this case. However, I urge you to make a serious attempt to answer these questions on your own. The point of this exercise is to get you ready for the kind of crazy C code you will encounter in the Linux kernel. Take this opportunity to flex your C legs.

If you find it difficult to answer these questions after reading the books, come back to this after you have completed the programming assignment below.

Part 2: Using Linux Linked Lists

Before starting the programming portion of this assignment, familiarize yourself with writing and using Linux kernel modules. You will be writing your own module using Linux’s linked list implementation, included from <linux/list.h>. Be sure to test your module for memory leaks using KEDR (the TAs will also check for memory leaks using KEDR while grading all module-based assignments in this course).

We define a kernel data structure to represent Pokémon:

struct pokemon {
    char name[32];
    int dex_no;
    struct list_head list;
};

Note that the struct list_head field is embedded within our data structure.

Your task is to write a kernel module that will manipulate a Pokédex, which is a list of Pokémon. On initialization, it will add some Pokémon to the list, and then print them out. On exit, it will print out those Pokémon again, then delete those Pokémon from the list.

The skeleton code contains the outline of the kernel module, including the above struct pokemon definition, and the entry and exit points, pokedex_init and pokedex_exit. You should not modify these.

Your job is to fill in the functions called by these entry and exit points. First, you should declare a single struct list_head named pokedex, which will serve as the head of the list of struct pokemons.

Then, you must fill in following functions, using the macros and functions provided by <linux/list.h>:

Note that your Pokémon will need to persist between your module’s entry point and exit point – you may find the functions kmalloc() and kfree() useful. These serve the same purpose as malloc() and free(), except in the kernel execution context. Note that kmalloc() takes an additional flag parameter – for this assignment, you may just use GFP_KERNEL for that argument. See LKD pp.238-244 for more details.

Part 3: (Not Quite) Using Linux Linked Lists

Rewrite part2 without using the macros or functions defined in <linux/list.h>. In other words, replace the calls to the following functions and macros from <linux/list.h> with your own inline code, directly manipulating pointers. None of the following functions or macros should appear in your code:

list_add_tail
list_for_each_entry
list_del
list_for_each_entry_safe
LIST_HEAD
LIST_HEAD_INIT
list_entry
container_of
offsetof
READ_ONCE
WRITE_ONCE

Do not call any linked-list related functions or macros in the linux kernel. Do not define any additional macros or functions in your code other than those already defined in the skeleton code.

Your linked-list should still be implemented using struct list_head and preserve the same semantics as the kernel’s implementation. That is, your code should add/remove/access/traverse list elements in the same way that the kernel macros/functions would.

The purpose of this exercise is to understand how the Linux linked-list implementation works under the hood. Study the implementations of the macros/functions that you used in part2. Once you understand how they work, you should be able to write the equivalent C code for this part. You may find it helpful to familiarize yourself with bootlin, a popular online kernel-source navigator. Be sure to select the correct Linux kernel version. In the Spring 2024 semester, you will be working with version v5.10.158.

Note that the linux macros/functions contain type/error-checking code to make the generic implementation safer to use. You may not need that for your type-specific code. Also, don’t worry about replicating READ/WRITE_ONCE – the kernel uses these macros to ensure safe concurrent access.

Part 4: Task List

In Linux, processes are represented by the struct task_struct, which is defined in <linux/sched.h>. Each task_struct represents a node in the process tree. To implement this tree data structure, it uses two struct list_head fields: children and sibling. children is the list head to the process’s list of children, while sibling connects each process to its siblings, as well to its parent’s children field.

For this part, the skeleton code we have provided is a minimal kernel module boilerplate. Your task is to write a kernel module, tasklist. Upon initialization, it performs a depth-first traversal of the process tree, and prints out the process tree. You’re not supposed to use recursion in the linux kernel, but for the sake of simplicity, it’s fine for this assignment. This module should do nothing on exit.

You should start at the root of the process tree, which you may access via the struct task_struct init_task variable exposed by <linux/sched/task.h>. For each process, you should print out its PID and the process name. You may obtain these from the pid and comm fields of the task_struct.

When you print out the processes, you should also display the hierarchical tree. Here is an example output:

[ 9303.501734] Removing Module
[ 9304.846569] Loading Module
[ 9304.846574] -- [0] swapper/0
[ 9304.846576]     \_ [1] systemd
[ 9304.846578]         \_ [95] systemd-journal
[ 9304.846579]         \_ [119] systemd-udevd
[ 9304.846581]             \_ [14948] systemd-udevd
[ 9304.846583]         \_ [150] VBoxService
[ 9304.846585]         \_ [152] systemd-logind
[ 9304.846586]         \_ [153] avahi-daemon
[ 9304.846588]             \_ [160] avahi-daemon
[ 9304.846590]         \_ [154] dbus-daemon
[ 9304.846592]         \_ [155] dhcpcd
[ 9304.846593]         \_ [162] login
[ 9304.846595]             \_ [365] bash
[ 9304.846597]                 \_ [441] startx
[ 9304.846599]                     \_ [458] xinit
[ 9304.846601]                         \_ [459] X
[ 9304.846603]                         \_ [462] sh
[ 9304.846605]                             \_ [474] xfce4-session
[ 9304.846607]                                 \_ [488] xfwm4
[ 9304.846609]                                 \_ [490] Thunar
[ 9304.846611]                                 \_ [491] xfce4-panel
[ 9304.846614]                                     \_ [512] panel-16-systra
[ 9304.846616]                                     \_ [516] panel-17-action
[ 9304.846618]                                 \_ [493] xfdesktop
[ 9304.846621]                                 \_ [496] xfce4-terminal
[ 9304.846623]                                     \_ [556] gnome-pty-helpe
[ 9304.846625]                                     \_ [558] bash
[ 9304.846627]                                     \_ [570] bash
[ 9304.846629]                                     \_ [581] bash
[ 9304.846631]                                     \_ [587] bash
[ 9304.846633]                                         \_ [15189] sudo
[ 9304.846635]                                             \_ [15190] insmod
[ 9304.846638]                                     \_ [8939] bash
[ 9304.846639]         \_ [167] sshd
[ 9304.846641]         \_ [182] rpcbind
[ 9304.846642]         \_ [193] ntpd
[ 9304.846644]         \_ [174] automount
[ 9304.846645]         \_ [195] rpc.gssd
[ 9304.846646]         \_ [362] systemd
[ 9304.846648]             \_ [364] (sd-pam)
[ 9304.846649]         \_ [468] dbus-daemon
[ 9304.846651]         \_ [467] dbus-launch
[ 9304.846652]         \_ [476] polkitd
[ 9304.846654]         \_ [484] xfconfd
[ 9304.846655]         \_ [487] gpg-agent
[ 9304.846657]         \_ [494] xfsettingsd
[ 9304.846659]         \_ [503] xfce4-power-man
[ 9304.846660]         \_ [525] upowerd
[ 9304.846661]         \_ [548] VBoxClient
[ 9304.846662]         \_ [561] VBoxClient
[ 9304.846663]         \_ [578] VBoxClient
[ 9304.846664]         \_ [593] VBoxClient
[ 9304.846665]         \_ [632] rpc.statd
[ 9304.846667]         \_ [1233] at-spi-bus-laun
[ 9304.846668]         \_ [1220] firefox
[ 9304.846669]     \_ [2] kthreadd
[ 9304.846670]         \_ [3] ksoftirqd/0
[ 9304.846671]         \_ [5] kworker/0:0H
[ 9304.846672]         \_ [6] kworker/u2:0
[ 9304.846673]         \_ [7] migration/0
[ 9304.846674]         \_ [8] rcu_bh
[ 9304.846675]         \_ [9] rcu_sched
[ 9304.846676]         \_ [10] watchdog/0
[ 9304.846677]         \_ [11] khelper
[ 9304.846679]         \_ [12] kdevtmpfs
[ 9304.846680]         \_ [13] netns
[ 9304.846681]         \_ [14] writeback
[ 9304.846682]         \_ [15] bioset
[ 9304.846683]         \_ [16] kblockd
[ 9304.846684]         \_ [18] khungtaskd
[ 9304.846685]         \_ [19] kswapd0
[ 9304.846686]         \_ [20] ksmd
[ 9304.846687]         \_ [21] khugepaged
[ 9304.846688]         \_ [22] fsnotify_mark
[ 9304.846689]         \_ [23] crypto
[ 9304.846690]         \_ [27] kthrotld
[ 9304.846691]         \_ [28] deferwq
[ 9304.846693]         \_ [52] ata_sff
[ 9304.846694]         \_ [53] khubd
[ 9304.846695]         \_ [54] scsi_eh_0
[ 9304.846696]         \_ [55] scsi_eh_1
[ 9304.846697]         \_ [56] kworker/u2:2
[ 9304.846698]         \_ [57] scsi_eh_2
[ 9304.846699]         \_ [64] kworker/0:1H
[ 9304.846700]         \_ [72] jbd2/sda1-8
[ 9304.846701]         \_ [73] ext4-dio-unwrit
[ 9304.846702]         \_ [99] rpciod
[ 9304.846703]         \_ [107] nfsiod
[ 9304.846704]         \_ [112] iprt
[ 9304.846705]         \_ [142] kpsmoused
[ 9304.846706]         \_ [8798] lockd
[ 9304.846707]         \_ [11501] kworker/0:1
[ 9304.846709]         \_ [12139] kworker/0:2
[ 9304.846710]         \_ [12956] kworker/0:3

The format doesn’t have to be exactly the same, but the output must clearly show the hierarchical tree.

To avoid printing too much whitespace to your kernel log buffer, you must set a maximum indentation level of 20. This means that processes deeper down in the tree will only receive 20 levels worth of indentation. Use the following formula to calculate the indentation per level, where each indent is 4 spaces: num_spaces = 4 * min(cur_level, 20).

You can test your module by running the following command: ps -e --forest. This will display a process tree which you can compare against your module’s output. Note that ps doesn’t show the swapper task, so the output will look slightly different. You can also start new processes and verify that your module displays them.


Acknowledgements

Parts 2 and 4 adapted from the programming projects in chapters 2 and 3 of Operating Systems Concepts by Abraham Silberschatz, Peter B. Galvin, and Greg Gagne.

This assignment was originally developed for use by Jae Woo Lee and previous OS TAs.


Last updated: 2024-01-18