2023/01/14 - xpost: Cooperative-multitasking studies (matcha-threads)

This is a cross-post of a cooperative-multitasking implementation that I wrote a while ago and host on GitHub >> matcha-threads <<.

Cooperative multitasking moves thread scheduling into user space. A running thread must explicitly hand control back to the scheduler so that other threads can run. Because this hand-over is voluntary, the threads need to "cooperate".

The following code snippet shows an example of two such threads:

#include "lib/executor.h"
#include "lib/thread.h"
#include <cstdio>

void like_tea(nMatcha::Yielder& y) {
    std::puts("like");
    y.yield();
    std::puts("tea");
}

int main() {
    nMatcha::Executor e;
    e.spawn(nMatcha::FnThread::make(like_tea));
    e.spawn(nMatcha::FnThread::make([](nMatcha::Yielder& y) {
        std::puts("I");
        y.yield();
        std::puts("green");
    }));
    e.run();
    return 0;
}

Running it prints the following output:

I
like
green
tea

The main focus of the project was to understand the fundamental mechanism underlying cooperative multitasking and to implement a yield() function like the one used in the example above.

Looking at the final implementation, the yield function does the following:

push callee-saved regs to current stack
swap stack pointers (current <-> new)
pop callee-saved regs from new stack
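
To make the data movement in these three steps visible, here is a toy model in plain C++. It is only a simulation and not how the project implements yield (that is done in assembly per ISA). The names `ToyStack`, `Cpu` and `toy_yield` are made up for illustration, and the register count of six is an assumption matching the x86-64 System V callee-saved set (rbx, rbp, r12-r15).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Assumption: six callee-saved registers, as on x86-64 System V.
constexpr std::size_t kCalleeSaved = 6;

// A stack modelled as a vector of words; it grows downwards like a real one.
struct ToyStack {
    std::vector<std::uint64_t> mem;
    std::size_t sp;  // index of the current top word; == mem.size() when empty

    explicit ToyStack(std::size_t words) : mem(words, 0), sp(words) {}

    void push(std::uint64_t v) {
        assert(sp > 0);
        mem[--sp] = v;
    }
    std::uint64_t pop() {
        assert(sp < mem.size());
        return mem[sp++];
    }
};

// A "CPU" with callee-saved registers; `stack` stands in for the stack pointer.
struct Cpu {
    std::array<std::uint64_t, kCalleeSaved> regs{};
    ToyStack* stack = nullptr;
};

// Models yield: `other` is the saved stack of the thread to switch to.
// After the call, `other` holds the stack of the thread that yielded.
void toy_yield(Cpu& cpu, ToyStack*& other) {
    // 1. push callee-saved regs to current stack
    for (std::size_t i = 0; i < kCalleeSaved; ++i) cpu.stack->push(cpu.regs[i]);
    // 2. swap stack pointers (current <-> new)
    std::swap(cpu.stack, other);
    // 3. pop callee-saved regs from new stack (reverse order of the pushes)
    for (std::size_t i = kCalleeSaved; i-- > 0;) cpu.regs[i] = cpu.stack->pop();
}
```

Note that the stack being switched to must already hold a set of saved register values, otherwise step 3 would pop from an empty stack. In the real implementation the final `ret` of yield additionally pops a return address from the new stack, which the model leaves out.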

Implementations for different ISAs are available here:

After yield switches the stack pointers, execution returns into the most recent stack frame of the thread being switched to. A freshly created thread has no such frame yet, so special care must be taken when a new stack is created.

The initial stack is set up such that, when yielding into the new stack for the first time, it contains the initial values for the callee-saved registers, which yield will restore, followed by a return frame holding the address at which execution should continue when yield returns.

An example of setting up the initial stack can be seen in init_stack (x86). It also shows that, the first time a thread is switched to, yield returns into thread_create, which simply calls the function passed to init_stack.
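
The layout described above can be sketched in C++ as follows. This is a hedged illustration, not the project's init_stack: the names `sketch_init_stack`, `EntryFn` and `sketch_thread_entry` are hypothetical, the layout assumes a 64-bit target with six callee-saved registers (x86-64 System V), and ABI stack-alignment requirements are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type of the function that the first return from yield lands in.
using EntryFn = void (*)();

// Assumption: six callee-saved registers (rbx, rbp, r12-r15 on x86-64 SysV).
constexpr std::size_t kCalleeSaved = 6;

// Placeholder entry function, only used to have an address to store.
void sketch_thread_entry() {}

// Prepares a fresh stack so that the first yield into it pops zeroed
// callee-saved registers and then returns to `entry`.
// `base` is the lowest address of the stack memory, `words` its size in
// 64-bit words. Returns the initial stack pointer for the new thread.
// Note: a real implementation must also honour the ABI's stack alignment.
std::uint64_t* sketch_init_stack(std::uint64_t* base, std::size_t words, EntryFn entry) {
    assert(base != nullptr);
    assert(words >= kCalleeSaved + 1);

    std::uint64_t* sp = base + words;  // stacks grow downwards from the top

    // Return frame: the address that yield's final `ret` will jump to.
    *--sp = static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(entry));

    // Initial values for the callee-saved registers that yield will pop.
    for (std::size_t i = 0; i < kCalleeSaved; ++i) *--sp = 0;

    return sp;  // lowest slot: the first register value to be restored
}
```

Read from the returned stack pointer upwards, the memory holds six zeroed register slots followed by the entry address, which is the shape a yield expects to find on any stack it switches to.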

Appendix: os-level vs user-level threading

The figure below depicts os-level threading (left) vs user-level threading (right).

The main difference is that with user-level threading, the operating system (os) does not know anything about the user threads. In the concrete example, only a single user thread can run at any given time, whereas with os-level threading, all threads can run truly in parallel.

When a user-level thread is scheduled, the stack-pointer (sp) of the os thread is adjusted to point into the user thread's stack. For the example below, if user thread A is scheduled (yielded to), the stack-pointer of os thread S is switched to stack A. Once the user thread yields back to the scheduler, the stack-pointer is switched back to stack S.
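
The switch between stack S and stack A can be observed with the POSIX `ucontext` API, which is used here as a stand-in for the hand-written yield (matcha-threads does not use it). The sketch assumes Linux with glibc. The names in the `sketch` namespace are made up for illustration.

```cpp
#include <string>
#include <ucontext.h>
#include <vector>

namespace sketch {

ucontext_t sched_ctx;   // scheduler, runs on the os thread's own stack S
ucontext_t thread_ctx;  // user thread A, runs on its own stack A
std::vector<std::string> trace;

void thread_a() {
    trace.push_back("A: step 1");
    // Yield: save A's context and switch the stack pointer back to stack S.
    swapcontext(&thread_ctx, &sched_ctx);
    trace.push_back("A: step 2");
    // Returning ends thread A; uc_link resumes the scheduler context.
}

std::vector<std::string> run() {
    static char stack_a[64 * 1024];  // memory backing stack A
    trace.clear();

    getcontext(&thread_ctx);
    thread_ctx.uc_stack.ss_sp = stack_a;
    thread_ctx.uc_stack.ss_size = sizeof(stack_a);
    thread_ctx.uc_link = &sched_ctx;  // where to continue when thread_a returns
    makecontext(&thread_ctx, thread_a, 0);

    trace.push_back("S: schedule A");
    swapcontext(&sched_ctx, &thread_ctx);  // sp now points into stack A
    trace.push_back("S: A yielded");
    swapcontext(&sched_ctx, &thread_ctx);  // resume A after its yield
    trace.push_back("S: A finished");
    return trace;
}

}  // namespace sketch
```

Calling `sketch::run()` returns the interleaved trace of scheduler and thread steps. Only one os thread is involved throughout, so the operating system never sees thread A as a separate thread.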