[root@memzero]# ls

2023/01/14 - xpost: Cooperative-multitasking studies (matcha-threads)

This is a cross-post of a cooperative-multitasking implementation that I wrote a while ago and host on GitHub: >> matcha-threads <<.

Cooperative multitasking performs thread scheduling in user space. A running thread must explicitly hand control back to the scheduler so that other threads can run. Because this hand-over is voluntary, the threads need to "cooperate".

The following code snippet shows an example of two such threads:

#include "lib/executor.h"
#include "lib/thread.h"
#include <cstdio>

void like_tea(nMatcha::Yielder& y) {
    std::puts("like");
    y.yield();
    std::puts("tea");
}

int main() {
    nMatcha::Executor e;
    e.spawn(nMatcha::FnThread::make(like_tea));
    e.spawn(nMatcha::FnThread::make([](nMatcha::Yielder& y) {
        std::puts("I");
        y.yield();
        std::puts("green");
    }));
    e.run();
    return 0;
}

Running this program gives the following output:

I
like
green
tea

The main focus of the project was to understand the fundamental mechanism underlying cooperative multitasking and to implement a yield() function like the one used in the example above.

Looking at the final implementation, the yield function does the following:

yield:
  1. function prologue
  2. push callee-saved regs to current stack
  3. swap stack pointers (current - new)
  4. pop callee-saved regs from new stack
  5. function epilogue & return
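
To make the save / swap / restore sequence above tangible without any assembly, here is a minimal sketch that delegates the register and stack-pointer handling to the POSIX ucontext functions (getcontext, makecontext, swapcontext). This is a swapped-in technique and not how matcha-threads implements yield(). All names in the sketch namespace (Thread, spawn, run, yield, run_demo) are hypothetical and are not the nMatcha API. It assumes a POSIX system such as Linux with glibc.

```cpp
#include <ucontext.h>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the matcha-threads API.
namespace sketch {

constexpr std::size_t kStackSize = 64 * 1024;

struct Thread {
    ucontext_t ctx{};                                         // saved registers + stack pointer
    std::vector<char> stack = std::vector<char>(kStackSize);  // the thread's own stack
    void (*entry)() = nullptr;
    bool done = false;
};

ucontext_t g_scheduler;          // saved context of the scheduler
Thread* g_current = nullptr;     // user thread that is currently running
std::vector<std::string> g_log;  // records the interleaving

// Hand control back to the scheduler: swapcontext() stores the current
// registers and stack pointer in g_current->ctx and loads g_scheduler.
void yield() {
    assert(g_current != nullptr);
    swapcontext(&g_current->ctx, &g_scheduler);
}

// First function executed on a fresh stack. When it returns, the
// context in uc_link (the scheduler) is resumed.
void trampoline() {
    g_current->entry();
    g_current->done = true;
}

// Prepare a thread. The Thread object must not be moved afterwards.
void spawn(Thread& t, void (*fn)()) {
    getcontext(&t.ctx);
    t.ctx.uc_stack.ss_sp = t.stack.data();
    t.ctx.uc_stack.ss_size = t.stack.size();
    t.ctx.uc_link = &g_scheduler;
    t.entry = fn;
    makecontext(&t.ctx, trampoline, 0);
}

// Round-robin scheduler: resume each unfinished thread until all are done.
void run(std::vector<Thread*>& threads) {
    bool pending = true;
    while (pending) {
        pending = false;
        for (Thread* t : threads) {
            if (t->done) continue;
            g_current = t;
            swapcontext(&g_scheduler, &t->ctx);  // switch to the thread's stack
            pending = pending || !t->done;
        }
    }
}

void i_green()  { g_log.push_back("I");    yield(); g_log.push_back("green"); }
void like_tea() { g_log.push_back("like"); yield(); g_log.push_back("tea"); }

// Runs both threads to completion and returns the recorded order.
std::vector<std::string> run_demo() {
    Thread a, b;
    spawn(a, i_green);
    spawn(b, like_tea);
    std::vector<Thread*> threads{&a, &b};
    g_log.clear();
    run(threads);
    return g_log;
}

}  // namespace sketch
```

Calling sketch::run_demo() from a main function returns the four words in interleaved order, because each thread gives up the CPU once in the middle. The scheduling order here is plain round-robin in spawn order, which is a choice of this sketch and says nothing about the order matcha-threads uses.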

Implementations for different ISAs are available here:

Appendix: os-level vs user-level threading

The figure below depicts os-level threading (left) vs user-level threading (right).

The main difference is that with user-level threading, the operating system (os) does not know anything about the user threads. In the concrete example, only a single user thread can run at any given time, whereas with os-level threading, all threads can run truly in parallel.

When a user-level thread is scheduled, the stack-pointer (sp) of the os thread is adjusted to point to the user thread's stack. In the depicted example, if the user thread A is scheduled (yielded to), the stack-pointer of the os thread S is switched to the stack A. Once the user thread yields back to the scheduler, the stack-pointer is switched back to stack S.
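
The stack switch described above can be observed directly by checking where local variables live. The following is a rough sketch, again built on the POSIX ucontext functions rather than on the matcha-threads code, and all names in it (sp_sketch, g_stack_a, on_stack_a, run_demo) are made up for illustration. It records whether a local variable sits inside the buffer backing stack A before user thread A is scheduled, while A runs, and after A has yielded back.

```cpp
#include <ucontext.h>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical names; a sketch assuming a POSIX system with ucontext.
namespace sp_sketch {

std::vector<char> g_stack_a(64 * 1024);  // stack A, owned by user thread A
ucontext_t g_ctx_s;                      // context of the os thread (stack S)
ucontext_t g_ctx_a;                      // context of user thread A
bool g_local_on_a = false;               // set while thread A is running

// True if the address lies inside the buffer backing stack A.
bool on_stack_a(const void* p) {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    auto lo = reinterpret_cast<std::uintptr_t>(g_stack_a.data());
    return addr >= lo && addr < lo + g_stack_a.size();
}

void thread_a() {
    int local = 0;  // lives wherever the stack pointer currently points
    g_local_on_a = on_stack_a(&local);
    swapcontext(&g_ctx_a, &g_ctx_s);  // yield back to the scheduler
}

struct Result {
    bool before;  // local on stack A before A was scheduled?
    bool inside;  // local on stack A while A was running?
    bool after;   // local on stack A after A yielded back?
};

Result run_demo() {
    Result r{};
    int on_s = 0;
    r.before = on_stack_a(&on_s);

    getcontext(&g_ctx_a);
    g_ctx_a.uc_stack.ss_sp = g_stack_a.data();
    g_ctx_a.uc_stack.ss_size = g_stack_a.size();
    g_ctx_a.uc_link = &g_ctx_s;
    makecontext(&g_ctx_a, thread_a, 0);

    swapcontext(&g_ctx_s, &g_ctx_a);  // schedule A: sp now points into stack A

    int on_s_again = 0;               // back on stack S
    r.after = on_stack_a(&on_s_again);
    r.inside = g_local_on_a;
    return r;
}

}  // namespace sp_sketch
```

On a typical Linux setup, sp_sketch::run_demo() should report that only the local variable inside thread_a lies within stack A. The os thread never changes during the run, only the stack it executes on does.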