Cooperative Multitasking
The idea of cooperative multitasking is that, at certain points in its execution, a task voluntarily yields control back to the kernel. This is done by invoking a system call, typically called yield. Invoking a system call that blocks is also considered a yield. The kernel then decides which task to run next, and returns control to that task.
The advantage of cooperative multitasking is that it is very simple to implement. The disadvantage is that if a task does not yield, it will never be preempted. This means that a single task can monopolize the CPU, and the system will become unresponsive. In preemptive multitasking, the kernel uses a timer to interrupt the currently running task, and preempt it if its time slice has expired, or if a higher priority task is ready to run. This ensures that no task can monopolize the CPU. We'll see how to implement preemptive multitasking later.
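Before touching the kernel, here's a minimal, host-runnable sketch of the idea using Nim's closure iterators (this is illustration only, not kernel code; CoTask, newTask, and run are made-up names): each task voluntarily yields, and a plain loop plays the role of the kernel, resuming tasks in round-robin order.
# cooperative_demo.nim -- illustration only, runs on the host
import std/deques

type CoTask = iterator (): string {.closure.}

proc newTask(name: string): CoTask =
  # each "task" voluntarily gives up control at every `yield`
  result = iterator (): string =
    for step in 1 .. 2:
      yield name & ": step " & $step

proc run(tasks: varargs[CoTask]) =
  # the "kernel": resume each task in turn until all have finished
  var ready = tasks.toDeque
  while ready.len > 0:
    var t = ready.popFirst()
    let msg = t()            # resume the task up to its next yield
    if not finished(t):
      echo msg
      ready.addLast(t)       # the task yielded; put it back in the queue

run(newTask("A"), newTask("B"))  # prints A's and B's steps interleaved
If a task's iterator body never reaches a yield, the loop above never regains control, which is exactly the weakness of cooperative multitasking described here.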
Scheduling
We'll add a new kernel component called the scheduler. The scheduler is responsible for keeping track of all the tasks in the system, and deciding which task to run next. It is invoked by the kernel at certain points, such as when a task yields or blocks. This means it needs to keep track of the currently running task, and the list of ready tasks. Upon invocation, it will decide which task to run next based on some strategy, and switch to that task. The simplest strategy is round-robin scheduling, where the scheduler simply runs each task in turn. There are many other strategies, but we'll start with round-robin.
To make it easy to manage the ready tasks, we'll use a task queue. The current task will be stored in a global variable, outside the queue. Here's how we're going to make scheduling decisions (a short example trace follows the list):
- When the scheduler is invoked (e.g. when a task yields), it will add the current task to the end of the queue, and then remove the first task from the queue, assign it to the current task, and switch to it.
- If the queue is empty, the scheduler will simply return, and the current task will continue running.
- When a task exits, we simply won't add it back to the queue, and the next task in the queue will be run.
- If a task exits and there are no more tasks in the queue, we'll simply halt the CPU.
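To make these rules concrete, here's a hypothetical trace with two tasks, A and B, where each yields once and then exits:
start:      current = A, ready = [B]
A yields -> enqueue A, pop B:        current = B, ready = [A]
B yields -> enqueue B, pop A:        current = A, ready = [B]
A exits  -> A not re-queued, pop B:  current = B, ready = []
B exits  -> B not re-queued, queue empty: halt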
Before we start implementing the scheduler, we need to make some changes to the Task type. We need to track the task state, which can be New, Ready, Running, or Terminated (for now). Since the scheduler will need to check whether the current task is nil, Task also needs to be a ref object. I'm also going to change the layout of the Task type by making the rsp field the first field, so that we can easily access it from inline assembly later. Here's the updated tasks.nim module:
# src/kernel/tasks.nim
type
  TaskState* = enum
    New, Ready, Running, Terminated

  Task* = ref object
    rsp*: ptr uint64
    state*: TaskState
    # other fields...
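As an aside, here's a quick host-side sketch (hypothetical, not kernel code) of why putting rsp first is convenient: the address of a Task is also the address of its rsp field, so assembly that only has the object's address can reach the saved stack pointer without knowing any field offsets.
# host-side illustration only
type
  TaskState = enum
    New, Ready, Running, Terminated
  Task = ref object
    rsp: ptr uint64
    state: TaskState

var t = Task(state: New)
# with rsp as the first field, the object's address and rsp's address coincide
doAssert cast[uint64](addr t.rsp) == cast[uint64](addr t[])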
Let's start by creating a new module called sched.nim in the kernel directory. We'll define a task queue, a global variable to store the current task, and a proc to add a task to the queue:
# src/kernel/sched.nim
import std/deques

import tasks

var
  readyTasks = initDeque[Task]()
  currentTask* {.exportc.}: Task

proc addTask*(t: Task) =
  readyTasks.addLast(t)
If you remember, we had already defined a currentTask variable in the tasks.nim module. We'll make some changes to that module later when we get to context switching. I'm annotating currentTask with the exportc pragma since we'll need to access it from inline assembly later. Let's now add the main scheduler proc:
# src/kernel/sched.nim
...

proc schedule*() =
  if readyTasks.len == 0:
    if currentTask.isNil or currentTask.state == Terminated:
      debugln &"sched: no tasks to run, halting"
      halt()
    else:
      # no ready tasks, keep running the current task
      return

  if not (currentTask.isNil or currentTask.state == Terminated):
    # put the current task back into the queue
    currentTask.state = TaskState.Ready
    readyTasks.addLast(currentTask)

  # switch to the first task in the queue
  var nextTask = readyTasks.popFirst()
  switchTo(nextTask)
The current implementation of switchTo (in tasks.nim) only knows how to switch to a new task. We'll need to change it to perform an actual context switch.
Context Switching
When switching between tasks, we need to save the state of the currently running task, and restore the state of the task that is about to run. The task state typically includes the CPU registers and the stack pointer. We don't need to save the instruction pointer, because once we swap the stack pointers, the old task resumes execution at the same point where its stack pointer was swapped out previously, and will continue as if nothing had happened. It's this swapping of stack pointers that causes the context switch.
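To see why the instruction pointer comes along for free, consider what the old task's kernel stack roughly looks like at the moment we save its rsp (this is a simplified picture that ignores compiler-generated frame setup): the return addresses pushed by the call chain that led into the scheduler are already on that stack.
(higher addresses)
  ... rest of the old task's kernel stack ...
  return address into the yield/exit handler   <- pushed by its call to schedule
  return address into schedule                 <- pushed by its call to switchTo
  general-purpose registers                    <- pushed by pushRegs (shown below)
(lower addresses)                              <- rsp at the moment we save it
When the task is later resumed, the registers are popped and switchTo returns normally, popping those return addresses one by one until execution is back on the system call path, so no explicit instruction pointer bookkeeping is needed.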
Let's create a new module ctxswitch.nim to handle context switching. We'll move the switchTo proc from tasks.nim to ctxswitch.nim, and modify it to handle switching between tasks.
When the current task is not nil or terminated, we'll save its stack pointer and register state, regardless of whether we're switching to a new task or not. When we're switching to a new task, we'll simply load the new task's stack pointer and iretq to return to user mode. When we're switching to an existing task, we'll restore its stack pointer and register state and return normally.
Here's the modified switchTo proc:
# src/kernel/ctxswitch.nim
import cpu
import gdt
import tasks
import vmm

var
  currentTask {.importc.}: Task

proc switchTo*(next: var Task) =
  tss.rsp0 = next.kstack.bottom
  setActivePML4(next.space.pml4)

  if not (currentTask.isNil or currentTask.state == TaskState.Terminated):
    pushRegs()
    asm """
      mov %0, rsp
      : "=m" (`currentTask`->rsp)
    """

  currentTask = next

  case next.state
  of TaskState.New:
    next.state = TaskState.Running
    asm """
      mov rsp, %0
      iretq
      :
      : "m" (`currentTask`->rsp)
    """
  else:
    next.state = TaskState.Running
    asm """
      mov rsp, %0
      :
      : "m" (`currentTask`->rsp)
    """
    popRegs()
Let's define pushRegs and popRegs, but instead of defining them in this module, we'll put them in the cpu.nim module, where they belong. Here, I'll be using Nim templates instead of procs to avoid the overhead of calling a proc.
# src/kernel/cpu.nim

template pushRegs*() =
  asm """
    push rax
    push rbx
    push rcx
    push rdx
    push rsi
    push rdi
    push rbp
    push r8
    push r9
    push r10
    push r11
    push r12
    push r13
    push r14
    push r15
  """

template popRegs*() =
  asm """
    pop r15
    pop r14
    pop r13
    pop r12
    pop r11
    pop r10
    pop r9
    pop r8
    pop rbp
    pop rdi
    pop rsi
    pop rdx
    pop rcx
    pop rbx
    pop rax
  """
yield System Call
To allow a task to yield control to the kernel, we'll add a new system call called yield. When a task invokes this system call, the kernel simply calls the scheduler to switch to the next task. Let's add it to the syscalls.nim module:
# src/kernel/syscalls.nim
...

proc `yield`*(args: ptr SyscallArgs): uint64 {.cdecl.} =
  debugln &"syscall: yield"
  schedule()

proc syscallInit*() =
  # set up syscall table
  syscallTable[1] = exit
  syscallTable[2] = print
  syscallTable[3] = `yield`
  ...
Notice that we have to quote the yield proc name because it's a reserved keyword in Nim. Now, tasks can invoke syscall 3 (with no arguments) to yield control to the kernel. Let's add this syscall to our user task:
# src/user/utask.nim
...

proc UserMain*() {.exportc.} =
  NimMain()

  asm """
    # call print
    ...

    # call yield
    mov rdi, 3
    syscall

    # call exit
    ...
  """
The task will now print something, yield control to the kernel, and then exit.
Handling Task Exits
When the current task calls the exit system call, we should also invoke the scheduler, but the task shouldn't be put back into the queue. We can do this by setting the task's state to Terminated before invoking the scheduler. Terminating a task may involve other steps later (e.g. freeing its memory), so let's add a terminateTask proc to the tasks.nim module:
# src/kernel/tasks.nim

proc terminateTask*(t: var Task) =
  t.state = TaskState.Terminated
  # other cleanup...
Now, let's modify the exit syscall to call terminateTask before invoking the scheduler:
# src/kernel/syscalls.nim

proc exit*(args: ptr SyscallArgs): uint64 {.cdecl.} =
  debugln &"syscall: exit: code={args.arg1}"
  terminateTask(currentTask)
  schedule()
Running Multiple Tasks
Let's now create two user tasks, add them to the task queue, and invoke the scheduler.
# src/kernel/main.nim
...

proc KernelMainInner(bootInfo: ptr BootInfo) =
  ...

  debugln "kernel: Creating user tasks"
  var task1 = createTask(
    imagePhysAddr = bootInfo.userImagePhysicalBase.PhysAddr,
    imagePageCount = bootInfo.userImagePages,
  )
  var task2 = createTask(
    imagePhysAddr = bootInfo.userImagePhysicalBase.PhysAddr,
    imagePageCount = bootInfo.userImagePages,
  )

  debugln "kernel: Adding tasks to scheduler"
  sched.addTask(task1)
  sched.addTask(task2)

  debugln "kernel: Starting scheduler"
  sched.schedule()
Let's run it and see what happens.
kernel: Creating user tasks
kernel: Applying relocations to user image
kernel: Applying relocations to user image
kernel: Adding tasks to scheduler
kernel: Starting scheduler
sched: switching -> 0
syscall: print
Hello from user mode!
syscall: yield
sched: switching 0 -> 1
syscall: print
Hello from user mode!
syscall: yield
sched: switching 1 -> 0
syscall: exit: code=0
sched: switching 0 -> 1
syscall: exit: code=0
sched: no tasks to run, halting
It works! The scheduler first runs task 0, which prints a message, then yields control to the kernel. The scheduler then switches to task 1, which prints another message, and then yields control back to the kernel. The scheduler then switches back to task 0, which calls exit, which terminates the task. The scheduler then switches to task 1, which also calls exit and terminates the task. Since there are no more tasks in the queue, the scheduler halts the CPU.
Let's add a third task for fun and see what happens.
kernel: Adding tasks to scheduler
kernel: Starting scheduler
sched: switching -> 0
syscall: print
Hello from user mode!
syscall: yield
sched: switching 0 -> 1
syscall: print
Hello from user mode!
syscall: yield
sched: switching 1 -> 2
syscall: print
Hello from user mode!
syscall: yield
sched: switching 2 -> 0
syscall: exit: code=0
sched: switching 0 -> 1
syscall: exit: code=0
sched: switching 1 -> 2
syscall: exit: code=0
sched: no tasks to run, halting
It works! The scheduler runs all three tasks in a round-robin fashion, and then halts the CPU when there are no more tasks to run. This is exciting! We now have a simple cooperative multitasking system.
Now that we have a few system calls, it's time to add a system library that can be used by user tasks to invoke these system calls, instead of using inline assembly. We'll do that in the next chapter.
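As a preview, a user-space wrapper for yield might look roughly like the sketch below; this is hypothetical (including the yld name and the syslib.nim file), it just follows the calling convention used in the inline assembly above (syscall number in rdi), and the actual library in the next chapter may be organized differently.
# hypothetical sketch, e.g. src/user/syslib.nim
proc yld*() =
  # invoke syscall 3 (yield); spelled yld to avoid Nim's reserved keyword
  asm """
    mov rdi, 3
    syscall
  """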