In systems software, a scheduler is used to decide which processes should be running. This is especially important in multi-process systems. At minimum, the scheduling loop:
- Pauses the currently running process.
- Saves its state so it can be restored later; at minimum, this means the current register values.
- Gets the next process to run.
- Loads the next process’ state and lets that run.
This swap is called a context switch. Executing it is pure overhead, so we want it to be as fast as possible (hardware support helps here). The goal of a good scheduler is to minimise turnaround time (time to complete a process) and response time (average time from submission until the first response).
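As a rough illustration, here is that loop as a minimal user-space simulation in C (the `Process` struct and `run_for_a_while` helper are invented for this sketch; a real kernel saves and restores actual registers, usually in assembly):

```c
#include <stdio.h>

/* Hypothetical, simplified process control block: just enough state to
 * "restore" the process later. */
typedef struct {
    int pid;
    int pc;          /* saved program counter (simulated) */
    int remaining;   /* remaining work, in arbitrary units */
} Process;

/* Simulate running a process for one unit of work. */
static void run_for_a_while(Process *p) {
    p->pc += 1;
    p->remaining -= 1;
    printf("running pid %d (pc=%d, remaining=%d)\n", p->pid, p->pc, p->remaining);
}

int main(void) {
    Process table[] = { {1, 0, 2}, {2, 0, 3}, {3, 0, 1} };
    int n = 3, alive = 3, next = 0;

    /* The scheduling loop: pick the next runnable process, let it run for
     * a while, then "save" its state by leaving it in the process table. */
    while (alive > 0) {
        Process *p = &table[next];
        if (p->remaining > 0) {
            run_for_a_while(p);
            if (p->remaining == 0)
                alive--;
        }
        next = (next + 1) % n;   /* naive choice of the next process */
    }
    return 0;
}
```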
Mechanisms
We define a few things:
- Burst time — the total amount of CPU time required to execute the whole process. Note that it isn't practically possible to know this before executing, so we don't actually implement burst-time-based scheduling mechanisms directly in practice.
- Starvation — a process might never get the CPU again and so never finish. This can be a result of the scheduling policy.
The waiting time for a task is given by:

$$T_{\text{waiting}} = T_{\text{turnaround}} - T_{\text{burst}} = (T_{\text{completion}} - T_{\text{arrival}}) - T_{\text{burst}}$$
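For example, a process that arrives at $t = 0$ with a burst time of 10 and completes at $t = 18$ has a turnaround time of 18, so its waiting time is $18 - 10 = 8$.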
There are several basic scheduling mechanisms (without pre-emption):
- First-come first-served (FCFS) — the first process to arrive gets the CPU. Subsequent processes are stored in arrival order in a FIFO queue.
- Shortest job first (SJF) — we always schedule the job with the shortest burst time first, and assume no pre-emption.
- This is provably optimal at minimising average wait time when there is no pre-emption and all jobs arrive together (see the worked example after this list).
- Note that long jobs may be starved (i.e., they may never execute).
- Shortest remaining time first (SRTF) — in this case, we allow pre-emption. The process with the shortest remaining time is always scheduled (effectively a priority queue keyed on remaining time). If a new process arrives needing less time than the currently running one has left, the new process takes the CPU.
- This can again starve long jobs.
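As a worked example of why shortest-first helps: suppose three jobs with burst times 100, 10, and 1 all arrive at $t = 0$. Under FCFS in that arrival order, the waiting times are 0, 100, and 110 (average 70); under SJF the order is 1, 10, 100, giving waiting times 0, 1, and 11 (average 4). The long job waits only slightly longer, but if short jobs keep arriving it may be starved indefinitely.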
The downfalls of these motivate the idea of more complicated scheduling mechanisms:
- Round-robin (RR) — divides execution into time slices (or quanta) of a given length $q$. We maintain a queue of processes, but only execute each process for a maximum of $q$ time units at a time. If it's still running after that, we pre-empt it and re-add it to the back of the queue.
- Fair scheduling — aims to give each process a proportional share of CPU time.
- Scheduling for multiprocessor systems — increasingly important, as modern CPUs have many cores and GPUs are massively parallel; this requires new approaches to scheduling.
- Multi-level feedback queue (MLFQ) — uses several distinct queues, one per priority level; each queue might define a different time slice length (see the sketch after this list).
- If a job enters the system, it’s placed at the highest priority.
- If a process uses an entire time slice while running, its priority is reduced. If a process gives up the CPU before the time slice is done (e.g., because it blocks on I/O), its priority stays the same.
- After some time period $S$, we move all jobs to the topmost queue (to prevent starvation).
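A minimal sketch of these MLFQ rules as a user-space simulation in C (the `Job` struct, the slice lengths, and the boost period are invented for illustration; a real implementation would also handle blocking, arrivals, and per-queue FIFO order):

```c
#include <stdio.h>

#define LEVELS 3              /* number of priority queues (0 = highest) */

/* Hypothetical job descriptor for this simulation only. */
typedef struct {
    int id;
    int remaining;            /* remaining burst, in ticks */
    int level;                /* current priority level */
} Job;

int main(void) {
    int slice[LEVELS] = { 1, 2, 4 };        /* longer slices at lower priority */
    Job jobs[] = { {1, 3, 0}, {2, 6, 0} };  /* new jobs start at the top level */
    int n = 2, done = 0, tick = 0;
    int boost_period = 10, next_boost = boost_period;

    while (done < n) {
        /* Pick the first unfinished job at the highest-priority level. */
        Job *j = NULL;
        for (int lvl = 0; lvl < LEVELS && !j; lvl++)
            for (int i = 0; i < n && !j; i++)
                if (jobs[i].remaining > 0 && jobs[i].level == lvl)
                    j = &jobs[i];

        /* Run it for at most one full time slice of its level. */
        int q = slice[j->level];
        int ran = (j->remaining < q) ? j->remaining : q;
        j->remaining -= ran;
        tick += ran;
        printf("t=%2d: job %d ran %d tick(s) at level %d\n", tick, j->id, ran, j->level);

        if (j->remaining == 0)
            done++;
        else if (ran == q && j->level < LEVELS - 1)
            j->level++;       /* used the whole slice: demote one level */
        /* A job that gave up the CPU early would keep its level. */

        /* Periodic boost: move every job back to the topmost queue. */
        if (tick >= next_boost) {
            for (int i = 0; i < n; i++)
                jobs[i].level = 0;
            next_boost += boost_period;
        }
    }
    return 0;
}
```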
Priorities
We could also assign each process a priority and do round-robin among the highest-priority processes. In Linux, lower numbers (-20 minimum) mean higher priority and higher numbers (19 maximum) mean lower priority. This may again starve processes if there are many higher-priority ones. One way to mitigate this is to change priorities dynamically (e.g., processes that haven't been executed in a long time have their priority increased), based on observing how processes have behaved so far.
A high-priority process can end up stuck behind a low-priority one when it depends on a resource that the low-priority process holds (priority inversion). One solution to this is priority inheritance, where the process holding the resource temporarily inherits the highest priority of the processes waiting on it, chaining multiple inheritances together if needed. It reverts to its original priority once the dependency is resolved.
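On POSIX systems that support it, priority inheritance can be requested per mutex; a minimal sketch of the setup (single-threaded here just to show the calls; compile with `-pthread`):

```c
#include <pthread.h>
#include <stdio.h>

int main(void) {
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;

    pthread_mutexattr_init(&attr);
    /* Ask for the priority-inheritance protocol on this mutex. */
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0) {
        fprintf(stderr, "PTHREAD_PRIO_INHERIT not supported\n");
        return 1;
    }
    pthread_mutex_init(&lock, &attr);

    /* While a low-priority thread holds `lock` and a higher-priority thread
     * blocks on it, the holder temporarily runs at the waiter's priority,
     * then reverts to its own priority when it unlocks. */
    pthread_mutex_lock(&lock);
    puts("in critical section");
    pthread_mutex_unlock(&lock);

    pthread_mutex_destroy(&lock);
    pthread_mutexattr_destroy(&attr);
    return 0;
}
```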
We should also separate foreground processes (interactive, needing good response time) from background processes (where throughput matters more than response time). Interactive processes tend to give up the CPU frequently (for example, while waiting for I/O input).
Programs can, in principle, set their own priorities (niceness).
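For instance, on Linux a process can make itself "nicer" via the standard `getpriority`/`setpriority` calls (a minimal sketch; the `nice()` call does the same thing for the calling process):

```c
#include <errno.h>
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    /* Current niceness of this process (who = 0 means the calling process).
     * getpriority() can legitimately return -1, so check errno explicitly. */
    errno = 0;
    int before = getpriority(PRIO_PROCESS, 0);
    if (before == -1 && errno != 0) {
        perror("getpriority");
        return 1;
    }

    /* Ask to be "nicer" (lower priority). Raising priority (more negative
     * nice values) normally requires elevated privileges. */
    if (setpriority(PRIO_PROCESS, 0, before + 5) != 0) {
        perror("setpriority");
        return 1;
    }

    printf("niceness: %d -> %d\n", before, getpriority(PRIO_PROCESS, 0));
    return 0;
}
```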
Context switch
In theory, all a context switch needs to do is save a few register values for the currently executing process somewhere (such as onto its kernel stack, on x86): the program counter, and the positions of the stack and code in memory.
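As a rough idea of what actually gets saved, here is a context structure in the style of xv6's 32-bit x86 kernel (a sketch, not the full picture: the actual save and restore is a few lines of assembly, and user-level registers are saved separately when entering the kernel):

```c
/* Callee-saved registers kept across a kernel context switch on 32-bit x86,
 * modelled on xv6's `struct context`. Caller-saved registers are already on
 * the kernel stack, and the stack pointer is recorded as the address of
 * this structure. */
struct context {
    unsigned int edi;
    unsigned int esi;
    unsigned int ebx;
    unsigned int ebp;
    unsigned int eip;   /* where to resume when this process is switched back in */
};
```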