Operating Systems

Remzi H. Arpaci-Dusseau Andrea C. Arpaci-Dusseau

Virtualization

The OS can virtualize the CPU into the processes thinking like they have the entirety of the CPU just for them. In reality, processes run separately and not at the same time, the OS can stop a process, run another and then come back to the same process and it does this so fast that it creates the illusion that the process has the CPU just for itself. This process is called time sharing of the CPU. To perform this, the OS performs some low-level mechanisms, and these mechanisms only run based on policies (for example, which process has run more over the last minute).

4. The Process

Process address space -> the memory region which the process can address.

Process API

An OS must implement an API to be able to perform the following operation to a process:

  • Create - everytime a program is started, a process is created
  • Destroy - of course, many processes will run and just exit by themselves when complete; when they don’t, the user may wish to kill them, and thus an interface to halt a runaway process is quite useful.
  • Wait - sometimes it is useful to wait for a process to stop running
  • Miscellaneous Control - most operating systems provide some kind of method to suspend a process (stop it from running for a while) and then resume it (continue it running)
  • Status - an interface to get some status information about a process, such as how long it has run for, or what state it is in

Process States

  • Running - a processor is executing the instructions of the current process
  • Ready - the process is ready to run but for some reason the OS has chosen not to run it at the moment
  • Blocked - process has performed some kind of operation that makes it not ready to run until some other event takes place. A common example: when a process initiates an I/O request to a disk, it becomes blocked and thus some other process can use the processor.

A process can be moved between the ready and running states at the discretion of the OS. Being moved from ready to running means the process has been scheduled; being moved from running to ready means the process has been descheduled. Once a process has become blocked (e.g., by initiating an I/O operation), the OS will keep it as such until some event occurs (e.g., I/O completion); at that point, the process moves to the ready state again (and potentially immediately to running again, if the OS so decides).

Time Process0 Process1 Notes
1 Running Ready
2 Running Ready
3 Running Ready Process0 initiates I/O
4 Blocked Running Process0 is blocked,
5 Blocked Running so Process1 runs
6 Blocked Running
7 Ready Running I/O done
8 Ready Running Process1 now done
9 Running
10 Running Process0 now done

Data Structures

The xv6 kernel process structure is very similar to the ones in Linux, MacOS, Windows, etc.

// the registers xv6 will save and restore
// to stop and subsequently restart a process
struct context {
    int eip;
    int esp;
    int ebx;
    int ecx;
    int edx;
    int esi;
    int edi;
    int ebp;
};

// the different states a process can be in
enum proc_state { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

// the information xv6 tracks about each process
// including its register context and state
struct proc {
    char *mem;                      // Start of process memory
    uint sz;                        // Size of process memory
    char *kstack;                   // Bottom of kernel stack
                                    // for this process
    enum proc_state state;          // Process state
    int pid;                        // Process ID
    struct proc *parent;            // Parent process
    void *chan;                     // If non-zero, sleeping on chan
    int killed;                     // If non-zero, have been killed
    struct file *ofile[NOFILE];     // Open files
    struct inode *cwd;              // Current directory
    struct context context;         // Switch here to run process
    struct trapframe *tf;           // Trap frame for the
                                    // current interrupt
};

Sometimes people refer to the individual structure that stores information about a process as a Process Control Block (PCB), a fancy way of talking about a C structure that contains information about each process.

todo homework

5. Process API

  • fork() - the fork system call is used to create a new process. the process that is created is an (almost) exact copy of the calling process
  • wait() - the wait system call allows for the parent process to wait for the child process to return
  • the exec() family - this system call is useful when you want to run a program that is different from the calling one. given the name of an executable, and some arguments, it loads code (and static data) from that executable and overwrites its current code segment (and current static data) with it; the heap and stack and other parts of the memory space of the program are re-initialized. Then the OS simply runs that program, passing in any arguments as the argv of that process. Thus, it does not create a new process; rather, it transforms the currently running program into a different running program. After the exec() in the child, it is almost as if the calling program never ran; a successful call to exec() never returns.

The shell is just a user program. It shows you a prompt and then waits for you to type something into it. You then type a command (i.e., the name of an executable program, plus any arguments) into it; in most cases, the shell then figures out where in the file system the executable resides, calls fork() to create a new child process to run the command, calls some variant of exec() to run the command, and then waits for the command to complete by calling wait(). When the child completes, the shell returns from wait() and prints out a prompt again, ready for your next command.

todo homework

6. Mechanism: Limited Direct Execution

7. Scheduling

8. Scheduling: MLFQ (Multi-Level Feedback Queue)

9. Scheduling: Propotional Share

10. Multiprocesser Scheduling (Advanced)

13. The Abstraction: Address Spaces

14. Interlude: Memory API

Library calls:

  • malloc - allocate memory regions on the heap (returns a pointer to that location)
  • free - free a memory region when no longer needed to be used
  • calloc - allocate memory regions on the heap and zero out the memory (returns a pointer to that location)
  • realloc - allocate a new larger region and copy the old region to it (returns the pointer to the new location)

System calls:

  • mmap - create an anonymous memory region
  • brk - change the location of the program's break (the end of the heap)

todo homework

15. Address Translation

Hardware-based address translation -> technique implemented by the OS to map virtual memory addresses to physical ones

A technique called Dynamic Relocation or base and bounds

For what a process knows, it has the memory all for itself, but when it starts running, the OS decides where its address space should be on the memory. So, its real physical memory address is:

physical address = virtual addresss + base