Hello, if you have any need, please feel free to consult us, this is my wechat: wx91due
PennOS
Operating Systems Design and
Implementation (Summer 2025)
PennOS
Overview
The following specification provides a road map for completing this project; however, as you develop your code, you may find it necessary to expand upon the specification. Feel free to add additional shell/user level functions to support debugging, as long as you mostly adhere to the provided interfaces in this document. Not only is this encouraged, but it is expected. If you are ever in doubt about a design decision, ask your friendly TAs, and be sure to document such changes in your README and companion document.
Specification
To be clear, there should be a difference between system calls and user level functions. In a real system there is usually a run-time mechanism (such as invoking some special assembly and changing permission levels), but in pennOS both user level and system level functions are just functions. You will still have to maintain a separation of sorts, but this will be discussed more in a later section of the document.
In the previous projects, you have interacted with the operating system at the user land level, and in this project, you will take a peek behind the curtain at kernel land. Well, not exactly, but you will simulate the basic functionality of an operating system by programming your own operating system simulator PennOS. Using the pthread library, you will implement a basic priority scheduler; additionally, you will implement a FAT file system for your operating system to mount, and a basic shell and programming API for a user to interact with your operating system.
Another critical part of an operating system is handling file reads and writes. Your operating system will mount a single file system, PennFAT: a simple file system implementation based on FAT. Your PennFAT implementation will be stored within a single file on the host file system, and will be mounted by PennOS in a loop back like manner. Additionally, unlike traditional file systems, PennFAT is only required to handle files within a single top level directory. You are required to allow the creation, modification, and removal of files under the top level directory.
The last part of your operating system is providing user land interaction via a simple shell. You will program this shell using the user land system calls providing by PennOS. Your shell will provide job control, stdin/stdout redirection, and a functional set of built-in commands for testing and exploring your operating system.
One last note: this is a long and complex project that will take you many, many hours to complete. This document can only provide you with a primer of the innumerable challenges you will face; read it carefully, but be sure to use other resources as well. In particular, read the manual pages, and ask questions of your friendly TAs. Likely, this will be the largest program you have yet written as a student. Devise a game plan, map out functions together, design how the various routines of PennOS will interact, and assign each other tasks thoughtfully so the pieces fit together at the end.
Remember, a lot of small, easy programs equal one large, complex program. Don’t try to conquer the entire project in one go: it is organized such that splitting up tasks into discrete components and writing modular code are strategies that will be rewarded.While splitting up components is useful and writing modular code can be rewarded, this is often interpreted incorrectly.
Students may specialize in the kernel while others specialize in the FAT while still contributing by writting certain functions in the various portions of PennOS.
Kernels and shells that are designed without thinking about the FAT (and vice-versa to a lesser extent) often need a lot of refactoring to get the two parts to cooperate.
To write modular code is to write helper functions and use them. Do not implement the PennFAT cat function as one 600+ line function. That function ends up being a mess and taking a lot of time to debug. Instead one could write helper functions that perform functionality that will probably need to be reused again in other functions.
For example, functions like k_write (generic function to write to a file in PennFAT) or functions like k_get_block(int n) which get the n’th block number of a file. These are examples that you do not have to follow, but the cat function (and others) could use these helpers to simplify their code.
Function Terminology
On the other hand, unless otherwise forbidden, you may use any userspace library functions you want (such as those found in string.h). Libraries that implement kernel/file system functionalities described in this write-up, e.g <sys/wait.h>, are forbidden.
Your operating system will provide a number of different functions and interfaces. The following prefixes indicate how the functions are to be used in order to mimic kernel- and user-level abstractions in this project:
|
Prefix |
Abstraction Level |
Can call… |
|
u_ |
User-level |
Can call other u_ or s_ functions in implementation |
|
s_ |
System call |
Can call other s_ or k_ functions in implementation |
|
k_ |
Kernel-level |
Can call other s_ or k_ functions in implementation |
The function declarations provided later in this specification are suggestions for you to build upon. You will find it useful to add additional kernel/user-level functions or system calls as you see fit, as long as they are in the spirit of the assignment.
Maintaining the Abstraction
You should NOT expose any kernel-level (k_) functions to the user (e.g. the shell). This can be done through careful inclusion of header files. For example: say you have two header/code file pairs, one named kernel.h and kernel.c that contain your k_ functions and one named sys_calls.h and sys_calls.c that contain your s_ functions (which call your k_ functions).
In order to maintain the abstraction, you would put the line #include "kernel.h" in the sys_call.c file instead of sys_call.h so that when you include sys_call.h in your shell, you have can call system calls, but not kernel-level functions.
You should also be careful to not expose any file system internal functions to the kernel and vice versa in order to avoid breaking the abstractions in between them. Your kernel may still need to call “interface” functions of your File System, but the kernel should not be directly manipulating the FAT or diving too into the details of the file system. Everything the Kernel may need from the file system should be achievable by just calling a corresponding file system function.
1. PennOS Kernel
To accomplish this task you will be using a provided wrapper around the pthread library called spthread. If you are issuing calls to fork(2), exec(2), or wait(2), you are doing something very wrong.
● a handle to the spthread
● PID, parent PID, child PID(s)
● open file descriptors
● priority level
● process state
If you would like, you can refer to Steven’s Advanced Programming in the UNIX Environment and Tanenbaum’s Modern Operating Systems for more information about data normally stored in a PCB.
1.1 Process-Related Required Functions
Reminder: you do not need to use these functions exactly as they are described. You are free to make modifications on the parameters they take, but a function doing this functionality should be somewhere in your code.
Note that these functions are a starting point. Again, feel free to add more functions as needed, but be sure to maintain the abstraction between user-level and kernel-level.
1.2 Signals in PennOS
● P_SIGSTOP: a thread receiving this signal should be stopped● P_SIGCONT: a thread receiving this signal should be continued● P_SIGTERM: a thread receiving this signal should be terminated
If you would like to add additional signals, be sure to document them and their functionality in your companion document file.
To be clear, you should not need to call the signal function, pthread_kill or other such functions to deliver these signals through the host operating system. Your code should instead handle these “signals” their own way, even if they are not used like how we have used signals in past assignments.
1.3 Process Statuses
PennOS will provide at least the following user level functions/macros that will return booleans based on the status returned from s_waitpid.
1.4 Init, Zombies and Waiting
To do this, we highly recommend having an INIT process similar to how unix does. The INIT process is the ancestor of all processes and is started on boot. That means it is the first process that your OS should start up, and then init will create the shell process. One main thing that the init process does, is that all orphan child processes will have INIT become their new parent and init will clean up those processes (through something like s_waitpid()).
Additionally, as noted above, other child process state changes can cause a s_waitpid() to return. In UNIX, a child process that transitions from running to stopped would issue a SIGCHLD signal. Your operating system should have similar functionality for parent process to learn of similar state changes. In your PennOS kernel-land implementation of s_waitpid(), you may find it useful to maintain other queues of “waitable” children, not just zombie children.
1.5 Priority Scheduler
Your implementation of PennOS should assume a single core CPU. Your kernel should be running only 1 thread at any given point in time. However, to create the illusion of concurrently running multiple processes, it will switch between running different threads rapidly.
1.6 Programming with spthreads
spthread is a wrapper that we have implemented around the pthread library. This wrapper allows us to have better control of scheduling execution order of the threads and switch between them.
We have provided a sample program that demonstrates how to switch between them in around-robin fashion. Familiarize yourself with this code and how spthread works. Since this is a wrapper around the pthread library with added helpers, you may find documentation and example code on pthread helpful as well.
● What happens when a signal is delivered?● How do you stop and start a specific thread?● How can a thread temporarily prevent being suspended by another thread?● How do you track the currently running process?● How can you pass different numbers and types of arguments to spawned processes?
You will schedule a SIGALRM signal to be delivered to your operating system every 100 milliseconds. We refer to this event as a clock tick, and on every tick, PennOS will run the
We will refer to the period between two clock ticks as a quantum (quanta for plural).
Your operating system will have three priority levels: 0, 1, and 2. As in standard UNIX, the lower a priority level, the more heavily the “process” should be scheduled Child threads created by k_proc_create should have a default priority level of 1, unless otherwise specified. For example, the shell should execute with priority level 0 because it is interactive.
Your priority queues will be relative. Threads scheduled with level 0 should run 1.5 times more often as threads scheduled with priority level 1, which run 1.5 times more often as threads scheduled with priority level 2. This means that threads scheduled with level 0 will run 2.25 times more often as threads scheduled with priority level 2.
Within each priority queue, the threads are selected in a round robin format, and each queue must be implemented as a linked list or vector. You may reuse your linked list or vector implementation from previous projects. The main thing is that you are not allowed to use fixed size arrays for your queues. You must also ensure that no thread is starved.
As an example, consider the scheduling of the following threads with these priority levels:
● shell priority level 0● sleep, priority level 1● cat, priority level 1
Given each quantum is 100 milliseconds, after 10 seconds, what is the proportion of quanta for each process?
To verify the correctness of your scheduler, you will implement a detailed logging facility that will generate a log entry for every clock tick. The specification of the log format is described in the section PennOS Logging.
When all processes in the scheduler queues are blocked, the scheduler needs to decide what to do. It should not keep busy waiting in a while loop until a process is ready to be scheduled as this would spike the CPU usage of PennOS up to 100%, when it should actually be 0% (since the OS isn’t actually doing anything).
To get past this, you should leverage sigsuspend(2), which will suspend the calling process until a signal is delivered to it, resulting in 0% CPU usage.
The scheduler in your kernel should idle only when all other processes are blocked and the scheduler has no real process to schedule. Correct usage of idling is as follows: The shell spawns a new process sleep 200. The child process for sleep will be blocked for 200 clockticks, and the shell will be blocked waiting for sleep to terminate. At this point, all the scheduler queues are empty and for each clock tick until sleep finishes, the scheduler can idle.
Your operating system will mount its own file system, PennFAT, based on FAT16. (For references, see Chapter 4.3.2 in Tanenbaum 4th edition, pages 285-286.) You are only required to implement the root directory of PennFAT.
A FAT file system has mainly two regions: the FAT region which hosts the File Allocation Table (FAT), and the data region which stores the files (including directory files). The block numbering of PennFAT’s data region begins at 1.
Conceptually, you can think of a file as a sequence of fixed-size blocks and a file system as a way to find the right blocks for a file in the right order. A FAT provides a simple way to do this by storing links between physical blocks of a file. Placed at the beginning of the file system before any block, the FAT has a predetermined size and can be easily mapped into memory. This is also its greatest disadvantage: being of fixed size, the FAT also limits the size of the file system.
The FAT of PennFAT consists of a predetermined number of entries. It is specified when the filesystem is first formatted using the mkfs command, which dictates how the FAT and your file system as a whole should be formatted. The least significant byte (LSB) of the first entry of the FAT specifies the size of each block with the mapping displayed below:
|
LSB |
Size (bytes) |
|
0 |
256 |
|
1 |
512 |
|
2 |
1024 |
|
3 |
2048 |
|
4 |
4096 |
FAT size = block size * number of blocks in FATand with each entry of the FAT consisting of 2 bytes, the number of entries in your FAT will be:
|
Index |
Link |
|
0
|
0x2004 <— MSB=0x20 (32 blocks in FAT), LSB=0x04 (4K-byte block size)
|
|
1
|
0xFFFF <— Block 1 is the only block in the root directory file
|
|
2
|
5 <— File A starts with Block 2 followed by Block 5
|
|
3
|
4 <— File B starts with Block 3 followed by Block 4
|
|
4
|
0xFFFF <— last block of File B
|
|
5
|
6 <— File A continues to Block 6
|
|
6
|
0xFFFF <— last block of File A
|
|
…
|
… |
Like other FAT16 variants, 0 denotes a free block, and 0xFFFF the last block of a file. (Thus, themaximum possible block number for PennFAT is 2^16 − 2)
With each FAT entry representing a file block, there will be a total of:
● 0: end of directory● 1: deleted entry; the file is also deleted● 2: deleted entry; the file is still being used● Note Above are literally values 0, 1, and 2. NOT characters ‘0’, ‘1’, and ‘2’!
● 0: unknown● 1: a regular file● 2: a directory file● 4: a symbolic link● Note Read more about symbolic links here. You will need this in extra credit.
● 0: none● 2: write only● 4: read only● 5: read and executable (shell scripts)● 6: read and write● 7: read, write, and executable● time_t mtime: creation/modification time as returned by time(2) in Linux
where fat_size is the size of the FAT and block_size is the size of a block
2.3 Standalone PennFAT
Your file system itself will exist as a single binary file on the host OS. A simple shell starter code (prompt, read input, execute command) will be provided for you to fully implement in order to build your standalone pennFAT, with its appropriate function calls. This will allow for interaction with your file system without an operating system implementation. Note this is only the pseudocode, you should be filling this in with your code and also be implementing the PennFAT routines. The information about routines can be found in section 2.3.6.
You will provide a single standalone program, separate from the final PennOS executable, called pennfat, which doesn’t have any arguments.
Just like the previous shells you’ve made so far, your Standalone PennFAT should ignore SIGINT and SIGTSTP, and only exit on Ctrl+D.
Classically, an operating system will store a reference to a file descriptor (an integer) for each open file, as well as a structure describing each file the Operating System is currently using. This structure will save information such as the mode of file and offset pointers that indicate where subsequent reads or writes should take place. Each user-level process will The user side will reference a file by its file descriptor and manipulate the file via the user level interface described below.
Note that the file system-related system calls are abstraction layers and should not care about the format of the underlying file system. That is, consider what must occur when an operating system has mounted multiple file systems of different types. When there is a call to read(2) for a particular file descriptor, the operating system will determine on which file system variant the file lives (e.g., FAT32, ext3, etc.) and then call the appropriate file system dependent function to perform the read operation (which usually is provided by module).
You must also reserve the standard file descriptors that relate to STDIN/STDOUT/STDERR as to not complicate later integrations. These will receive descriptors 0/1/2.
● k_read(int fd, int n, char *buf)
read n bytes from the file referenced by fd. On return, k_read returns the number of bytes read, 0 if EOF is reached, or a negative number on error. Note This function requires some thought, and will need to get the first block number of the file, navigate to it using lseek(2), and read the appropriate number of bytes, possible traversing multiple blocks.
Note Note that it is possible that a new block will be needed, in which case a fresh block allocation will have to occur and an update to the FAT will be necessary.
● k_unlink(const char *fname)
remove the file. Be careful how you implement this, like Linux, you should not be able to delete a file that is in use by another process. Furthermore, consider where updates will be necessary. You do not necessarily need to clear the previous data in the data region, but should at least note this area as ‘nullified’ or fresh and ready to write to, elsewhere.
● k_lseek(int fd, int offset, int whence)
reposition the file pointer for fd to the offset relative to whence. You must also implement the constants F_SEEK_SET, F_SEEK_CUR, and F_SEEK_END, which reference similar file whences as their similarly named counterparts in lseek(2). Note that this could require updates to the metadata of the file, for example, if the new position of n exceeds the files previous filesize!
List the file filename in the current directory. If filename is NULL, list all files in the current directory. This should act as very similar to posix, displaying the first block of the file, itspermissions (you can leave this for the time being, chmod will be required to be implemented later), size, latest modification timestamp and filename.
The first three routines are special routines to make your filesystem binary, mount the binary tomemory, and unmount it from memory. These functions can be implemented with UNIX’s standard system calls. These functions should print error messages and reprompt if anyunderlying C system calls fail.
The following set of routines can ONLY be executed if a filesystem is currently mounted inmemory. Also, you may only use your implemented kernel-level system calls (k_functions) toimplement these unless specified below.
If OUTPUT_FILE does not exist, it will be created. (Same for OUTPUT_FILE in thecommands below.)
● chmod Similar to chmod(1) in the VM.
● ls
Note that you should be using k_functions to implement each command’s functionality, unless you are interacting with the host OS. This means that if you are using C system calls such as open(2), read(2), write(2) in your implementation of cat, you are doing something wrong. You should be using k_open, k_read, and k_write instead. The functionality you implement for k_functions should make the implementations of command simple.
For example, the cp SOURCE -h DEST command, when specified with the -h flag, you willhave to read the contents of a file from the PennFAT filesystem, and write to a file in the host OS.
You must use k_read to read from the PennFAT filesystem, but may use write(2) to write to afile in the host OS.
The purpose of the pennfat program is to test your file system’s functionality, independent ofyour PennOS. The functionality of both kernel-level functions and PennFAT commands will benecessary for Milestone 1.
PennOS system. As a starting point, FS_NOT_MOUNTED and FILE_NOT_FOUND may be useful.You should set these errors and return values such that errors can later be propagated up the call stack.
Note there are multiple ways you can go about implementing this. But the desired functionality is that every time a process opens a file, that should be recorded in the system-wide global FD table and also somehow be taken care of by that process itself. Also, if the file corresponding to aparticular file descriptor is already open with F_WRITE or F_APPEND permissions, no otherprocess should be able to open the same file with F_WRITE or F_APPEND permissions.
Finally, even if multiple processes have the same file open under F_READ mode through callingthe open function multiple times (which is allowed), they will have unique offset values. Thismeans multiple files can read the same file concurrently and independently of each other.
This open will initially be done at the kernel level, using a more intricate andalready-implemented kernel level function. If the kernel level function succeeds andreturns a fd, the user level function should also somehow keep track that such filedescriptor is managed by the calling process. This can be done in multiple ways.
write n bytes of the string referenced by str to the file fd and increment the file pointer by n. On return, s_write returns the number of bytes written, or a negative value on error.
Note that this writes bytes not chars, these can be anything, even '\0' A kernel level write should occur to perform the actual functionality, but its important to remember that the process’ file descriptor may need to be updated as the position of the file pointer changes.
List the file filename in the current directory. If filename is NULL, list all files in the current directory. Before EC implementations, this should be very simple and could literally be a call a similar k_function.
Feel free to implement any other system calls you need. The purpose of system calls should be to abstract out the kernel layer to the user level shell commands. Similar to what you did in the standalone pennFAT, user level shell commands you will implement in the next section, such as cat and mv should ONLY be implemented with these system calls you implement here. Also, always do your best to modularize your code. s_open and s_read should call both kernel-level subroutines and other helper functions, and not be a big block of code.
Once PennOS is booted (i.e. executed from the host OS), it will eventually create a process to run a shell (likely indirectly through INIT creating the process). You will program this shell using the PennOS user interfaces described above. Although there is not strong separation between user and kernel land, you may only use the user level functions to program your shell and programs that run in it. That is, you may only use functions prefixed with a u_ or s_.
The shell is like any other spthread process running in PennOS and should be scheduled as such, except that it will always have a priority level of 0. Unlike traditional shells, it is not capable of running arbitrary programs; instead you will provide built-in programs to execute within a spthread. Below are the following features your shell should provide:● Synchronous Child Waiting: PennOS does not provide a means to perform asynchronous signal handling; instead, you will use a synchronous signal handler. Before prompting for the next command, your shell will attempt to wait on all children using s_waitpid.