Hello, if you have any need, please feel free to consult us, this is my wechat: wx91due
INF2C-CS Coursework 1
Important Dates
Overview
This is the first of two assignments for the INF2C Computer Systems course. It is worth 50% of the coursework mark for INF2C-CS and 20% of the overall course mark.
Learning Outcomes
This coursework is closely aligned with the learning outcomes of the INF2C-CS course as specified in its course descriptor (http://www.drps.ed.ac.uk/24-25/dpt/cxinfr08027.htm).
On completion of this course, the student will be able to:
- Describe the trade-offs in different binary representation systems.
- Explain the principles of: instruction set architecture, digital logic design, cache hierarchy, virtual memory, I/O devices, exceptions and processor management.
- Demonstrate an understanding of how a high-level programming language (C) maps to the assembly code by converting a simple C program to MIPS assembly.
- Sketch the design of a simple single- and multi-cycle processor and explain how it operates by combining the knowledge of the logic design basics with that of the MIPS instruction set architecture.
Task Descriptions
Task 1: String Comparison
If it doesn’t find any differing characters but one string ends before the other, that string is considered the lesser of the two. We’ll see how this works with an example.
strcmp compares the corresponding characters from both strings based on their ASCII values, which are numeric codes that represent each character.
Therefore, if the ASCII value of the first differing character in the first string is less than the ASCII value of the corresponding character in the second string, strcmp returns a negative number. This indicates that the first string comes before the second string in the dictionary.
If the ASCII value of the first differing character in the first string is greater than the ASCII value of the corresponding character in the second string, strcmp returns a positive number. This indicates that the first string comes after the second string in the dictionary.
The prototype of the strcmp function looks like:
int strcmp (const char * s1, const char * s2);
#include <stdio.h>#include <string.h>void compare(const char* a, const char* b) {
printf("strcmp(\"%s\", \"%s\") = %d\n", a, b, strcmp(a, b));
}int main() {
compare("abc", "abc");compare("abc", "ab");compare("ab", "abc");compare("abc", "bcd");compare("bc", "abc");return 0;
}
strcmp("abc", "abc") = 0strcmp("abc", "ab") = 1 strcmp("ab", "abc") = -1strcmp("abc", "bcd") = -1strcmp("bc", "abc") = 1
Character strings in C are represented as null-terminated byte strings (NTBS). A NTBS is a sequence of nonzero bytes followed by a byte with value zero (the terminating null character). Each byte in a byte string encodes one character of some character set. For example, the character array {'\x63','\x61','\x74','\0'} is an NTBS holding the string "cat" in ASCII encoding.
Let’s craft our own version of the strcmp function without using any built-in libraries to get a better grip on its inner workings. A possible unoptimised implementation of the my_strcmp function in C could look like this:
int result = 0;/*Search for the first differing character until we reach the end of oneof the strings. We don't need to check *s2 != '\0` since the first conditionfor the loop to continue is that *s1 equals *s2. If *s1 equals *s2 and *s1isn't '\0`, that means *s2 isn't '\0` either*/while (*s1 == *s2 && *s1 != '\0') {
s1++;s2++;
}/* if both strings have ended, they are equal */if (*s1 == '\0' && *s2 == '\0') {return 0;}/* here, we compare the first differing character */if (*s1 > *s2) {return 1;} else {return -1;}
To this end, a C version of the desired MIPS program is provided in the file comparator.c.
After compiling the comparator.c program, you can invoke the program like this
Try this out with strings different to “this” and “that”!
For convenience you can also use the provided Makefile, i.e. a build script using the Make build system, for building and running your code.❯ make run
HOW TO GET STARTED
For convenience, the C program includes definitions of functions such as readString and printChar, which mimic the behaviour of MARS system calls with the same names. Derive your MIPS program from this C program.
You are given a MIPS file named comparator.s which contains a skeleton of your program. This skeleton takes two command line arguments (like the comparator.c program), invokes the my_strcmp function from the compare function and prints out the numerical result of the string comparison to the screen.
YOUR TASK
You can the code from the command line using the Make build system like this:
As part of your code development and debugging process you might also want to use the MARS IDE, possibly single stepping over your code and observe its behaviour.
For this task you will have to complete MIPS assembly functions marked with “TODO”! Make sure that the output format of your MIPS string comparison program follows the example shown above, and is identical to what the provided C program shows.
Task 2: Tokenizer
Let’s have a look at what the tokenizer will do. Go to the C directory in the task2 folder, where you find an implementation written in C. We can build and run our tokenizer like this:
After compiling the tokenizer.c file and a couple of auxiliary libraries, libio.c and libstring.c, which contain helper functions for I/O and string handling, the tokenizer application is launched with a single long argument string. The application prints the input string, followed by each token (word). Note that separating spaces (‘ ‘) and other punctuation have been eliminated!
For this coursework the symbols acting as separators for words are: Space and ()[].,;:”# , any digits, and also the special string null terminator \000.
YOUR TASK
When launched with make you should see the same output for your complete tokenizer as for the provided C implementation:
For this task you should reuse your string comparison function from task 1. Copy your code of the function body my_strcmp into the strcmp function in the string handling library libstring.s.
For this task you will have to complete MIPS assembly functions marked with “TODO”! As before, make sure that your MIPS program follows exactly the output format of the provided C program.
Task 3: Spell Checker
Our INF2C-CS spell checker reads in a dictionary from a file (words.txt), builds an internal data structure to store the dictionary, before it reads in a text to be spell checked from another file (e.g. exampletext.txt), which is also stored word-by-word in an internal data structure. It then checks each word of the input against the dictionary, providing output indicating whether the word is contained in the dictionary or not.
Given the provided dictionary and the same input text as in task 2, we would run the spell checker like this and expect the following output:
“Ok” here means that the spell checker has found the word in the dictionary, whereas “not ok” means the word might have been misspelled as it doesn’t exist in the dictionary. Please note that the provided dictionary is very small and does not contain many words even if they are commonly used in many English texts. For example, the word “grammar” as the first word of the example input is marked as “not ok” as it’s not contained in the dictionary.
For this task you first need to write the spellchecker in C. There is a framework provided in the C directory of task 3. You will need to implement both data structures and functions for loading the dictionary, the input text, and the spellchecker itself. Auxiliary string handling functions e.g. from tasks 1 and 2 can and should be reused! Functionality is distributed over a number of files, which contain functions concerned with a single aspect of the project (loading of dictionary, loading of input, spell checking).
Once you have completed the C spellchecker, you will need to translate it statement by statement into MIPS assembly. There is a framework for you provided for task 3, which you should be using. The framework provides file and function structures for your C and MIPS projects. This means that your MIPS code is not contained in a single file, but will be placed in different files in line with the C code.
Make sure your spellchecker works on the provided test cases, and produces output in the format shown above!
- You might want to use the provided libraries (libstring, libio). You can extend the libraries with additional functions if you need them,
- You might want to include some amount of error checking and handling, e.g. catching non existing input files. You can use the error function in error.c for error reporting, which canbe suitable extended.
- Test your code with several inputs, including corner cases. The repository contains a number of test cases, which you should use for testing.
OUTPUT FORMAT
As we will use an automated checking framework it is important that you use a standardised output format for your spellchecker. Each line should contain a single word of the input string to be checked, followed by a “ - “ separator and then either “ok” or “not ok” as shown in the example output above.
The template for the output is: WORD - [ok|not ok]
Assumptions and Restrictions on Program Inputs
- A dictionary file will contain a maximum of 10,000 words.
- A single word in the dictionary file will contain a maximum of 20 alphabetic characters.
- An input file will be a maximum of 2048 characters including spaces and without any newlinecharacter.
- An input file may contain alphabetic (a-z, A-Z), punctuation marks (,.!?) and space (’ ’) characters.
Restrictions on the use of C Library Functions
The only C library functions that can be used are basic I/O functions like scanf and printf, and basic file handling, which are called within the functions printChar, printInt, printString and readString etc. provided with the C files. Don’t use C library functions directly in your code - there is not MIPS equivalent.
Instead, the provided libraries (libstring, libio, libmem) contain useful functions, which you can use. In fact, for tasks 2 and 3 you can reuse code from the previous tasks and add e.g. your string comparison function to the string handling library.
How to get support
The INF2C-CS Piazza Forum is a great place to find support. If you have any questions about the assignment, please start by checking existing discussions on Piazza. If you can’t find the answer to your question, start a new discussion – chances are, others have already encountered (and, possibly, solved) the same problem. The TA and the course instructor will also monitor Piazza and contribute to the discussions.
Please be reminded that academic misconduct regulations also apply to Piazza, so you should not post coursework solutions or code snippets publicly. If in doubt, make your question private,and an instructor will check if the post is acceptable.
Academic Integrity and the Use of ChatGPT
https://web.inf.ed.ac.uk/infweb/admin/policies/academic-misconduct
This also has links to the relevant University pages.
As mentioned, your code will be subject to further automatic and manual review after the deadline. Submitted code will be checked for similarity with other submissions using the MOSS system. MOSS has been effective at finding similarities in the past and it is not fooled by name changes or reordering of code blocks. Courseworks are individual, and we expect everyone to submit their sole, independent work.
Extensions or Extra Time Adjustments (ETA) are permitted up to a maximum of six days, but cannot be combined. If assessed coursework is submitted late without an approved extension or ETA, a penalty of 5% per calendar day will be applied, up to a maximum of six days after the original deadline, after which a mark of zero will be given.
While you must work on this coursework on your own, you can discuss general ideas with other students without sharing actual code or solutions. You are not allowed to use ChatGPT or other Generative AI tools for this coursework (even if this might be tempting and chances are that ChatGPT would be able to create a solution). Instead, if you need help with the coursework, please make use of the support provided to you! It is important for your learning that you develop the knowledge, skills and understanding associated with the learning outcomes of this coursework!
Marking Criteria
In both C and MIPS programming, commenting a program and keeping it tidy is very important. Make sure that you comment the code throughout and format it neatly.
For tasks 1 and 2, you will be graded only for your MIPS programs. For task 3, you will be graded for your C and MIPS programs.
- Task 1 contributes 15% of this coursework mark
- Task 2 contributes 25% of this coursework mark
- Task 3 contributes 35% of this coursework mark
- Bonus tasks 1 and 2 contribute together 25% of the coursework mark
This means with tasks 1 and 2 together (assuming you get full marks for both of them), you can get a “pass mark” of 40% for this coursework. By completing tasks 1-3 (again, assuming you get full marks for all of these), you can achieve an overall mark of up to 75% (“A”).
Bonus Marks
There are two additional bonus tasks that you can approach for bonus marks. These extra tasks are hard and/or require significant development effort. Attempt these only if you have plenty of spare time! You don’t need to do any of these to achieve an “A” - this is for bonus marks only!
Have a look here for inspiration: https://ladal.edu.au/lexsim.htmlWhen attempting this bonus task, provide your implementations in C/MIPS in the bonus1 folder, and make sure that your spellchecker provides output in this format:
WORD - not ok - ANOTHER WORD
2. Can you asymptotically improve the runtime performance of your spellchecker? The MARS IDE has an integrated dynamic instruction counter, which you can use to evaluate the number of instructions needed to execute a program:
You can also access the instruction counter from the command line by adding the MARS ic command line option.
When attempting this bonus task, provide your implementations in C/MIPS in the bonus2 folder, and provide a README file containing information describing your solution and the instruction counts of your baseline and improved code versions.