CSCI 2122 Assignment 4


Hello, if you have any need, please feel free to consult us, this is my wechat: wx91due


CSCI 2122 Assignment 4

Due date: 11:59pm, Sunday, March 30, 2025, submitted via git

Objectives

The purpose of this assignment is to practice your coding in C, and to reinforce the concepts discussed in class on program representation.

In this assignment you will implement a CPU simulator for a simple instruction set (much simpler than x86). This assignment is divided into several parts to make it simpler. In the first part, you will implement the memory part of the simulator and loader, which loads the code from a file into the simulated memory.

In the second part, you will implement the fetch-decode-execute part of the CPU. 

Preparation:

1. Complete Assignment 0 or ensure that the tools you would need to complete it are installed.
2. Clone your assignment repository: https://git.cs.dal.ca/courses/2025-Winter/csci-2122/assignment-4/????.git where ???? is your CSID. Please see instructions in Assignment 0 and the tutorials on Brightspace if you are not sure how.

Inside the repository there are two directories: xsim1 and xsim2, where code is to be written. You should set up a separate CLion project for each of these directories, like the labs. Inside each directory is a tests directory that contains tests that will be executed each time you submit your code. Please do not modify the tests directory or the .gitlab-ci.yml file that is found in the root directory. Mod ifying these files may break the tests. These files will be replaced with originals when the assignments are graded. You are provided with sample Makefile files that can be used to build your program. If you are using CLion, a Makefile will be generated from the CMakeLists.txt file generated by CLion.

Background:

For this assignment you will implement a simplified RISC-based 16-bit CPU simulator. Specifically, the CPU has 16 general purpose registers and a small number (approximately 35) instructions, most of which are two bytes (one word) in size. As well, there are a couple special purpose registers, such as the program counter (PC) and the status (F) register.

The X Architecture comprises a 64KB (65536 bytes) memory space with addresses ranging from 0 to 65535, and a CPU with 16 general purpose 16-bit registers (r0 . . . r15 ). The CPU is a 16-bit CPU, meaning that the word size is 16-bits (2 bytes). Thus, nearly all operations operate on 16-bit chunks of data, i.e., in this case one word is 16 bits. Thus, all values and addresses are 16 bits in size. All 16-bit values are also encoded in big-endian format, meaning that the most-significant byte comes first.

Apart from the 16 general purpose registers, the CPU has two special 16-bit registers: a program counter (PC), which stores the address of the next instruction that will be executed, and the status (F), which stores bit-flags representing the state of the CPU. The status register contains two flags. The least significant bit is the condition flag, which represents the truth value of the last logical test operation. The bit is set to true if the condition was true, and to false otherwise. The second least significant bit indicates whether the debug mode is on or off. If the debug mode is on, then after each instruction is executed, the CPU simulator prints out the state of all its registers (using the provided function).

Additionally, the CPU uses the last general-purpose register, r15, to store the pointer to the program stack. This register is incremented by two when an item is popped off the stack and decremented by two when an item is pushed on the stack.

The size of the program stack is specified by the code being executed. The program stack is used to store temporary values, arguments to a function, and the return address of a function call.

The CPU takes one clock tick to execute a Fetch, Decode, Execute cycle for any instruction:

1. The instruction is fetched from memory using the PC to index into the memory space.
2. The PC is incremented by two.
3. The instruction is decoded.
4. The instruction is executed.
5. Lastly, if the debug mode is turned on, the CPU state is displayed using the xcpu_print() function found in xcpuprnt.c.
The CPU has the following instruction set.

The X Instruction Set

The instruction set comprises approximately 35 instructions that perform arithmetic and logic, data movement, stack manipulation, and flow control. Most instructions take registers as their operands stack, and store the result of the operation in a register. However, some instructions also take immediate values as operands. Thus, there are four classes of instructions: 0-operand instructions, 1-operand instructions, 2- operand instructions, and extended instructions, which take two words (4 bytes) instead of one word.

All but the extended instructions are encoded as a single word (16 bits). The extended instructions are also one word but are followed by an additional one-word operand. Thus, if the instruction is an extended instruction, the PC needs an additional increment 2 during the instruction’s execution. As mentioned previously, most instructions are encoded as a single word. The most significant two bits of the word indicates whether the instruction is a 0-operand instruction (00), a 1-operand instruction (01), a 2-operand instruction (10), or an extended instruction (11). For a 0-operand instruction, the encoding is 

where the two most significant bits are 00 and the next six bits represent the instruction identifier. The second byte of the instruction is 0.

For a 1-operand instruction, the encoding is


where the two most significant bits are 01, the next bit indicates whether the operand is an immediate or a register, and the next five bits represent the instruction identifier. If the third most significant bit is 0, then the four most significant bits of the second byte encode the register that is to be operated on (0… 15). Otherwise, if the third most significant bit is 1, then the second byte encodes the immediate value.

For a 2-operand instruction, the encoding is


where the two most significant bits are 10, and the next six bits represent the instruction identifier. The second byte encodes the two register operands in two four-bit chunks. Each of the 4-bit chunks identifies one of the 16 registers (0 … 15).

For an extended instruction the encoding is


where the two most significant bits are 11, the next bit indicates whether a second register operand is used, and the next five bits represent the instruction identifier. If the third most significant bit is 0, then the instruction only uses the one-word immediate operand that follows the instruction. Otherwise, if the third most significant bit is 1, then the four most significant bits of the second byte encode (1 … 15) the register that is the second operand of the instruction.

The instruction set is described in Tables 1, 2, 3, and 4. Each description includes the mnemonic (and syntax), the encoding of the instruction, the instruction’s description and function. For example, the add, loadi, and push instructions have the following descriptions:

Mnemonic
Encoding
Description
Function
add rS, rD
10000001 S D
Add register rS to register rD.
rD ← rD + rS
loadi V, rD
11100001 D 0
Load immediate value or address V into register rD.
rD ← memory[PC]
PC ← PC + 2
push rS
01000011 S 0
Push register rS onto program stack.
r15 ← r15 - 2
memory[r15 ] ← rS

First, observe that the add instruction takes two register operands and adds the first register to the second. All 2-operand instructions operate only on registers and the second register is both a source and destination, while the first is the source. It is 2-operand instruction, hence the first two bits are 10, its instruction identifier is 000001 hence the first byte of the instruction is 0x81.

Second, the loadi instruction is an extended instruction that takes a 16-bit immediate and stores it in a register. Hence, the the first two bits are 11, the register bit is set to 1, and the instruction identifier is 00001. Hence, the first byte is encoded as 0xE1.

Third, the push instruction is a 1-operand instruction, taking a single register operand. Hence, the first two bits are 01, the immediate bit is 0, and the instruction identifier is 00011. Hence, the first byte is encoded as 0x43.

Note that S and D are 4-bit vectors representing S and D.

Table 1: 0-Operand Instructions

Mnemonic
Encoding
Description
Function
ret
00000001 0
Return from a procedure call.
P C ← memory[r15 ]
r15 ← r15 + 2
cld
00000010 0
Clear the debug bit.
F ← F &0xFFFD
std
00000011 S 0
Set the debug bit.
F ← F | 0x0002

Table 2: 2-Operand Instructions

Mnemonic
Encoding
Description
Function
add rS , rD
10000001 S D
Add register rS to register rD .
rD ← rD + rS
sub rS , rD
10000010 S D
Subtract register rS from regis- ter rD.
rD ← rD - rS
mul rS , rD
10000011 S D
Multiply register rD by register rS.
rD ← rD * rS
div rS , rD
10000100 S D
Divide register rD by register rS .
rD ← rD / rS
and rS , rD
10000101 S D
And register rS with register rD .
rD ← rD & rS
or rS , rD
10000110 S D
Or register rS with register rD .
rD ← rD | rS
xor rS , rD
10000111 S D
Xor register rS with register rD .
rD ← rD ^ rS
shr rS , rD
10001000 S D
Shift right register rD by register rS.
rD ← rD >> rS
shl rS , rD
10001001 S D
Shift left register rD by register rS.
rD ← rD << rS
test rS1, rS2
10001010 S D
Set condition flag to true if and only if rS1 ∧ rS2 is not 0.
if rS 1 & rS2 != 0: F ← F | 0x0001
else:
F ← F & 0xFFFE
cmp rS 1 , rS 2
10001011 S D
Set condition flag to true if and only If rS 1 < rS 2 .
if rS 1 < rS 2 :
F ← F | 0x0001
else:
F ← F & 0xFFFE
equ rS1, rS2
10001100 S D
Set condition flag to true if and only if rS1 == rS 2 .
if rS 1 == rS 2 :
F ← F | 0x0001
else:
F ← F & 0xFFFE
mov rS , rD
10001101 S D
Copy register rS to register rD .
rD ← rS
load rS , rD
10001110 S D
Load word into register rD from memory pointed to by register rS.
rD ← memory[rS]
stor rS , rD
10001111 S D
Store word from register rS to memory at address in register rD.
memory[rD] ← rS
loadb rS , rD
10010000 S D
Load byte into register rD from memory pointed to by register rS.
rD ← (byte)memory[rS]
storb rS , rD
10010001 S D
Store byte from register rS to memory at address in register rD.
(byte)memory[rD] ← rS

Table 1: Extended Instructions

Mnemonic
Encoding
Description
Function
jmp L
11000001 0
Absolute jump to label L.
PC ← memory[PC]
call L
11000010 0
Absolute call to label L..
r15 ← r15 – 2
memory[r15] ← PC + 2
PC ← memory[PC]
loadi V, rD
11100001 D 0
Load immediate value or address V into register rD.
rD ← memory[PC]
PC ← PC + 2

Note that in the case of extended instructions, the label L or value V are encoded as a single word (16-bit value) following the word containing the instruction. The 0 in the encodings above represents a 4-bit 0 vector.

Table 2: 1-Operand Instructions

Mnemonic
Encoding
Description
Function
neg rD
01000001 D 0 
Negate register rD .
rD ← −rD
noy rD
01000010 D 0
 Logically negate register rD .
rD ←!rD
inc rD
01001000 D 0
Increment rD .
rD ← rD + 1
dec rD
01001001 D 0
Decrement rD .
rD ← rD – 1
push rS
01000011 S 0
Push register rS onto the pro- gram stack.
r15 ← r15 – 2
memory[r15] ← rS
pop rD
01000100 D 0
Pop value from stack into regis- ter rD.
rD ← memory[r15 ]
r15 ← r15 + 2
jmpr rS
01000101 S 0
Jump to address in register rS .
PC ← rS
callR rS
01000110 S 0
Call to address in register rS.
r15 ← r15 – 2
memory[r15] ← PC
PC ← rS
out rS
01000111 S 0
Output character in rS to std- out.
output ← rS
br L
01100001 L
Branch relative to label L if condition bit is true.
if F & 0x0001 == 0x001:
PC ← PC + L – 2
jr L
01100010 L
Jump relative to label L.
PC ← PC + L – 2

An Assembler for this Instruction Set

An assembler is provided for you to use (if needed). The assembler manual is at the end of the assignment.

The X Memory Specification

The X Memory module (xmem.c) manages the memory (RAM) that is read from and written to by the X CPU, as it executes each instruction. Hence, the module has a very simple function. (i) It allocates memory of a specified size for the simulation’s memory space when the module is initialized, and stores a pointer to the memory. (ii) It provides a “store” operation that takes a pointer to a word of data and an index (address) in the allocated memory space, and stores the word of at the specified address. (iii) It provides a “load” operation that takes an index (address) in the allocated memory space and a pointer to a desti nation, and loads a word of data from the specified address in the memory space into the destination. I.e., this module simulates the memory system of a computer. Note: a word is 16-bits (2 bytes).

The X Simulator Specification

An X-CPU simulation is conducted in the following manner. The simulator initializes the memory module, which allocates a 64KB memory space. The simulator loads the specified program from a file and stores it in memory using the memory module’s “store” operation. Once a program is loaded into memory, the CPU context is initialized: All registers should be zeroed, including the PC because this is the point in memory where the CPU begins execution. Once the CPU context is initialized the program is executed.

An execution of the program takes place by repeatedly calling the CPU execution function, which fetches, decodes, and executes the next instruction pointed to by the PC. The execution proceeds for a specified number of ticks or until the program halts the CPU. A program halts the CPU when it executes an illegal instruction, such as the opcode 0. When this happens, the simulator exits.

Your task will be to implement the X Simulator and the X-CPU, which is used to execute programs as sembled with the provided assembler.

Task 1: Implement the main() for the Simulator

Your first task is to implement the main() function of the X Simulator in the xsim1 directory of your repo. The source file main.c should contain the main() function. The simulator should do the following:


  1. Take two (2) arguments on the command line: The first parameter is a nonnegative integer denoting the maximum number of cycles that a program should run for, and the remaining argument is the object/executable file of the program to run. For example, the invocation ./xsim 1000 hello.xo instructs the simulator to load the program hello.xo and run the program for at most 1000 cycles. If 0 cycles is specified this is interpreted as an infinite number of cycles. I.e., the program should never run out of cycles.
  2. Allocate or instantiate an xcpu structure (defined in xcpu.h).
  3. Initialize the X Memory module by calling xmem_init(65536) (defined in xmem.c), thus al locating a 64KB memory space.
  4. Load the program into memory from the file specified on the command-line. The program should be loaded starting at the start of the memory space (address 0) using the xmem_store() function, declared in xmem.h. The program will need to open, read, and close the file.
  5. Initialize the xcpu structure.
  6. Execute the loaded program for the specified number of cycles or until the program halts. Your code should use a loop and call the xcpu_execute() function (declared in xcpu.h) that you will implement in the next part of this assignment. For Task 1, an implementation in xcpu.o is provided.
  7. Terminate the simulator once the program halts or runs out of cycles.


Note: the sample solution is less than 80 lines of code. (But there is no limit on the length.) J

Input

The input to the program is via the command line. The program takes two arguments
• cycles: the maximum number of instructions to execute. If the value is 0, the execution is unlimited.
• program_file: the program file containing the assembled code to be executed.

Processing

All input shall be correct. All program files shall at most 65536 bytes (64KB). The simulator will need to open and read the program file and store it in the memory module. There is no need to interpret the program code in this part of the assignment.

The last part of the simulation involves calling the xcpu_execute() function, in a loop. Each call exe cutes one instruction. This function returns 1 on success and 0 on failure. The loop should continue until the specified number of instructions have been executed, or until the function returns 0. This function is implemented in xcpu.o and will be the focus of the next part of the assignment.

Recommendation: While no error checking is required, it may be helpful to still do error checking, e.g., make sure files are properly opened because it will help with debugging as well.

Output

There is no required output for this part of the assignment. The provided code will do the output.

Task 2: Implement xcpu_execute()

Your second task is to implement the X-CPU in the file xcpu.c. in the xsim2 directory of your repo.

First, copy the main.c from xsim1 to xsim2. You are now to implement the function: extern int xcpu_execute( xcpu *c );

which is declared in xcpu.h and simulates one cycle of a CPU. The function takes a pointer to the xcpu struct:

typedef struct xcpu_context {

unsigned short regs[X_MAX_REGS]; /* general register file */
unsigned short pc; /* program counter */
unsigned short state; /* state (F) register */
} xcpu;

which stores the context of a program. The structure is defined in xcpu.h and contains CPU registers.

The function should

  1. Fetch the current instruction pointed to by the PC from the xmem memory (use xmem_load()),
  2. Increment the PC, decode the instruction, and
  3. Execute the instruction, modifying the PC, other registers, or memory, depending on the instruction.
  4. If the debugging bit is turned on in the F (state) register (see std and cld instructions in Table 1) call the xcpu_print() function (declared in xcpu.h) to display the CPU registers.
  5. Return 1 if the instruction was successfully executed, and 0 if an illegal instruction was executed.

Note: the sample solution is less than 200 lines of code. The opcodes for all the instructions can be found in xis.h, so you should not need to do a lot of work to get this working.

Input

The input is the same as Task 1.

Processing

Use the information in Tables 1 – 4 to implement the function xcpu_execute(). No error checking is needed. Use the xmem_load() and xmem_store() functions (declared in xmem.h) to read and write to the X memory.

For the out instruction (see Table 4), output the lower byte of the operand using the standard library function putchar().

Output

Almost all output is done using the xcpu_print() function, which is provided. The only other output occurs when the out instruction is executed. In this case, output the single character stored in the lower byte of the operand, treating it as an ASCII value. (Use putchar().)

Hints and Suggestions

  • Use the unsigned short type for all registers and indices.
  • You should only need to modify two files: main.c (Task 1) and xcpu.c (Task 2).
  • Start early, the solution totals less than 250 lines of (additional) code, but there is a lot to digest in the assignment specifications.
  • The library functions I found most useful are: sscanf(), printf(), open(), and read(), close().

Assignment Submission

Submission and testing are done using Git, Gitlab, and Gitlab CI/CD. You can submit as many times as you wish, up to the deadline. Every time a submission occurs, functional tests are executed, and you can view the results of the tests. To submit use the same procedure as Assignment 0.

Grading

If your program does not compile, it is considered non-functional and of extremely poor quality, meaning you will receive 0 for the solution.

The assignment will be graded based on three criteria:

Functionality: “Does it work according to specifications?”. This is determined in an automated fashion by running your program on several inputs and ensuring that the outputs match the expected outputs. The score is determined based on the number of tests that your program passes. So, if your program passes t/T tests, you will receive that proportion of the marks.

Quality of Solution: “Is it a good solution?” This considers whether the approach and algorithm in your solution is correct. This is determined by visual inspection of the code. It is possible to get a good grade on this part even if you have bugs that cause your code to fail some of the tests.

Code Clarity: “Is it well written?” This considers whether the solution is properly formatted, well docu mented, and follows coding style guidelines. A single overall mark will be assigned for clarity. Please see the Style Guide in the Assignment section of the course in Brightspace.

The following grading scheme will be used:

Task
100%
80%
60%
40%
20%
0%
Functionality (20 marks)
Equal to the number of tests passed.
Solution Quality Task 1 (5 marks)
Implemented efficiently and correctly.
Implemented correctly but inefficiently.
Some minor bugs.
Major flaws in implementation
An attempt has been made.
No code submitted or code does not compile
Solution Quality Task 2 (15 marks)
Implemented efficiently and corectly.
Implemented correctly but inefficiently.
Some minor bugs.
Major flaws in implementation
An attempt has been made
Code Clarity (10 marks) Indentation, formatting, naming, comments
Code looks professional and follows all style guidelines
Code looks good and mostly follows style guidelines.
Code is mostly readable and mostly follows some of the style guidelines
Code is hard to read and follows few of the style guidelines
Code is not legible

Assignment Testing without Submission

Testing via submission can take some time, especially if the server is loaded. You can run the tests without submitting your code by using the provided runtests.sh script. Running the script with no arguments will run all the tests. Running the script with the test number, i.e., 00, 01, 02, 03, … 09, will run that specific test. Please see below for how run the script.

Get your program ready to run

If you are developing directly on the unix server,
1. SSH into the remote server and be sure you are in the marsdec1 or marsdecx directory.
2. Be sure the program is compiled by running make.
If you are using CLion
1. Run your program on the remote server as described in the CLion tutorials.
2. Open a remote host terminal via Tools → Open Remote Host Terminal
If you are using VSCode
1. Run your program on the remote server as described in VSCode tutorials.
2. Click on the Terminal pane in the bottom half of the window or via Terminal → New Terminal

Run the script

3. Run the script in the terminal by using the command:
./runtest.sh
to run all the tests, or specify the test number to run a specific test, e.g. :
./runtest.sh 07

You will see the test run in the terminal window.

The X Assembler (xas)

An assembler (xas) is provided to allow you to write and compile programs for the X-CPU. To make the assembler, simply run “make” in the same directory where you have unzipped the provided tarball. The format of the assembly files is simple.


  1. Anything to the right of a # mark, including the #, is considered a comment and ignored.
  2. Blank lines are ignored.
  3. Each line in the assembly file that is not blank must contains a directive, a label and/or an instruction. If the line contains both a label and an instruction, the label must come first.
  4. A label is of the form identifier: where identifier consists of any sequence of letters (A-Za-z), digits (0-9), or underscores ( ) as long the first character is not a digit. A colon (:) must terminate the label. A label represents the corre sponding location in the program and may be used to jump to that particular location in the code.
  5. An instruction consists of a mnemonic, such as add, loadi, push, etc, followed by zero or more operands. The operand must be separated from the mnemonic by one or more white spaces. Multiple operands are separated by a comma.
  6. If an operand is a register, then it must be in the form r# where # ranges between 0 and 15 inclu sively. E.g., r13. 
  7. If the instruction is an immediate, then the argument may either be a label, or an integer. Note: labels are case-sensitive. If a label is specified, no colon should follow the label.
  8. Directives instruct the assembler to perform a specific function or behave in a specific manner. All directives begin with a period and are followed by a keyword. There are three directives: .lit eral, .words and .glob, each of which takes an operand.


(a) The .literal directive encodes a null terminated string or an integer at the present location in the program. E.g., 

mystring:

.literal "Hello World!"
myvalue:
.literal 42
encodes a nil-terminated string followed by a 16-bit (1 word) value representing the dec imal value 42. In this example, there are labels preceding each of the encodings so that it is easy to access these literals. That is, the label mystring represents the address (relative to the start of the program) where the string is encoded, and the label myvalue repre sents the address (relative to the start of the program) of the value. This is used in the hello.xas example.
(b) The .words directive sets aside a specified number of words of memory at the present location in the program. E.g.,
myspace:
.words 6
allocates 6 words of memory at the present point in the program. This space is not initialized to any specific value. Just as before, the label preceding the directive repre-sents the address of the first word, relative to the start of the program. This is used in xrt0.xas to set aside space for the program stack.
(c) The .glob directive exports the specified symbol (label) if it is defined in the file, and imports the specified symbol (label) if it is used but not defined in the file. E.g.,
.glob foo
.glob bar
...
loadi bar, r0
...
foo:
.literal "Hello World!"
declares two symbols (labels) foo and bar. Symbol foo is declared in this file, so it will be exported (can be accessed) in other files. The latter symbol, bar, is used but not defined. When this file is linked, the linker looks for the symbol (label) definition in other files makes all references to the symbol refer to where it is defined.
Note: it is recommended that you place literals and all space allocations at the end of your program, so that they will not interfere with program itself. If you do place literals in the middle of your program, you will need to ensure that your code jumps around these allocations.

There are several example assembly files provided (ending in .xas).s You can assemble them by invoking the assembler, for example:

./xas example.xas example.xo
This invocation will cause the assembler to read in the file example.xas and generate an output file example.xo. Feel free to look at the code for the assembler.

发表评论

电子邮件地址不会被公开。 必填项已用*标注