Assignment 1 - Modify the Lexer

Due Date: March 2, before midnight
Note that the due date for this assignment applies to the last commit timestamp in the main branch of your repository.

Overview

This assignment extends the Lexer component of our X language compiler to handle additional tokens, and deepens our understanding of compilers and lexical analysis.

You are provided with the Lexer code, written together in class, which will be automatically cloned into your GitHub repository when you begin the assignment via this assignment link (https://classroom.github.com/a/S3tv7dx_). After cloning your assignment, PLEASE UPDATE THE README with your name!

Submission

Your assignment will be submitted using GitHub. Only the "main" branch of your repository will be graded.

Your program will be tested using unit tests, manual compilation, and execution on the command line. I will test with many valid X programs (not just simple.x). To compile and execute, I will use the following steps (these work for a *nix shell and will not work in Windows, though you should still be able to compile your application from the Windows command line):

# Compile and execute compiler tools - these commands assume a *nix shell; if you are on
# a Windows machine, update the paths to use the drive letter and backslashes
javac -d out -cp src:. src/tools/CompilerTools.java path/to/grammardefinition
java -cp out:. tools.CompilerTools
# Compile and execute the Lexer with one or more test X files
javac -d out -cp src:. src/lexer/Lexer.java
java -cp out:. lexer.Lexer path/to/test/file.x

Requirements

You will extend the Lexer to process additional tokens and improve its output.

Requirement 1

The current implementation of Lexer reads a hardcoded file. Lexer must be updated to take a path to a file as a command line argument: 

java lexer.Lexer sample_files/simple.x

If a filename is not supplied, a usage instruction should be displayed (in this snippet, the > character indicates the shell prompt and execution command; it is not a part of the required output):

> java lexer.Lexer
usage: java lexer.Lexer filename.x
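
The argument check can be sketched as follows. This is a minimal illustration, not the course's Lexer code: the class name LexerArgsSketch and the helper sourcePathOrNull are hypothetical, and only the usage text comes from the requirement above.

```java
// Hypothetical sketch: class and method names are illustrative only.
final class LexerArgsSketch {
    // Usage text taken from Requirement 1.
    static final String USAGE = "usage: java lexer.Lexer filename.x";

    // Returns the first command line argument, or null when none was supplied.
    static String sourcePathOrNull(String[] args) {
        return args.length >= 1 ? args[0] : null;
    }

    public static void main(String[] args) {
        String path = sourcePathOrNull(args);
        if (path == null) {
            // No filename: show the usage instruction and stop.
            System.out.println(USAGE);
            return;
        }
        // A real Lexer would open and tokenize the file here.
        System.out.println("would lex: " + path);
    }
}
```

Keeping the check in a small helper makes it easy to unit test without touching the file system.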

Requirement 2

The Token class must be updated to include the source file's line number where a token was found.

Requirement 3

Lexer output must be updated for readability; the line number of the Token and the type of token created must be included. The updated format for a token line is:
1. 20 columns, left aligned, for the token's lexeme, followed by a space
2. "left:" followed by a space
3. 8 columns, left aligned, for the left position, followed by a space
4. "right:" followed by a space
5. 8 columns, left aligned, for the right position, followed by a space
6. "line:" followed by a space
7. 8 columns, left aligned, for the line number, followed by a space
8. The symbol
(Hint: Use String.format)
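
One way to read the eight steps above as a single String.format call is sketched below. The class TokenLineFormatSketch and the method formatTokenLine are hypothetical names, and the parameters stand in for whatever your Token actually exposes.

```java
// Hypothetical sketch: names and parameters are assumptions, not the course's Token API.
final class TokenLineFormatSketch {
    // %-20s : lexeme, 20 columns, left aligned
    // %-8d  : left, right, and line values, 8 columns each, left aligned
    // %s    : the symbol (token kind)
    static String formatTokenLine(String lexeme, int left, int right, int line, String symbol) {
        return String.format(
            "%-20s left: %-8d right: %-8d line: %-8d %s",
            lexeme, left, right, line, symbol);
    }

    public static void main(String[] args) {
        System.out.println(formatTokenLine("program", 0, 6, 1, "Program"));
        System.out.println(formatTokenLine("{", 8, 8, 1, "LeftBrace"));
    }
}
```

The "-" flag requests left alignment; without it, String.format pads on the left instead.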

Requirement 4

The lexer must be updated to process additional tokens. The new tokens must be added to the grammar file, and CompilerTools must be run to regenerate the SymbolTable class and the TokenKind enumeration. Remember from the lecture that we do not want to (must not) use regular expressions when adding lexer logic for the new tokens.

In our grammar definition file, replace the productions below:

Production Name: TYPE
Previous Right Hand Rule: 'int' | 'bool'
Updated Right Hand Rule: 'int' | 'bool' | 'char' | 'hex'

Production Name: FACTOR
Previous Right Hand Rule: '(' REL_OP ')' | IDENTIFIER | CALL | <int>
Updated Right Hand Rule: '(' REL_OP ')' | IDENTIFIER | CALL | <int> | <char> | <hex>

Production Name: REL_OP
Previous Right Hand Rule: ADD_OP | ADD_OP '==' ADD_OP | ADD_OP '!=' ADD_OP | ADD_OP '<' ADD_OP | ADD_OP '<=' ADD_OP
Updated Right Hand Rule: ADD_OP | ADD_OP '==' ADD_OP | ADD_OP '!=' ADD_OP | ADD_OP '<' ADD_OP | ADD_OP '<=' ADD_OP | ADD_OP '>' ADD_OP | ADD_OP '>=' ADD_OP

Production Name: STATEMENT
Previous Right Hand Rule: IF | WHILE | RETURN | BLOCK | ASSIGNMENT
Updated Right Hand Rule: IF | WHILE | RETURN | BLOCK | ASSIGNMENT | FROM

In our grammar definition file, add the productions below:
FROM ::= 'from' REL_OP '=>' REL_OP 'step' REL_OP BLOCK | 'from' REL_OP '=>' REL_OP BLOCK

This introduces a "from" statement into the X language that iterates from the first value to the second, either with an explicitly defined step or with a step that defaults to 1.

Since we have added new separators and operators to the language, our grammar file must also be updated with their symbolic constants. To belongs in the separators section of the file; Greater and GreaterEqual belong in the operators section:

=> To
> Greater
>= GreaterEqual

A char is a double quote, followed by only one character, followed by a double quote:

Valid characters: "a" "0"
Invalid characters: "0a" "ab"
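
The char rule can be checked without regular expressions by looking at each position directly. The sketch below is illustrative only: CharLiteralSketch and isCharLiteral are hypothetical names, it validates a complete lexeme rather than reading from the source one character at a time as the real Lexer does, and it assumes any single character is allowed between the quotes.

```java
// Hypothetical sketch: validates a whole lexeme; a real lexer would consume characters one by one.
final class CharLiteralSketch {
    // True when text is: double quote, exactly one character, double quote.
    // Assumption: no restriction on which character sits between the quotes.
    static boolean isCharLiteral(String text) {
        return text.length() == 3
            && text.charAt(0) == '"'
            && text.charAt(2) == '"';
    }

    public static void main(String[] args) {
        System.out.println(isCharLiteral("\"a\""));   // valid example from the text
        System.out.println(isCharLiteral("\"0a\""));  // invalid example from the text
    }
}
```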

A hex is the character zero, followed by the character x, followed by one or more digits 0-9 and a-f. The characters may be either upper case or lower case.

Valid hex: 0x1a2F 0X123456789abcdef
Invalid hex: 0x 0x123g
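
The hex rule can likewise be checked with plain character comparisons instead of regular expressions. This is a sketch under the same assumptions as any standalone example: HexLiteralSketch and its methods are hypothetical names, and it validates a complete lexeme rather than scanning the source stream.

```java
// Hypothetical sketch: validates a whole lexeme using only character comparisons.
final class HexLiteralSketch {
    // Accepts 0-9, a-f, and A-F.
    private static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'f')
            || (c >= 'A' && c <= 'F');
    }

    // True when text is '0', then 'x' or 'X', then one or more hex digits.
    static boolean isHexLiteral(String text) {
        if (text.length() < 3 || text.charAt(0) != '0') {
            return false;
        }
        char marker = text.charAt(1);
        if (marker != 'x' && marker != 'X') {
            return false;
        }
        for (int i = 2; i < text.length(); i++) {
            if (!isHexDigit(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"0x1a2F", "0X123456789abcdef", "0x", "0x123g"}) {
            System.out.println(sample + " -> " + isHexLiteral(sample));
        }
    }
}
```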

Requirement 5

Lexer output must be updated to include a printout of each line, with its line number, read in from the source file. Line numbers should be printed in 3 columns, right-aligned. Note that when an error is encountered, the error should be reported as usual, and the lines of the source file should be output, with line numbers up to and including the error line. Think critically about where the implementation of source code output should go - which object in our system has access to the necessary information? (Hint: it isn't the Lexer)
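
The line format itself can be sketched as below. The "N: text" layout is taken from Appendix A; the class SourceListingSketch and its methods are hypothetical, and the sketch deliberately does not decide which object in the system should own this logic.

```java
import java.util.List;

// Hypothetical sketch: shows only the formatting, not where it belongs in the system.
final class SourceListingSketch {
    // %3d right-aligns the line number in 3 columns.
    static String formatSourceLine(int lineNumber, String text) {
        return String.format("%3d: %s", lineNumber, text);
    }

    // Renders lines 1..lastLine (inclusive), for example up to and including an error line.
    static String listing(List<String> lines, int lastLine) {
        StringBuilder out = new StringBuilder();
        int limit = Math.min(lastLine, lines.size());
        for (int i = 0; i < limit; i++) {
            out.append(formatSourceLine(i + 1, lines.get(i)))
               .append(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> lines = List.of("program { int i int j", "   i = i + j + 7", "   j = write(i)", "}");
        System.out.print(listing(lines, lines.size()));
    }
}
```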

Appendix A

The complete output for the source file simple.x (the indentation you see for the file output is to allow for three-digit line numbers):
> java lexer.Lexer sample_files/simple.x
program              left: 0        right: 6        line: 1        Program
{                    left: 8        right: 8        line: 1        LeftBrace
int                  left: 10       right: 12       line: 1        IntType
i                    left: 14       right: 14       line: 1        Identifier
int                  left: 16       right: 18       line: 1        IntType
j                    left: 20       right: 20       line: 1        Identifier
i                    left: 3        right: 3        line: 2        Identifier
=                    left: 5        right: 5        line: 2        Assign
i                    left: 7        right: 7        line: 2        Identifier
+                    left: 9        right: 9        line: 2        Plus
j                    left: 11       right: 11       line: 2        Identifier
+                    left: 13       right: 13       line: 2        Plus
7                    left: 15       right: 15       line: 2        IntLit
j                    left: 3        right: 3        line: 3        Identifier
=                    left: 5        right: 5        line: 3        Assign
write                left: 7        right: 11       line: 3        Identifier
(                    left: 12       right: 12       line: 3        LeftParen
i                    left: 13       right: 13       line: 3        Identifier
)                    left: 14       right: 14       line: 3        RightParen
}                    left: 0        right: 0        line: 4        RightBrace
  1: program { int i int j
  2:    i = i + j + 7
  3:    j = write(i)
  4: }

Rubric (1-Lexer-Rubric)

[CODE QUALITY] The code is clean and well formatted (7 pts)
7 pts: Excellent
5 pts: Good
4 pts: Fair
2 pts: Poor
1 pt: Very Poor
0 pts: No Marks. Not enough code was written to be able to evaluate.

[CODE QUALITY] Naming (and only necessary comments!) (3 pts)
3 pts: Excellent
2 pts: Good
1 pt: Poor. Variables or method names do not adequately describe their purpose.
0 pts: No Marks. Not enough code was written to be able to evaluate.

[CODE QUALITY] Single responsibility principle (3 pts)
3 pts: Excellent
2 pts: Good
1 pt: Poor
0 pts: No Marks. Not enough code was written to be able to evaluate.

[MANUAL TESTING] Lexer executed from the command line (24 pts)
24 pts: Full Marks
0 pts: No Marks

[UNIT TESTS] Unit tests executed against your lexer (60 pts)
60 pts: Full Marks
0 pts: No Marks

INCLUDE YOUR NAME (3 pts)
3 pts: Full Marks
0 pts: No Marks
