Assignment 1 - Modify the Lexer

2025-03-03 Admin

Due Date: March 2, before midnight

Note that the due date for this assignment applies to the last commit timestamp in the main branch of your repository.

Overview

This assignment aims to extend the Lexer component of our X language compiler to handle additional tokens and supplement our understanding of compilers and Lexical Analysis.

You are provided with the Lexer code, written together in class, which will be automatically cloned into your GitHub repository when you begin the assignment via this assignment link (https://classroom.github.com/a/S3tv7dx_). After cloning your assignment, PLEASE UPDATE THE README with your name!

Submission

Your assignment will be submitted using GitHub. Only the “main” branch of your repository will be graded.

Your program will be tested using unit tests, manual compilation, and execution on the command line. I will test with many valid X programs (not just simple.x). To compile and execute, I will use the following steps (these work for a *nix shell and will not work in Windows, though you should still be able to compile your application from the Windows command line):

# Compile and execute compiler tools - these commands assume a *nix shell; if you are on

# a Windows machine, update the paths to use the drive letter and backslashes

javac -d out -cp src:. src/tools/CompilerTools.java path/to/grammardefinition

java -cp out:. tools.CompilerTools

# Compile and execute the Lexer with one or more test X files

javac -d out -cp src:. src/lexer/Lexer.java

java -cp out:. lexer.Lexer path/to/test/file.x

Requirements

You will extend the Lexer to process additional tokens and improve its output.

Requirement 1

The current implementation of Lexer reads a hardcoded file. Lexer must be updated to take a path to a file as a command line argument:

java lexer.Lexer sample_files/simple.x

If a filename is not supplied, a usage instruction should be displayed (in this snippet, the > character indicates the shell prompt and execution command; it is not a part of the required output):

> java lexer.Lexer

usage: java lexer.Lexer filename.x
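The argument check described in Requirement 1 might be sketched as follows. This is only an illustration under assumptions: the class name LexerArgsSketch and the helper sourcePathOrNull are hypothetical and are not part of the provided Lexer code; only the usage string comes from the requirement itself.

```java
// Hypothetical sketch: class and method names are illustrative, not the course's Lexer.
final class LexerArgsSketch {
    // Usage text copied from Requirement 1.
    static final String USAGE = "usage: java lexer.Lexer filename.x";

    // Returns the first command line argument, or null when none was supplied.
    static String sourcePathOrNull(String[] args) {
        return (args.length >= 1) ? args[0] : null;
    }

    public static void main(String[] args) {
        String path = sourcePathOrNull(args);
        if (path == null) {
            // No filename supplied: show the usage instruction and stop.
            System.out.println(USAGE);
            return;
        }
        // Placeholder for the real work: the actual Lexer would tokenize this file.
        System.out.println("would tokenize: " + path);
    }
}
```

Keeping the check in a small helper makes it easy to exercise from a unit test without launching a separate process.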

Requirement 2

The Token class must be updated to include the source file's line number where a token was found.
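As a rough illustration of Requirement 2, a token that carries its line number could look like the sketch below. The class TokenSketch and all of its field and method names are assumptions made for this example; the Token class written in class may be organized differently.

```java
// Hypothetical stand-in for the course's Token class; all names here are assumptions.
final class TokenSketch {
    private final String lexeme;
    private final int leftPosition;   // column of the first character of the lexeme
    private final int rightPosition;  // column of the last character of the lexeme
    private final int lineNumber;     // new field: line in the source file where the token was found

    TokenSketch(String lexeme, int leftPosition, int rightPosition, int lineNumber) {
        this.lexeme = lexeme;
        this.leftPosition = leftPosition;
        this.rightPosition = rightPosition;
        this.lineNumber = lineNumber;
    }

    String getLexeme() { return lexeme; }
    int getLeftPosition() { return leftPosition; }
    int getRightPosition() { return rightPosition; }
    int getLineNumber() { return lineNumber; }
}
```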

Requirement 3

Lexer output must be updated for readability; the line number of the Token and the type of token created must be included. The updated format for a token line is:

1. 20 columns, left aligned, for the token's lexeme, followed by a space

2. "left:" followed by a space

3. 8 columns, left aligned, for the left position, followed by a space

4. "right:" followed by a space

5. 8 columns, left aligned, for the right position, followed by a space

6. "line:" followed by a space

7. 8 columns, left aligned, for the line number, followed by a space

8. The symbol

(Hint: Use String.format)
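One possible reading of the eight steps above as a single String.format call is sketched here. The class and method names are hypothetical; the format string is an interpretation of the listed column widths, so check it against the sample output in Appendix A before relying on it.

```java
// Hypothetical helper; only the column widths and labels come from Requirement 3.
final class TokenLineFormatSketch {
    // %-20s : lexeme, 20 columns, left aligned
    // %-8d  : left, right, and line values, 8 columns each, left aligned
    // %s    : the symbol, with no padding
    static String formatTokenLine(String lexeme, int left, int right, int line, String symbol) {
        return String.format(
            "%-20s left: %-8d right: %-8d line: %-8d %s",
            lexeme, left, right, line, symbol);
    }

    public static void main(String[] args) {
        System.out.println(formatTokenLine("program", 0, 6, 1, "Program"));
        System.out.println(formatTokenLine("{", 8, 8, 1, "LeftBrace"));
    }
}
```

The minus sign in each width specifier is what produces left alignment; without it, String.format pads on the left instead.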

Requirement 4

The lexer must be updated to process additional tokens. The new tokens must be added to the grammar file, and the CompilerTools run to re-generate the SymbolTable class and the TokenKind enumeration. Remember from the lecture that we do not want to (must not) use regular expressions when adding lexer logic for the new tokens.

In our grammar definition file, replace the productions below:

Production Name: TYPE
Previous Right Hand Rule: 'int' | 'bool'
Updated Right Hand Rule: 'int' | 'bool' | 'char' | 'hex'

Production Name: FACTOR
Previous Right Hand Rule: '(' REL_OP ')' | IDENTIFIER | CALL | <int>
Updated Right Hand Rule: '(' REL_OP ')' | IDENTIFIER | CALL | <int> | <char> | <hex>

Production Name: REL_OP
Previous Right Hand Rule: ADD_OP | ADD_OP '==' ADD_OP | ADD_OP '!=' ADD_OP | ADD_OP '<' ADD_OP | ADD_OP '<=' ADD_OP
Updated Right Hand Rule: ADD_OP | ADD_OP '==' ADD_OP | ADD_OP '!=' ADD_OP | ADD_OP '<' ADD_OP | ADD_OP '<=' ADD_OP | ADD_OP '>' ADD_OP | ADD_OP '>=' ADD_OP

Production Name: STATEMENT
Previous Right Hand Rule: IF | WHILE | RETURN | BLOCK | ASSIGNMENT
Updated Right Hand Rule: IF | WHILE | RETURN | BLOCK | ASSIGNMENT | FROM

In our grammar definition file, add the productions below:

FROM ::= 'from' REL_OP '=>' REL_OP 'step' REL_OP BLOCK | 'from' REL_OP '=>' REL_OP BLOCK

This introduces a "from" statement into the X language that iterates from the first value to the second, either with an explicitly defined step or with a step that defaults to 1.

Since we have added new separators and operators to the language, our grammar file must also be updated with their symbolic constants. To should be placed in the separators section of the file; Greater and GreaterEqual should be placed in the operators section of the file:

=> To

> Greater

>= GreaterEqual
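Because => and >= share a first character with = and >, the lexer has to look one character ahead before deciding which token it has. The sketch below shows that idea with plain character comparisons and no regular expressions. The class and method names are hypothetical, and it returns the matched lexeme rather than a generated TokenKind, since the generated names are not all shown here.

```java
// Hypothetical sketch of one-character lookahead; not the course's Lexer logic.
final class OperatorLookaheadSketch {
    // Returns the longest lexeme among "=", "==", "=>", ">", ">=" that starts at pos,
    // or null when the character at pos starts none of them.
    static String operatorAt(String src, int pos) {
        if (pos < 0 || pos >= src.length()) {
            return null;
        }
        char first = src.charAt(pos);
        // '\0' stands in for "no next character" at the end of the input.
        char next = (pos + 1 < src.length()) ? src.charAt(pos + 1) : '\0';
        switch (first) {
            case '=':
                if (next == '>') return "=>";
                if (next == '=') return "==";
                return "=";
            case '>':
                return (next == '=') ? ">=" : ">";
            default:
                return null;
        }
    }
}
```

For example, operatorAt("1 => 5", 2) yields "=>" while operatorAt("i = 1", 2) yields "=".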

A char is a double quote, followed by only one character, followed by a double quote:

Valid characters: "a" "0"

Invalid characters: "0a" "ab"
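The char rule can be checked with direct character comparisons, as in the sketch below. The class and method names are hypothetical, and the sketch works on a String for simplicity, whereas the real Lexer reads characters one at a time; it does not decide how an invalid literal should be reported.

```java
// Hypothetical sketch of the char rule: a double quote, exactly one character, a double quote.
final class CharLiteralSketch {
    // Returns the index of the closing quote when a valid char literal starts at 'start',
    // or -1 when the text at 'start' is not a valid char literal.
    static int scanCharLiteral(String src, int start) {
        // Three characters are needed: opening quote, one character, closing quote.
        boolean fits = start >= 0 && start + 2 < src.length();
        if (fits && src.charAt(start) == '"' && src.charAt(start + 2) == '"') {
            return start + 2;
        }
        return -1;
    }
}
```

With the examples from the text, "a" and "0" are accepted, while "0a" and "ab" are rejected because the third character is not a closing quote.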

A hex is the character zero, followed by the character x, followed by one or more digits 0-9 and a-f. The characters may be either upper case or lower case.

Valid hex: 0x1a2F 0X123456789abcdef

Invalid hex: 0x 0x123g
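A hex literal can likewise be recognized without regular expressions. The sketch below is one way to do it; the names are hypothetical, and it assumes that a letter or digit immediately following the hex digits (as in 0x123g) makes the whole literal invalid, which matches the example above but is an interpretation rather than a stated rule.

```java
// Hypothetical sketch of the hex rule: '0', then 'x' or 'X', then one or more hex digits.
final class HexLiteralSketch {
    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Returns the index just past the literal when a valid hex starts at 'start', or -1 otherwise.
    static int scanHexLiteral(String src, int start) {
        int n = src.length();
        if (start < 0 || start + 1 >= n || src.charAt(start) != '0') {
            return -1;
        }
        char marker = src.charAt(start + 1);
        if (marker != 'x' && marker != 'X') {
            return -1;
        }
        int i = start + 2;
        while (i < n && isHexDigit(src.charAt(i))) {
            i++;
        }
        if (i == start + 2) {
            return -1; // "0x" with no digits after it
        }
        // Assumption: a trailing letter or digit (e.g. the 'g' in 0x123g) invalidates the literal.
        if (i < n && Character.isLetterOrDigit(src.charAt(i))) {
            return -1;
        }
        return i;
    }
}
```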

Requirement 5

Lexer output must be updated to include a printout of each line, with its line number, read in from the source file. Line numbers should be printed in 3 columns, right-aligned. Note that when an error is encountered, the error should be reported as usual, and the lines of the source file should be output, with line numbers up to and including the error line. Think critically about where the implementation of source code output should go - which object in our system has access to the necessary information? (Hint: it isn't the Lexer)
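The line-number formatting on its own might look like the sketch below. It deliberately says nothing about which object should own this logic; the class and method names are hypothetical, and the colon-and-space separator is inferred from the sample output in Appendix A.

```java
// Hypothetical formatting helper; where this logic belongs is left to your design.
final class SourceLineFormatSketch {
    // %3d right-aligns the line number in 3 columns; ": " is inferred from Appendix A.
    static String formatSourceLine(int lineNumber, String text) {
        return String.format("%3d: %s", lineNumber, text);
    }

    public static void main(String[] args) {
        System.out.println(formatSourceLine(1, "program { int i int j"));
        System.out.println(formatSourceLine(4, "}"));
    }
}
```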

Appendix A

The complete output for the source file simple.x (the indentation you see for the file output is to allow for three-digit line numbers):

> java lexer.Lexer sample_files/simple.x

program              left: 0        right: 6        line: 1        Program
{                    left: 8        right: 8        line: 1        LeftBrace
int                  left: 10       right: 12       line: 1        IntType
i                    left: 14       right: 14       line: 1        Identifier
int                  left: 16       right: 18       line: 1        IntType
j                    left: 20       right: 20       line: 1        Identifier
i                    left: 3        right: 3        line: 2        Identifier
=                    left: 5        right: 5        line: 2        Assign
i                    left: 7        right: 7        line: 2        Identifier
+                    left: 9        right: 9        line: 2        Plus
j                    left: 11       right: 11       line: 2        Identifier
+                    left: 13       right: 13       line: 2        Plus
7                    left: 15       right: 15       line: 2        IntLit
j                    left: 3        right: 3        line: 3        Identifier
=                    left: 5        right: 5        line: 3        Assign
write                left: 7        right: 11       line: 3        Identifier
(                    left: 12       right: 12       line: 3        LeftParen
i                    left: 13       right: 13       line: 3        Identifier
)                    left: 14       right: 14       line: 3        RightParen
}                    left: 0        right: 0        line: 4        RightBrace

  1: program { int i int j
  2:    i = i + j + 7
  3:    j = write(i)
  4: }

Rubric: 1-Lexer-Rubric

Criteria	Ratings	Pts
[CODE QUALITY] The code is clean and well formatted	7 pts Excellent; 5 pts Good; 4 pts Fair; 2 pts Poor; 1 pt Very Poor; 0 pts No Marks (not enough code was written to be able to evaluate)	/ 7 pts
[CODE QUALITY] Naming (and only necessary comments!)	3 pts Excellent; 2 pts Good; 1 pt Poor (variables or method names do not adequately describe their purpose); 0 pts No Marks (not enough code was written to be able to evaluate)	/ 3 pts
[CODE QUALITY] Single responsibility principle	3 pts Excellent; 2 pts Good; 1 pt Poor; 0 pts No Marks (not enough code was written to be able to evaluate)	/ 3 pts
[MANUAL TESTING] Lexer executed from the command line	24 pts Full Marks; 0 pts No Marks	/ 24 pts
[UNIT TESTS] Unit tests executed against your lexer	60 pts Full Marks; 0 pts No Marks	/ 60 pts
INCLUDE YOUR NAME	3 pts Full Marks; 0 pts No Marks	/ 3 pts
Total		/ 100 pts
