Scanner III, Completing the Scanner
Posted Thursday, February 5
Due at the beginning of your lab period the following week
Objectives
The objectives of this assignment are:
- completing your scanner
- testing your scanner
By the time this assignment is done, your scanner should be completely functional,
ready to work with your parser (to come).
To Do
The following are the tasks to complete.
- Make sure your scanner meets the requirements of last week's assignment.
- Be sure you have covered all of the tokens in µPascal. Check the
  token list on the Resources page to be sure.
- Distinguish between keywords and identifiers.
  - Be sure to ignore case.
  - After scanning an identifier, compare it to a list of reserved words and
    return the proper reserved word token or the identifier token.
- Handle comments.
  - Comments begin with { and end with }.
  - Ignore comments while scanning.
  - Watch for runaway comments: if the end of file is encountered before the
    end of a comment is reached, a runaway comment error has occurred. Set the
    token value to the new error token MP_RUN_COMMENT when this happens.
  - Be sure to update the line and column numbers properly while scanning
    comments.
- Handle strings.
  - Be sure that the lexeme is only the text between the opening and
    closing apostrophes (do not include the apostrophes themselves).
  - Watch for runaway strings: if you encounter the end of a line before the
    closing apostrophe is found, the string is a runaway string.
  - Set the token value to the new error token MP_RUN_STRING if a runaway
    string is found.
- Report scanning errors.
  - If the dispatcher cannot dispatch to an fsa because the first
    character of the current token scan does not start any valid token, the
    token value is to be set to the new error token MP_ERROR.
  - If a valid token cannot be scanned (the fsa scanning that token does not
    pass through any accept state before an invalid character is found, as
    described in the lecture), the scanner is to set the token value to
    MP_ERROR. The lexeme should be the invalid character at the start of the
    scan, and the row and column numbers should indicate where that character
    was found. You should also print a meaningful error message in the driver
    based on the information in the lexeme, row number, and column number.
- Recover from scanning errors.
  - If a runaway comment is encountered, the driver should print an
    appropriate error message, noting where the comment started. The
    scanner should leave the file pointer pointing to the end of file
    character. Then, when the driver calls the scanner again, the
    scanner will return the end of file token, and the driver will terminate
    as it usually would upon receiving this token.
  - If a runaway string is encountered, the driver should indicate where the
    string started and give an appropriate error message. The file pointer
    should be left pointing at the end of line character that terminated the
    runaway string.
  - For other scanning errors, the scanner should leave the file pointer
    pointing to the first character after the character examined by the
    dispatcher.
- Your program must be able to create a printout for each file scanned, in
  exactly the same format as last week.
  - Error messages are to be listed in this same file, on separate lines
    between the lines for valid tokens. Each error message should indicate
    the line and column numbers where the error was encountered.
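
The keyword, comment, and string tasks above can be sketched as follows. This is only an illustration in Python, not a required design: the token names MP_RUN_COMMENT and MP_RUN_STRING come from this assignment, while MP_IDENTIFIER, MP_STRING_LIT, the three reserved words shown, and the function names are assumptions made for the example (use the token list on the Resources page for the real names). The sketch treats the source as a single string, so "the file pointer" becomes an index into that string, and it does not handle doubled apostrophes inside strings.

```python
# Hypothetical subset of reserved words; the real list is on the Resources page.
RESERVED = {"begin": "MP_BEGIN", "end": "MP_END", "program": "MP_PROGRAM"}


def classify_word(lexeme):
    """Return the reserved-word token for lexeme, ignoring case,
    or MP_IDENTIFIER (an assumed token name) if it is not reserved."""
    return RESERVED.get(lexeme.lower(), "MP_IDENTIFIER")


def skip_comment(src, pos, line, col):
    """src[pos] is '{' at the given 1-based line and column.

    Returns (token, pos, line, col). token is None when the comment closed
    normally, or MP_RUN_COMMENT when the input ended first; in that case
    pos == len(src), which stands in for the end of file position.
    """
    pos += 1
    col += 1
    while pos < len(src):
        ch = src[pos]
        pos += 1
        if ch == "\n":
            line += 1
            col = 1
        else:
            col += 1
        if ch == "}":
            return None, pos, line, col
    return "MP_RUN_COMMENT", pos, line, col


def scan_string(src, pos):
    """src[pos] is the opening apostrophe. Returns (token, lexeme, new_pos).

    The lexeme excludes both apostrophes. On a runaway string, new_pos is
    left on the end of line character (or at the end of the input).
    """
    start = pos + 1
    pos = start
    while pos < len(src) and src[pos] != "\n":
        if src[pos] == "'":
            return "MP_STRING_LIT", src[start:pos], pos + 1
        pos += 1
    return "MP_RUN_STRING", src[start:pos], pos
```

Because the error messages must say where a runaway comment or string started, the caller would save the line and column of the opening { or apostrophe before calling these functions.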
Future Considerations
Think about how you would produce a source listing of the program you are scanning so
that the program looks just as the programmer entered it. As errors occur,
they should be noted by inserting an error line right below the offending source line,
with the appropriate error message and a mark (^) pointing to the start of the problem.
You can extend your scanner to do this, but it is not a requirement.
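
One possible shape for such a listing is sketched below in Python. The function name and the (line, column, message) tuple format are assumptions made for this example, with line and column both counted from 1. The sketch pads with spaces only, so the mark will not line up under source lines that contain tab characters.

```python
def listing(source, errors):
    """Return the source text with an error line under each offending line.

    errors is a list of (line, column, message) tuples, both 1-based.
    """
    # Group the errors by the source line they belong to.
    by_line = {}
    for line_no, col, msg in errors:
        by_line.setdefault(line_no, []).append((col, msg))

    out = []
    for line_no, text in enumerate(source.splitlines(), start=1):
        out.append(text)
        # Place the ^ mark directly under the reported column.
        for col, msg in sorted(by_line.get(line_no, [])):
            out.append(" " * (col - 1) + "^ " + msg)
    return "\n".join(out)
```

For example, `listing("x := 'abc\ny := 1", [(1, 6, "runaway string")])` yields the first source line, then a line with five spaces followed by `^ runaway string`, then the second source line.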
Special Requirements
- If you find any discrepancies or errors in the assignment, be sure to report them
immediately, so that the page can be updated appropriately.
- You should be developing some good test cases for your scanner. You
can post an announcement on the newsgroup that you are willing to trade test cases
with others who have developed some.
To Turn In
As usual, be sure that your source and executable files are available on esus by
the start of your next lab period. Be ready to test
your program in lab against supplied test files.