Point 1 -- Branch Around Procedures: Since code is generated in sequential order, there must be a branch inserted to get around the code for procedures and functions when the main program first starts executing. It is assumed that program execution will start from the top when the translated program is actually run, so a jump instruction must be executed to force program execution to continue with the code that is the translation of the begin block for the main program.
Point 2 -- Procedure Declaration: The point at which a procedure definition is first encountered requires symbol table calls to put the name of the procedure and its attributes into the symbol table and then to create a new symbol table on the top of the stack for the new procedure. All of the new procedure's parameters and variables (along with their attributes) must be placed into this new table.
Point 3 -- Activation Record Initialization Time. Consider the begin block of a procedure. This is the entry point to the procedure, the code that is to start executing when this procedure is called. At this point the compiler must generate special IR related to starting a procedure running, such as dropping the label for the procedure and completing the setup of the activation record on the stack for this procedure (the initial part of the activation record setup for this procedure is done at the point of call).
Point 4 -- Procedure End: When the end of a procedure is reached, actions must be taken to begin the process of removing the activation record from the stack for this procedure (the rest is done at the point of call) and returning to the point of call.
Point 5 -- Main Program Code Start: The label generated in point 1 must be dropped here. This ensures that a jump can be made from the start of the translated code, around any intervening procedure and function code, to this part of the program, which is where execution starts. The activation record for the main program must also be constructed at this point.
Point 6 -- Procedure Call: A procedure call requires semantic actions that generate code to set up the calling sequence properly, to start the construction of the activation record for the called procedure, to put the actual parameters into the activation record for the called procedure, and then to make the actual jump to the procedure.
Point 7 -- Procedure Return: Since code to place the actual parameters onto the run time stack is placed just before a call to a procedure, it makes sense that this part of the run time stack be handled by code that is inserted at the point of return from the call. In some languages this code will take care of copying the values in the formal parameters back into the actual parameter locations as necessary. This won't be necessary in your project, because mPascal has only "copy" (non VAR) and "reference" (VAR) parameters. The "copy" parameters don't need to be copied back into the actual parameters, because the actual parameters are not intended to change. The "reference" parameters don't need copying either, because the address of the actual parameter, rather than its value, was supplied at call time, so the actual parameter was changed each time the corresponding formal parameter was changed.
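The reason no copy-back is needed can be seen in a small simulation. The following is only a sketch under stated assumptions: memory is modeled as a flat Python list, and the addresses and the name call_proc are hypothetical, not part of the project.

```python
# Hypothetical model: "memory" is a flat list of cells indexed by address.
memory = [0] * 8

A_ADDR = 0        # assumed address of an actual parameter passed by copy
X_ADDR = 1        # assumed address of an actual parameter passed by reference
FORMAL_COPY = 4   # assumed activation record slot for the "copy" formal
FORMAL_REF = 5    # assumed activation record slot for the "reference" formal


def call_proc(copy_value, ref_address):
    """Model a call with one copy parameter and one reference parameter."""
    memory[FORMAL_COPY] = copy_value     # the caller supplies a value
    memory[FORMAL_REF] = ref_address     # the caller supplies an address

    # Body of the procedure: assign to both formals.
    memory[FORMAL_COPY] = memory[FORMAL_COPY] + 100   # changes only the local copy
    memory[memory[FORMAL_REF]] = 2.5                  # indirect store changes the actual


memory[A_ADDR] = 3
memory[X_ADDR] = 1.0
call_proc(memory[A_ADDR], X_ADDR)

print(memory[A_ADDR])   # the copy actual is untouched
print(memory[X_ADDR])   # the reference actual already holds the new value
```

When the call returns, the copy actual is unchanged and the reference actual has already been updated through its address, so the return sequence has nothing to copy.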
Perhaps more than at any other point in the development of a compiler, you need to be very disciplined in your thinking about the compiling process when considering procedures. The reason is that in translating procedures we need to consider three issues simultaneously:
It is easy, for example, to get the symbol table and the activation records of the run time stack confused as if they are the same thing (they are not), so just keep a clear head.
program Fred;
    -- Point 1: Output code to jump around possible intervening procedures and/or functions
  var a, b : Integer;
      x : Real;

    -- Point 2: Procedure definition: symbol table issues
  procedure Mary(a, m: in Integer; y: in out Real);
    var c: character;
      -- Point 3: Procedure code begin; drop label, generate code for completing AR setup
    begin -- Mary
      <statements>
      -- Point 4: Procedure end; generate code for starting AR cleanup and return
    end;

    -- Point 5: Drop label for Fred's code start
  begin -- Fred
    <statements>
      -- Point 6: Procedure call -- generate code to start AR setup, to put actual
      -- parameters on the stack in the AR, and a jump to subroutine instruction
    Mary(3, a*b, x);
      -- Point 7: Procedure return -- generate code to complete parameter transfers
      -- if necessary and to remove the rest of the AR from the run time stack
    <statements>
  end.
Before going into detail about each of the seven points, let's give an overview to help map out where we are going. We will discuss these points with respect to the sample program above.
Point 1: Since code is generated on the fly as the compiler moves through the program sequentially, the code for procedure Mary will be produced and inserted into the translation file (IR_file) before the code for Fred. When our translated program begins running, though, we want it to begin with Fred's translated code. So, we must generate a label and output a jump instruction to that label to force execution to go around the code for Mary. The label needs to be dropped at the beginning of Fred.
Point 2: During parsing, the name Mary must be inserted into Fred's symbol table along with the fact that Mary is a procedure at nesting level 1. The order, type, and mode (but not the name) of each parameter must be included as attributes of the name Mary. Finally, there must be a label generated and inserted into the table that will be the branch point for calls to Mary. A new symbol table must then be created for Mary. In Mary's table, the names, types, modes, and offsets of the parameters must be included, too. Inside Mary, the parameters must be accessible in a manner similar to variables, so they have offsets.
Point 3. At the beginning of the procedure for Mary, a number of things must be done. The label in the symbol table entry for Mary (in Fred) must be inserted at this point, so that the code for Mary can be jumped to properly when a call to Mary is executed. Code must be output during semantic analysis that, when executed at run time, sets up an activation record for Mary on the stack. The actual parameters must be associated properly with Mary's formal parameters.
Point 4. At the end of procedure Mary, code must be generated that, when executed, will remove Mary's activation record from the stack and reset the run time stack properly. Depending on how parameters are handled, some code might need to be inserted to ensure that the actual parameters have the proper values. Finally, the last instruction of Mary's code must be a jump that returns control to the point in the code from which Mary was called (Mary can be called from numerous different places with different actual parameters). See Point 6.
Point 5. At the start of Fred's code, the label generated in point 1 must be inserted so that the jump described in Point 1 will work as intended.
Point 6. At the point of the procedure call, the parser must determine that the call to Mary is proper, by looking in the symbol table to see whether Mary is there and whether the number, types, and modes of the actual parameters match the symbol table information about Mary's formal parameters. If this check is ok, code must be generated by the semantic analyzer that provides the actual parameters of the call in a place where they can be accessed by the translated procedure Mary. Then, code must be generated that jumps to Mary. Usually there is a special instruction that can be inserted at this point (e.g., jsr L3, or "jump to subroutine located at label L3") that pushes the value of the PC onto the stack and then jumps to label L3. The saved PC value is the return address that Mary uses, when its code has finished executing, to return to the place from which it was called this time.
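The check and the code generation described for Point 6 might be sketched as follows. This is only an illustration under assumptions: the dictionary layout, the function name emit_call, the mnemonics push and pusha (push a value, push an address), and the temporary T1 are all hypothetical, and the activation record setup that also starts at the call is left out.

```python
def emit_call(proc_entry, actuals, ir):
    """Check a call against the symbol table entry, then emit IR for it."""
    if proc_entry is None or proc_entry["kind"] != "PROC":
        raise ValueError("call target is not a declared procedure")

    formals = proc_entry["params"]          # list of (mode, type) pairs, in order
    if len(actuals) != len(formals):
        raise ValueError("wrong number of parameters")

    for (mode, formal_type), actual in zip(formals, actuals):
        if actual["type"] != formal_type:
            raise ValueError("parameter type mismatch")
        if mode == "in out" and not actual["is_variable"]:
            raise ValueError("in out parameter requires a variable")

    # The call is proper: make the actual parameters available to the callee.
    for (mode, _), actual in zip(formals, actuals):
        if mode == "in out":
            ir.append(f"pusha {actual['operand']}")   # assumed: push the address
        else:
            ir.append(f"push {actual['operand']}")    # assumed: push the value
    ir.append(f"jsr {proc_entry['label']}")           # saves the PC, then jumps


# Mary's entry as it might appear in Fred's symbol table.
mary = {"kind": "PROC", "label": "L4",
        "params": [("in", "Integer"), ("in", "Integer"), ("in out", "Real")]}

# The call Mary(3, a*b, x); T1 is an assumed temporary holding a*b.
actuals = [
    {"type": "Integer", "is_variable": False, "operand": "#3"},
    {"type": "Integer", "is_variable": False, "operand": "T1"},
    {"type": "Real", "is_variable": True, "operand": "D0(8)"},
]

ir = []
emit_call(mary, actuals, ir)
print("\n".join(ir))
```

Keeping the check separate from the emission means that a rejected call produces no partial code.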
Now, let's begin looking at these points in order.
At this point in the compiler, the semantic analyzer must be called to do three things
The label needs to be saved, because when the begin block for Fred is encountered, this label will need to be dropped into the IR file so that the jump is executed properly at run time.
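As a rough sketch of the label handling for Point 1 and Point 5, assuming a simple list of IR lines and a hypothetical br mnemonic (the class and function names here are illustrative, not part of the project):

```python
class Emitter:
    """Collects IR lines and hands out fresh labels (hypothetical helper)."""

    def __init__(self):
        self.ir = []
        self.label_count = 0

    def new_label(self):
        self.label_count += 1
        return f"L{self.label_count}"

    def emit(self, line):
        self.ir.append(line)


def program_start(emitter, table_header):
    """Point 1: generate a label, emit the jump, and save the label."""
    label = emitter.new_label()
    emitter.emit(f"br {label}")       # 'br' is an assumed branch mnemonic
    table_header["label"] = label     # saved so Point 5 can drop it later
    return label


def main_begin(emitter, table_header):
    """Point 5: drop the saved label at the start of the main begin block."""
    emitter.emit(f"{table_header['label']}:")


emitter = Emitter()
fred = {"name": "Fred", "nesting_level": 0}   # first row of Fred's symbol table
program_start(emitter, fred)
emitter.emit("; ... code for Mary would be generated here ...")
main_begin(emitter, fred)
print("\n".join(emitter.ir))
```

Storing the label in the first row of the program's symbol table keeps it available until the main begin block is reached, however many procedures intervene.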
Notice that as far as the code in Mary is concerned, the parameters look just like variables. One difference is that in generating code referring to identifier y in Mary, the semantic analyzer must check the symbol table to discover that y is really an in out parameter, which means that its location, D1(8), contains an address, not a value. (There are other ways to translate in out parameters, for example by copying the value of the actual parameter into the formal parameter at call time and then copying the formal parameter value back into the actual parameter upon return.) Thus, generated code referring to y must use indirect addressing. Semantic action calls must therefore be placed appropriately in a procedure definition to carry out the above actions, including:
Fred's symbol table after variables a, b, and x have been compiled:
 ________________
| Fred | 0 | L1  |
|______|___|_____|__________
| a    | Integer | 0 | VAR  |
|______|_________|___|______|
| b    | Integer | 4 | VAR  |
|______|_________|___|______|
| x    | Real    | 8 | VAR  |
|______|_________|___|______|

The first line in the symbol table has entries corresponding to the name of this symbol table (Fred), its nesting level (0, for use as the display register number), and the label for the branch to the begin block for Fred (L1).
Fred's symbol table after the procedure definition for Mary has been compiled:
 ___________________________
| a    | Integer | 0 | VAR  |
|______|_________|___|______|
| b    | Integer | 4 | VAR  |
|______|_________|___|______|
| x    | Real    | 8 | VAR  |
|______|_________|___|______|________     __ _______     __ _______     ___ ____
| Mary |         |   | PROC | L4 |  -|-->|In|Integer|   |In|Integer|   |in |Real|
|______|_________|___|______|____|___|   |__|______-|-->|__|______-|-->|out|___-|--> //

The line for Mary in the symbol table gives the name of the identifier (Mary); no entry for the type or offset, because Mary is not a variable with a type or offset; the kind of identifier Mary is (PROC); the label for Mary (L4), so that branches to Mary can be made; and a linked list of each parameter along with its mode and type, in the order in which the parameters appear.
We need at least this much information in Fred's symbol table to process calls to Mary. On the other hand, we also need just about the same information in Mary's symbol table to translate Mary. In Mary's case, we will also need the names of the parameters and where these parameters will be stored (their offset in Mary's activation record).
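One possible in-memory shape for these entries is sketched below. The class and field names are assumptions made for illustration, and a Python list stands in for the linked list of parameters.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParamInfo:
    """Mode and type of one formal parameter (no name is stored here)."""
    mode: str        # "in" or "in out"
    type_name: str


@dataclass
class Entry:
    """One row of a symbol table (hypothetical layout)."""
    name: str
    kind: str                          # "VAR" or "PROC"
    type_name: Optional[str] = None    # empty for procedures
    offset: Optional[int] = None       # empty for procedures
    label: Optional[str] = None        # branch target, procedures only
    params: List[ParamInfo] = field(default_factory=list)


# Fred's table after the definition of Mary has been compiled.
fred_table = {
    "a": Entry("a", "VAR", "Integer", 0),
    "b": Entry("b", "VAR", "Integer", 4),
    "x": Entry("x", "VAR", "Real", 8),
    "Mary": Entry("Mary", "PROC", label="L4", params=[
        ParamInfo("in", "Integer"),
        ParamInfo("in", "Integer"),
        ParamInfo("in out", "Real"),
    ]),
}

# Everything needed to process a call is reachable from Mary's row.
mary = fred_table["Mary"]
print(mary.label, [(p.mode, p.type_name) for p in mary.params])
```

Keeping the parameter list in order matters, because actual parameters are matched to formals by position at each call.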
Mary's symbol table after the procedure statement is compiled:
 ________________
| Mary | 1 | L6  |
|______|___|_____|________________________
| a    | Integer | 0 | Parameter | In     |
|______|_________|___|___________|________|
| m    | Integer | 4 | Parameter | In     |
|______|_________|___|___________|________|
| y    | Real    | 8 | Parameter | In Out |
|______|_________|___|___________|________|

As with program Fred, the first line in the symbol table for Mary includes the name of this symbol table (Mary), its nesting level (1, for use as the display register number), and a label (L6) for branching to the begin block for Mary. Remember that there may be intervening procedures and/or functions declared inside of Mary at nesting level 2.
Mary's symbol table after the variable declaration for variable c has been compiled:
 ________________
| Mary | 1 | L6  |
|______|___|_____|________________________
| a    | Integer | 0 | Parameter | In     |
|______|_________|___|___________|________|
| m    | Integer | 4 | Parameter | In     |
|______|_________|___|___________|________|
| y    | Real    | 8 | Parameter | In Out |
|______|_________|___|___________|________|
| C    | Char    | 12| VAR       |        |
|______|_________|___|___________|________|
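With Mary's table complete, choosing between direct and indirect addressing for an identifier might look like the sketch below. The dictionary layout and the function name operand_for are hypothetical, and the "@" prefix is an assumed notation for indirect addressing; only the D1(8) form comes from the discussion above.

```python
# Mary's symbol table rows, keyed by identifier (hypothetical layout).
mary_table = {
    "a": {"type": "Integer", "offset": 0, "kind": "Parameter", "mode": "in"},
    "m": {"type": "Integer", "offset": 4, "kind": "Parameter", "mode": "in"},
    "y": {"type": "Real", "offset": 8, "kind": "Parameter", "mode": "in out"},
    "c": {"type": "Char", "offset": 12, "kind": "VAR", "mode": None},
}
MARY_LEVEL = 1   # nesting level, used as the display register number


def operand_for(name, table, level):
    """Return the IR operand used to read or write the named identifier."""
    entry = table[name]
    location = f"D{level}({entry['offset']})"
    if entry["kind"] == "Parameter" and entry["mode"] == "in out":
        # The slot holds an address, so go through it ("@" is assumed notation).
        return "@" + location
    return location   # the slot holds the value itself


for name in ("a", "y", "c"):
    print(name, operand_for(name, mary_table, MARY_LEVEL))
```

In this sketch only y is accessed indirectly; a, m, and c are addressed like ordinary variables, which matches the observation that parameters look just like variables to the code in Mary.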
Code must be generated here, including
Here,