Fundamental Issues of Variables
Names, Bindings, Type Checking and Scopes
Machine architecture and programming languages
Programming languages reflect the underlying machine
Modern computers use the von Neumann architecture
There are two primary components
Memory, which stores both the program instructions and the data
The processor, which executes instructions that modify the data in memory
In a programming language,
The abstractions for the memory cells are the variables
The abstractions for the processor are functions
Fundamental issues of variables
The attributes of variables
Name
Address
Type
Value
Lifetime
Scope
Concepts involving variables
Alias
Binding times
Type checking
Strong typing
Type compatibility
Static and dynamic scoping
Names -- a fundamental attribute of variables
Design issues of the programming language
Should the names be case sensitive?
Should special names be reserved words or keywords?
A keyword is special only in certain contexts, so the programmer can still use it as an identifier.
A reserved word cannot be used as an identifier.
How many characters are allowed in a name?
Which characters should be allowed in names?
Addresses -- another fundamental attribute of variables
The address of a variable is the memory address with which it is associated.
This association is made in the symbol table for static variables.
In some languages it is possible for the same name to be associated with different addresses at different times in the program
Different subprograms may have local variables with the same name
It is also possible to have multiple variables associated with the same address
The variables that access the same memory location are called aliases
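A minimal Java sketch of aliasing follows. The class and variable names are illustrative only. In Java the two reference variables are themselves distinct, but both names reach the same array storage, so a write through one name is visible through the other.

```java
// Illustrative sketch: two names that access the same storage (aliases).
class AliasDemo {
    // Writes through one alias and reads the result back through the other.
    static int writeThroughAlias() {
        int[] first = {1, 2, 3};
        int[] second = first;   // second now refers to the same array, not a copy
        second[0] = 99;         // modify the shared array through one name
        return first[0];        // the change is visible through the other name
    }

    public static void main(String[] args) {
        System.out.println(writeThroughAlias()); // prints 99
    }
}
```

Aliases make programs harder to read, because a variable's value can change without that variable's name appearing in the assignment.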
Type -- another fundamental attribute of variables
The type of a variable determines:
The range of values the variable can have
The number of bytes needed to store the variable
The set of operations that are defined for values of the type
Types can be either language defined or (in some languages) user defined
Value -- another fundamental attribute of variables
The value of a variable is the contents of the memory cell or cells associated with the variable
A variable's value is sometimes called its r-value, because it is what is used on the right hand side of an assignment statement
The address is the l-value, since it is needed when a variable is on the left hand side of an assignment statement
Binding variables to their attributes
The binding time is the time at which an association is made between variables and their attributes
Or between operators and their attributes
Bindings can take place at:
Language design time,
language implementation time,
compile time,
link time,
load time,
run time.
When do these bindings take place?
For the following bit of code, when do you think each binding takes place?
int count;
.
count = count + 5;
When does count get bound to its type?
The possible values of count?
The meaning of the operator +?
The value of count?
The internal representation of the literal 5?
Storage bindings and lifetime
Binding a variable to a memory cell is known as allocation.
There are four ways this binding may take place
Static variables
Stack-dynamic variables
Explicit heap-dynamic variables
Implicit heap-dynamic variables
Deallocation is the process of returning the memory cell to the pool of available memory for reallocation.
The lifetime of a variable is the time during which the variable is bound to a specific memory location
The run-time stack and the heap
There are two places run-time storage for a program's variables can come from: the stack and the heap
On many systems the stack grows downward (decreasing addresses)
and the heap grows upward (increasing addresses)
Local variables declared in subprograms are stored on the run-time stack
Executable program instructions are stored in a separate code area, not on the stack
Stack storage is allocated and released automatically as subprograms are called and return
Dynamic variables that are allocated during run time are stored on the heap
Static variables
Static variables are bound to their memory cells by the loader, which puts the program into RAM
They remain bound to the same location until the program terminates
The association of variable name and address is kept by the symbol table
So this binding takes place before execution, and remains for the entire time the program is executing.
Static variables are very efficient, but have reduced flexibility.
Languages that have only static variables cannot support recursion
They also do not allow storage to be shared
The lifetime of static variables is the duration of the execution of the program.
Stack-dynamic variables
Stack-dynamic variables are usually local variables of functions
This includes the parameters of the functions
Stack-dynamic variable bindings are created when their declaration statements are elaborated.
Elaboration refers to the storage allocation and binding process which takes place when execution reaches the declaration,
i.e. typically when the function is called
When a function is called, its local variables are allocated space on the run-time stack, and each variable is bound to its address.
When the function returns, all the space used by the function is deallocated.
Note that the space is allocated on the run-time stack, and that it is allocated and deallocated at run time (dynamically).
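public class Recur
{
    // Returns a + (a - 1) + ... + 1 + 0 for a >= 0.
    // Each active call has its own stack-dynamic copy of the parameter a.
    public int begin(int a)
    {
        if (a == 0)
            return a;
        else
            return a + begin(a - 1);
    }

    public static void main(String[] args)
    {
        System.out.println(new Recur().begin(3)); // prints 6
    }
}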
Advantages and disadvantages of stack-dynamic variables
Recursive functions require some sort of dynamic allocation and deallocation of storage
Stack-dynamic variables function well here
All subprograms are able to share the same memory for their local variables.
Only those presently executing actually use the space
The disadvantages are:
The run-time overhead of allocating and deallocating memory
The slower access because indirect addressing is required
It prevents subprograms from being history sensitive (local variables do not retain their data)
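The history-sensitivity point can be sketched in Java by comparing a local variable with a static field. The class and method names below are hypothetical. Note that a Java static field is allocated when its class is loaded, which is close to, but not identical to, the static variables described earlier.

```java
// Illustrative sketch: a local (stack-dynamic) variable versus a static field.
class CallCounter {
    // Static storage: keeps its value between calls.
    private static int staticCalls = 0;

    // The local variable is re-created and re-initialized on every call.
    static int countWithLocal() {
        int localCalls = 0;   // stack-dynamic: allocated when the method is entered
        localCalls++;
        return localCalls;    // always 1, because no history is retained
    }

    // The static field remembers earlier calls (history sensitive).
    static int countWithStatic() {
        staticCalls++;
        return staticCalls;
    }

    public static void main(String[] args) {
        countWithLocal();
        countWithStatic();
        System.out.println(countWithLocal());  // prints 1
        System.out.println(countWithStatic()); // prints 2
    }
}
```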
Explicit heap-dynamic variables
These are memory cells allocated at run time by explicit instructions written by the programmer
The allocator (the language run-time system, backed by the operating system) keeps track of which memory is available on the heap, and allocates available memory
Java and C++ ask for this memory using the operator new; C uses the library function malloc.
The variables allocated from the heap can be accessed only through pointers or references
A statically bound pointer variable is bound at compile time to the data type to which it points
But the heap-dynamic variable itself is bound to storage when it is created, which is at run time
Some languages include a means to return heap storage explicitly, and some do not.
int *intptr = (int *) malloc(intvar * sizeof(int));
The heap and different languages
In Java, all objects use storage on the heap; local variables of primitive type are stack-dynamic
The storage on the heap is accessed by reference variables
Java has no way of explicitly destroying a heap-dynamic object; garbage collection is used.
C# has both explicit heap-dynamic and stack-dynamic objects, all of which are implicitly deallocated.
C++ requires the programmer to explicitly deallocate heap-dynamic objects.
The disadvantages of explicit heap-dynamic variables are:
The difficulty of using pointers and reference variables
The overhead of allocation, deallocation and indirect access
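A small Java sketch of explicit heap-dynamic storage follows, assuming a hypothetical Counter class defined only for this example. It shows allocation with new, access through references, and the point at which the object becomes eligible for garbage collection. When the collector actually reclaims it is not specified.

```java
// Illustrative sketch: explicit heap-dynamic storage in Java.
class HeapDemo {
    // A small object type used only for this example.
    static final class Counter {
        int value;
    }

    static int run() {
        Counter c = new Counter();  // 'new' allocates the object on the heap at run time
        c.value = 41;               // the object is reached only through the reference c
        Counter same = c;           // copying the reference does not copy the object
        same.value++;
        c = null;                   // one reference dropped; still reachable via 'same'
        int result = same.value;
        same = null;                // no references remain: eligible for garbage collection
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```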
Implicit heap-dynamic variables
These variables are bound to heap storage only when they are assigned values
All their attributes are bound every time they are assigned
They have the highest degree of flexibility, allowing highly generic code to be written
The disadvantage is the high run-time overhead
Scripting languages often use implicit heap-dynamic variables
Summary so far
Fundamental Issues of Variables
The attributes of variables
Name, type, address, value, lifetime, scope
Binding variables to their attributes
Static variables
Stack-dynamic variables
Explicit heap-dynamic variables
Implicit heap-dynamic variables
Binding variables to a type
How the type is specified
When the binding takes place
Type checking
Binding variables to a type
Variables must be bound to a data type before they can be referenced.
This binding can be either static or dynamic
The data type can be specified two ways
Explicit declaration
Variable names are explicitly declared to be of a certain type
Most languages designed since the mid-1960s require this
Some exceptions are Perl and JavaScript
Implicit declaration
Implicit declarations
A default convention, rather than a declaration statement, associates variables with types
In Perl, the first character in a variable name determines the type
@name is an array, $name is a scalar, %name is a hash structure
In Fortran the first letter of the identifier determines the data type if it is not explicitly declared
If the identifier begins with I, J, K, L, M, or N it is an integer, otherwise it is a real
Since variables can also be explicitly declared, this can lead to problems (for example, a misspelled name silently becomes a new variable)
Both implicit and explicit declarations create static bindings to types
This means type checking can be done at compile time
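The Fortran first-letter rule can be written as a tiny function. The sketch below is in Java purely for illustration; the method name implicitType is an assumption, and the identifier is assumed to be non-empty.

```java
// Illustrative sketch of Fortran's default implicit typing rule, written in Java.
class ImplicitTyping {
    // Returns "INTEGER" if the identifier starts with I..N (either case), otherwise "REAL".
    // Assumes the identifier is not empty.
    static String implicitType(String identifier) {
        char first = Character.toUpperCase(identifier.charAt(0));
        return (first >= 'I' && first <= 'N') ? "INTEGER" : "REAL";
    }

    public static void main(String[] args) {
        System.out.println(implicitType("index"));  // prints INTEGER
        System.out.println(implicitType("total"));  // prints REAL
    }
}
```

Because the rule depends only on the spelling of the name, a compiler can apply it without running the program, which is why the binding is static.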
Dynamic type binding
The type is not specified by a declaration, nor can it be determined implicitly from the name
So it cannot be determined at compile time
The variable is bound to a type when it is assigned a value in an assignment statement.
So a variable is bound to a type only at execution time, when the assignment statement is executed.
The advantage here is greater flexibility
The disadvantages are
Giving up type error detection by the compiler, and
The high overhead at run time for type checking
Such languages are usually implemented with interpreters rather than compilers
Pure interpretation typically takes at least ten times as long as executing equivalent machine code
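Java itself binds variable types statically, so the sketch below only mimics dynamic type binding: a variable declared as Object can be rebound to values of different types, and the type of the current value has to be checked at run time. The class and method names are hypothetical.

```java
// Illustrative sketch: mimicking dynamic type binding in Java with an Object variable.
class DynamicBinding {
    // Reports the run-time type of whatever value is currently bound.
    static String describe(Object value) {
        if (value instanceof Integer) {
            return "integer";
        } else if (value instanceof String) {
            return "string";
        } else if (value instanceof int[]) {
            return "array";
        }
        return "other";
    }

    public static void main(String[] args) {
        Object item = 10;                   // currently bound to an Integer
        System.out.println(describe(item)); // prints integer
        item = "ten";                       // rebound to a String by assignment
        System.out.println(describe(item)); // prints string
        item = new int[] {1, 0};            // rebound to an int array
        System.out.println(describe(item)); // prints array
    }
}
```

The instanceof tests stand in for the run-time type checks that a dynamically typed language performs on every operation, which is one source of the overhead listed above.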
Type Checking
Type checking is done when an operator operates on two operands: X + Y
The idea of operators and operands can be generalized to include
subprograms as the operators whose operands are their parameters
The assignment symbol can be viewed as a binary operator, with the left and right side as the operands.
Type checking is ensuring that the operands of an operator are of compatible types
A compatible type is one that is either:
Legal for the operator, or
Allowed under the language's rules to be implicitly converted to a legal type
This conversion is done by the compiler
The automatic conversion is called a coercion
Type checking can nearly always be done at compile time if all the types are bound to variables statically
The more that can be done at compile time, the faster a program will run, and hopefully there will be fewer run-time errors
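A short Java sketch of coercion under static type checking follows. The class and method names are illustrative. Java coerces int to double implicitly, but it rejects the reverse assignment at compile time unless the programmer writes an explicit cast.

```java
// Illustrative sketch: implicit coercion versus explicit conversion in Java.
class CoercionDemo {
    // Widening: the compiler inserts the int-to-double conversion (a coercion).
    static double widen(int whole) {
        double result = whole;
        return result;
    }

    // Narrowing: no coercion exists, so an explicit cast is required.
    static int narrow(double fraction) {
        // int result = fraction;   // rejected at compile time: possible lossy conversion
        return (int) fraction;      // the cast discards the fractional part
    }

    public static void main(String[] args) {
        System.out.println(widen(7));    // prints 7.0
        System.out.println(narrow(3.9)); // prints 3
    }
}
```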
Strong Typing
Strong typing is hard to define, but here are two definitions:
Each name in a program has a single type associated with it, and that type is known at compile time
Type errors are always detected (either at compile time or run time)
The importance of strong typing lies in its ability to detect all misuses of variables that result in type errors.
Strong typing and languages
The coercion rules of a language have an important effect on the value of type checking
Ada allows very little coercion
Fortran, C and C++ have a great deal of coercion
Java and C# have half as many assignment type coercions as C++
The fewer the coercions allowed, the better the error detection can be.
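System.out.println(2 + 3.5);            // the int 2 is coerced to double: prints 5.5
System.out.println(4 + 5);              // integer addition: prints 9
System.out.println("String " + 4 + 5);  // left-to-right string concatenation: prints String 45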
type index is 1..100;
count : int;
i : index;

type feet = float;
     meters = float;
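The feet and meters declarations raise the type compatibility question: are two types with the same structure interchangeable? A hedged Java sketch follows. Java classes are compatible by name, not by structure, so the two hypothetical wrapper classes below cannot be mixed even though each just holds a double. The conversion factor 3.28084 feet per metre is an approximation.

```java
// Illustrative sketch: name type compatibility with two structurally identical classes.
class TypeCompatibility {
    static final class Feet {
        final double value;
        Feet(double value) { this.value = value; }
    }

    static final class Meters {
        final double value;
        Meters(double value) { this.value = value; }
    }

    // An explicit conversion is the only way to get from Meters to Feet.
    static Feet toFeet(Meters m) {
        return new Feet(m.value * 3.28084);
    }

    public static void main(String[] args) {
        Meters height = new Meters(2.0);
        // Feet wrong = height;   // rejected at compile time: Meters is not Feet
        Feet converted = toFeet(height);
        System.out.println(converted.value);
    }
}
```

Under structure type compatibility the commented-out assignment would be legal, and the unit mistake would go unnoticed.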
Scope:
Sub Big is
    X : int
    Sub sub1 is
    Begin
        X
    End
    Sub sub2 is
        X : int
    Begin
        X
        // call sub1() here -> under dynamic scoping, X in sub1 is sub2's X
    End
Begin
    X
End
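Java is statically scoped, so the difference can only be simulated. The sketch below models the call chain Big -> sub2 -> sub1 with an explicit stack of name tables. The values 1 and 2 for the two X variables, and all class and method names, are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: resolving the name X inside sub1 under static and dynamic scoping.
class ScopingDemo {
    // Dynamic scoping: search the active calls, most recent first.
    static int lookupDynamic(Deque<Map<String, Integer>> callStack, String name) {
        for (Map<String, Integer> frame : callStack) {  // iterates from the top of the stack
            if (frame.containsKey(name)) {
                return frame.get(name);
            }
        }
        throw new IllegalStateException("unbound name: " + name);
    }

    // Static scoping: sub1 is declared inside Big, so X in sub1 always means Big's X.
    static int lookupStaticInSub1(Map<String, Integer> bigFrame, String name) {
        return bigFrame.get(name);
    }

    // Models Big -> sub2 -> sub1 and returns {static result, dynamic result}.
    static int[] resolveXInSub1() {
        Map<String, Integer> bigFrame = new HashMap<>();
        bigFrame.put("X", 1);                              // Big declares X
        Map<String, Integer> sub2Frame = new HashMap<>();
        sub2Frame.put("X", 2);                             // sub2 declares its own X
        Map<String, Integer> sub1Frame = new HashMap<>();  // sub1 declares no X

        Deque<Map<String, Integer>> callStack = new ArrayDeque<>();
        callStack.push(bigFrame);    // Big is running
        callStack.push(sub2Frame);   // Big calls sub2
        callStack.push(sub1Frame);   // sub2 calls sub1

        return new int[] {
            lookupStaticInSub1(bigFrame, "X"),
            lookupDynamic(callStack, "X")
        };
    }

    public static void main(String[] args) {
        int[] result = resolveXInSub1();
        System.out.println("static scoping:  X = " + result[0]); // Big's X
        System.out.println("dynamic scoping: X = " + result[1]); // sub2's X
    }
}
```

Static scoping depends only on where sub1 is written in the program text, while dynamic scoping depends on which subprogram happened to call it.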
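{
    int count = 0;        // outer count
    while () {
        int count = 1;    // inner count hides the outer one inside the loop body
    }
    if (count) {}         // refers to the outer count again
}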