[CSUSB]
>> [CNS]
>> [Comp Sci Dept]
>> [R J Botting]
>> [CSci620]
>>
06
[Source]
Here is a rough translation of the pseudo-pascal on page 66 into C++:
class Token; //defined later
Token accept();//global function gets next Token.
Token lex;
bool S()
{
if (lex.get_type() == CONSTANT)
{
lex=accept();
if( lex.get_value() == "+" )
{
lex=accept();
return S(); //recursive. don't panic.
} else
return lex.endOfFile();
}
} else
return false;
}
Here is a rough description of
class Token {
private:
TokenType t;
string value;
boolean eof;
public:
TokenType get_type(){return t;}
string get_value(){return v;}
bool endOfFile(){return eof;}
//...
};
typedef enum {CONSTANT, VARIABLE, OPERATOR, EOF} TokenType;
Pages 76 to 81
Syntax improvement is part of our compilers classes.
I do not expect you to recall how to do this in the final.
I'm much more interested in making sure you know how a good
syntax is parsed.
Pages 82 to 87
The book starts with incomplete Pascal and grows it
statement by statement. Do not expect the begins and ends to match
until finished.
The syntax of Pascal is defined on [ pascal.syntax.html ] with the lexemes in [ pascal.lexicon.html ] and I also have collected a lot of working examples [ http://www.csci.csusb.edu/dick/cs320/pascal/ ] , enjoy!
. . . . . . . . . ( end of section Notes) <<Contents | Index>>
Questions
The term came to fame when Sun released a JIT compiler for Java.
The idea JIT was around in the 1990's as a way to do business
where you aimed to not order and store raw materials ready to
meet demand. Instead when the demand for a good arrived you ordered the
raw materials and produced it "Just in time" to sell it.
Define procedure and function
In computer science (and in particular Algol, Pascal, and Ada)
sub-programs are divided into two types:
This is different to the C/C++ terminology where all
subprograms are functions, and functions that don't return anything
are called void functions.
Where are the assemblers?
The theory of programming languages has been focused on high
level languages for 3 or 4 decades. The theory of assemblers
is not very interesting. In practice, assemblers have been used
less and less. Assemblers are studied in other topics: System
Programming, Machine Architectures.
Does a lexer and parser use the same grammar?
No. The original BNF for the language is split into two
levels to prepare for writing a translator. The grammar
for the lexer describes lexemes in terms of characters.
It is typically a regular grammar and is often
expressed as a set of regular expressions. So, in a way,
the lexemes are at the top of the lexical analysis grammar.
The grammar for the parser defines the language in terms of the
lexemes. So the lexemes are at the bottom of this grammar. The
start symbol, typically stands for program.
On page 68 the function S has one push and two pops -- can it work
Yes. Before the two pop() functions, there is
if( S() ){ Op2=pop(SAS); Op1=pop(SAS); ...}
For example:
class Quad{ public: char operator; int operand1, operand2, result};
typedef vector<Quad> QuadTable(3000);We can use a list instead of a vector.
S::=...... | Iis short for identifier and fits with the set listed on the line but one above it. It becomes V again in the grammar G describing the lexemes for the language.
However recursive descent will fail to execute correctly if the grammar is not deliberately adjusted first. For example a definition like
train::=train carriage| engine carriage.would have code like this:
bool train()
{ if( train() )
return carriage();
else
{ if (engine())
return carriage();
else return false;
}
}The recursion on the second line is reentered until the program runs out of stack space.
Similarly, a common prefix is needed to distinguish to alternatives.
Similarly with working out the selection sets.
on page 76 I don't understand how the recursion is eliminated
We don't eliminate all recursion! If we could we would have a Type 3
or regular language. No interesting languages are regular! In the
example
E::=E P | Pwe replace it by
E::=P E'
E' ::= P E' | empty
What we do is: replace left recursion by right recursion.
We do this because left recursion does not lead to an infinite loop
in the parser!
Explain "deterministic ... one symbol lookahead parser" on page 77
A deterministic program can evaluate all conditions before it
makes a choice between alternatives. The one symbol lookahead means
that the program can make a choice as to the next action
by looking at the next symbol, and nothing else on the input.
Programs that only look one symbol ahead to make choices are easier to write because there can be no backtracking.
In the examples shown the production
T P' | ( E ) P' | () P'has two alternatives with a common first symbol("("). This means that looking at the "(" doesn't tell the program whether to parse "( E ) P'" or "() P').
Similarly in the definition of T on page 77.
Here is an extreme and silly example of a system that needs more
than a single lookahead:
Net
You have to write a program that switches the trains correctly.
Now the two difficulties. (1) the trains emerge from a long tunnel and the switch is close to the tunnel. You must make a choice the moment the front of the train is visible. (2) The second train is at the back of the train:
Basically you need to negotiate a change of specification before you try to program this problem....
This problem is courtesy Michael A Jackson's books, methods, and training.
The idea is figuring out how to handle productions that have an empty in them. In essence you list the things that might be on the other side of the empty.
On page 78 and 79 what is the dollar sign for?
The dollar sign is the symbol use in compiler theory and
UNIX practice to indicate the end of a string or file.
. . . . . . . . . ( end of section Questions) <<Contents | Index>>
Exercises
. . . . . . . . . ( end of section Exercises) <<Contents | Index>>
Laboratory
Research on lexical and syntactic analysis,
[ lab06.html ]
TBA
Next
Study Chapter 4 pages 87.. semantic interpretation.
[ 07.html ]
. . . . . . . . . ( end of section CSCI620 Session 6) <<Contents | Index>>
Glossary