| Previous 03 Evolution of Main Languages | Chapter 2 (sections 1-12 +16). | lab03 HTML lab pages |
| 04 Syntax: grammars, EBNF, parsing | Chapter 3 sections 1 to 3 + Chapter 4 sections 1 to 3 + XBNF & SOOP Handouts | lab04 BNF on the web |
| Next 05 Semantics: UML | UML handout | lab05 UML + Graphics |
Notice that the lexical analyser (lexer) and the parser together convert the input stream of data into a tree structure -- a collection of objects linked together by pointers. When we draw UML diagrams of the semantics of a language we also describing these objects and how they will be associated.
State transition diagrams (figure 4.1) are an important part of computer science, but very easy to understand and use. These diagrams express the simplest kind of theoretical computer: the finite state machine or FSA. These are machines with limited memory (finite number of states) and star in several upper division electives in our degrees. The Unified Modeling Language (UML) has a diagram called a state machine. Here [ lexer.gif ] is Figure 4.1 drawn using an UML state machine. More on the UML later.
By the way,
XML
documents have their detailed syntax described in a form of
Extended BNF.
Assigned work due
(1) Study chapter 3, sections 3.1(intro), 3.2(problem), and 3.3(solution).
Ignore "Syntax Graphs".
Study chapter 4, sections 4.1(intro), 4.2(lexers), and 4.3(parsers).
(2) You will use XBNF in your project work. Read the attached XBNF handout. You will be using a subset of HyperText Markup Language(HTML) in the Labs. Read the handout describing a subset of the HTML. Your project will be a longer document mixing XBNF, English, and diagrams, like these two handouts. Bring copies of today's handouts on XBNF and HTML to all future classes, and laboratories. Add your own notes and bring them to the final.
(4) Study the first 8 Review Questions at the end of chapter 3 and the first 8 questions at the end of chapter 4.
(4) Hand in written copies of 3 Review Questions (any mix of chapters) at start of next class (3 points)
Your Project
In your project you will be developing a LRM (Language Reference Manual)
for a programming language that you will be inventing. What makes a good
LRM? Four things:
a+1Comments help the user understand by being informal:
an expression like `foo+bar` adds `foo` to `bar`.A grammar defines the syntax. Syntax expresses precisely the possible forms of something. Grammars are written in a mathematical metalanguage:
additive_expression::= term #(("+" | "-") term).
The
semantics
eliminates meaningless possibilities and provides
meaning to the rest. These are normally expressed informally
but in this class we will use
the Unified Modeling Language. -- -- --(UML)More on the UML later [ 05.html ] and here are some sample LRMs for well known languages [ ../samples/algol60.syntax.html ] [ ../samples/php_intro.html ] [ ../samples/python.html ] plus the syntax from other LRMs: [ ../samples/cobol.syntax.html ] [ ../samples/pascal.syntax.html ] etc.... Also see [ ../samples/ ]
Further, these four things (comments, examples, syntax, and semantics) need to be intermingled: you should be able to see the reason for a feature(comment), plus an example or two of the feature, and the rules (syntax and semantics) that apply to that feature all on the same page. We will use the UML to also provide a visual summary of the semantics of the language.
So for each part of the language that you are defining you will need:
Mini-Example: LRM for Decimal Numbers
1
512
8192
1The meaning of a digit is an interger in range 0..9 using the normal encoding.
(semantics):
The meaning of a string of n digits d = (d[n] d[n-1] ... d[3] d[2]
d[1]) is called
it's value. We can model value as a map from syntactically correct numbers
(see number above)
into the natural numbers, including 0,
So for example,
Examples vs Syntax
Note. In the final, in your project, and in class, I will often ask you to
give me an example of something. I mean a piece of code that is in the
language that shows that you know what it means. If I ask for
for(i=1; i<=n; i ++)
v+= d [i-1];is good.
If I ask for the syntax of a C++ for loop I expect something like this
I'm unlikely to ask for the semantics.... BUT instead I may ask to express a general for loop in simpler terms. This is a form of "Operational Semantics". A good answer would give the general form of for-loop:
for (A; B; C) Sand then a simpler equivalent form
{ A; while(B) {
S
C;
}
}Notice that the translation works whatever A,B,C, and S are, as long as the original form is syntactically correct.