[CSUSB] >> [CNS] >> [Comp Sci Dept] >> [R J Botting] >> [CSci620] >> while [Source]
[Index] [Schedule] [Syllabi] [Text] [Labs] [Projects] [Resources] [Search] [Grading]
Tue Apr 13 14:36:56 PDT 2004

The Language While

Tue Apr 13 13:03:15 PDT 2004 Acknowledgement

I want to thank the help given to me in fixing the defects on this page by the CSCI620 "Theory of Programming Languages" class, Spring 2004. They pointed out many things I have since fixed. The remaing defects are my fault.

Motivation

"While" is a rock bottom simple example language used in lots of texts and papers to demonstrate ways to define language semantics. It is deliberately kept simple so that the formlas do not become un managable. Here are some of the simplifications:
• No input or output
• No declarations of variables.
• All data and variables have integer values.
• The notation for numbers is binary.
• Arithmetic operators don't include division.
• Boolean operators don't include most relations.
• There is one loop structure and one selection structure.

In this page we explore the elementary parts prior to a full scale assault on the language using several different methods in later pages. Here it is enough to define the syntax of the language and explore the semantics of its expressions.

Here is a well known algorithm in While:

1. factorial::= "y:=1; while ~(x=1) do ( y:=y*x; x:=x-1)".

When x>0, this puts the factorial x into y. When x<=0 it fails to terminate. Ideally our semantics should be able to express both of these meanings.

The first question is how to convert a string like this into a meaningful formal structure. This is the problem of Formal Syntax. I will assume that we have done this already and move directly to expressing the formal structure and its meanings.

Abstract Syntax

Abstract syntax is a lot weaker than normal "concrete" syntax because we don't want to describe all the details of parsing constructs and resolving
` 	1 + 2 * 3`
into
` 	1 + ( 2 * 3)`
rather than
` 	(1+2)*3`
for example.

However we use the same notation for this abstraction as we do in practice to define syntax. However I omit quotation marks around +*-= <= ~ /\ ; if then else while do.

1. grammar_of_While::=
Net{
1. Num::syntax=Numbers, binary.
2. Var::syntax=A set of variables, here letters.
3. Aexp::syntax=The arithmetic expressions giving numbers as values.
4. Bexp::syntax=Boolean expressions, giving tt or ff as values
5. Stm::syntax=Statements.
6. Num::= "0" | "1" | Num "0" | Num "1".
7. Aexp::= Num | Var | Aexp * Aexp | Aexp + Aexp | Aexp - Aexp.
8. Bexp::= true | false | Aexp=Aexp | Aexp<=Aexp | ~ Bexp | Bexp /\ Bexp.
9. Stm::= Var=Aexp | skip | Stm; Stm | if Bexp then Stm else Stm | while Bexp do S.

}=::grammar_of_While.

Note: THis omits some useful things like division, modulus, input, output, etc. You can in many cases put it in fairly quickly and easily. Division means you must tackle the question of division by zero however.

Note: We just ignore the syntactic details of priorities etc. and use parentheses as necessary - without putting them in here. If this bothers you, it may help to check that you can rewrite these as tree structures and that work just like the strings we are used to (but with parentheses). This is the tradition in the theory of programming languages. In real language design a lot of effort goes into defining compound statements etc.

As an exercise: you could rewrite the definitions as concrete syntax. This means putting quotes around terminal strings like "+" and "while", fixing ambiguities (like how to parse "1+2+3"), and including rules with parentheses in expressions("2*(3+4)"), and compound statements "{"..."}".

Exercise 1

Take the factorial algorithm in While:
` 	y:=1; while ~(x=1) do ( y:=y*x; x:=x-1)`
and draw the tree representation of it.

Also, using your experience to provide intuitive semantics, show that if x=4 initially then y=24 after wards.

Exercise 2

Given that x has initial value n and y has value m write a statement in While that assigns the value of n^m=n*n*n*....*n (m times) to z. Draw the syntax tree of your statement.

. . . . . . . . . ( end of section Abstract Syntax) <<Contents | Index>>

Semantics of Constants

For numbers (n:Num) we define a meaning as a value in the integers
1. Int::= {...,- 2,-1,0,1,2,3,...}.

This version of While uses binary notation to keep it simple. As an exercise you can write the rules for decimal notation if you want.

Also, we will be forced to write negative numbers as arithmetic expressions like this "0-1"! Again this is an interesting exercise to work out how to fix the syntax above and the semantics below.

We use function N to map each Num into a unique Int. To help us to distinguish parts of the while language from pieces of logic and mathematics we write

2. N<< n >> instead of N(n) or N n.

As a rule, if we have a syntax category Syn and a semantic map S that associates a "meaning" to it from Sem, S:Syn->Sem then we write S<<s>> not S(s).

This will be a general rule... syntactic things are inside << ...>> and the symbol prefixing them indicates the semantic function being applied. This means we can omit quotation marks... instead of

3. N("1010101010" n) we can write N<<1010101010 n>>.

We say that two elements s1 and s2 of Syn are equivalent if they have the same semantics:

4. s1 equivalent s2::= (S<<s1>> = S<<s2>> ).

5. N::Num->Int, where following,
Net
For n:Num,
1. N<< 0 >> = 0,
2. N<< 1 >> = 1,
3. N<<n 0>> =2 * N<< n >> ,
4. N<<n 1>> =2 * N<< n >> + 1.

(End of Net)

So for example
Net

1. N<<101>>
2. = 2* N<<10>> +1
3. = 2*(2*N<<1>>)+1
4. = 2*(2*1)+1
5. = 2* 2 + 1
6. = 5

(End of Net)

Exercise 3

Show all the steps to evaluate
6. N<< 111 >> .

Extra Exercise

Rewrite the syntax (Num) and (N) so as to allow (1) decimal numbers, and (2) negative numbers.

Compositional Semantics

A set of equations like those defining N above is said to be compositional if it mirrors the structure of the abstract syntax. Firstly each alternative form must have a definition. Second, the RHS of each equation can only apply the semantic function to parts of LHS.

N is well defined

N is a well defined total function N:Num->Int.

Proof Method

We have a Compositional definition of N. This means that the the right hand sides of each definition is either a "basis element" or when it contains the function it is applied to parts (not the whole) of the argument on the left hand side.

Thus we can apply structural induction. Because, implicitly, the Num is the result of a finite number of applications of the the rule that defined Num, and because the definition of N follows the syntax, it means that the parts of the definition are applied only a finite number of times.

Hence, all we have to do is show that a property is

1. (1) true of the basis elements
2. (2) transmitted to each of the composite constructs from their parts.

Proof of Theorem 4

Let
1. (1) When we match the rules defining N to the string "0" only one applies and it tells that N<<0>> is zero. N is clearly uniquely defined when applied to "0".
2. (2) Similarly with "1" These are the two Basis elements taken care of.

3. (3) What of a string n = n1 "0"? Well suppose we know that N is defined for n1. Now only one rule matches and it says that N<<n1 0>> =2*N<<n1>> , which is a well defined value because N<<n1>> is well defined.
4. (4) Similarly when n= n1 "1".

Now every element of Num is one of the above forms, and in each case there is only one rule to apply, and in each case it produces a single answer.

(Close Let )

. . . . . . . . . ( end of section Proof of Theorem 4) <<Contents | Index>>

. . . . . . . . . ( end of section N is well defined) <<Contents | Index>>

Equivalence of Numbers

You can show that N is a one-one map and so two constants are equivalent precisely when they are identical in this simple language

. . . . . . . . . ( end of section Semantics of Constants) <<Contents | Index>>

Semantics of Variables

We express the state of the system as a function that maps each Var into an Int :
1. State::=Var->Int .

Notice that we usually omit the parenthsis so that if s: State and x: Var, then s x = s(x).

So for s:State, x:Var, s x= 3 iff the value of x in s is 3.

We can write states as a list of variable+>value pairs. For example

2. [x+>1, y+>2, z+>3] = the State where x is 1, y is 2 and z is 3.

So
Let

1. (1): s = [x+>1, y+>2, z+>3].

and it follws that

2. (1) |- s x = 1.
3. (1) |- s y = 2.
4. (1) |- s z = 3.

(Close Let )

Note: in languages where variables are declared we have to make the State more complex.

Updating a State by Changing a Variable

The update operation on states is defined by is written [y+>v] for a variable y and and a value (Int) v.

For s:State, x:Var, v:Int , s[y+>v] :: State, where
Net

1. (s[y+>v])y = v,
2. For x:Var - {y}, (s[y+>v])x = s x.

(End of Net)

As a result:

3. ([x+>1, y+>2, z+>3])[x+>5] = [x+>5, y+>2, z+>3].

We can express complex state changes by listing a series of these update operations. So for any state s, variables x and y, and constants a and b: s[x+>a][y+>b] ::= (s[x+>a])[y+>b].

Fact: if there are several updates.... we only need the last one: For example if s is a state, x and y are different variables and a and b constant values then
Net

1. s [x+>a][x->b] = s [x+>b]
2. s [x+>a][y->b] = s [y+>b][x+>a]

(End of Net)

We will use this notation to define the semantics of assignment statements later.

. . . . . . . . . ( end of section Semantics of Variables) <<Contents | Index>>

Semantics of Expressions

An expression's value is not constant because it depends on the values of the variables in it. Thus the semantics of an expression must include the State of the system as well as the Aexp. So we can not (like N:Num->Int ) just define it to be a map from Aexp into the Int s. It must also take the state of the system into account. The traditional approach is to take,
1. A:Aexp->(State ->Int), and write
2. A<<e>> s::= the value of e in state s.

Here is the formal definition:

Definition of Semantic map A

1. A::Aexp->(State ->Int ) = following,
Net
For n:Num, s:State, x:Var, a1,a2:Aexp.
1. A<< n >> s = N<<n>> , see the definition of N above.
2. A<< x >> s = s x, see definition of s x
3. A<< a1+a2>> s = A<<a1>> s + A<<a2>> s,
4. A<< a1*a2>> s = A<<a1>> s * A<<a2>> s,
5. A<< a1-a2>> s = A<<a1>> s - A<<a2>> s,

(End of Net)

Note: the +*- on the LHS, inside <<...>> is a character. But on the RHS they become the normal arithmetic operations.

. . . . . . . . . ( end of section Definition of Semantic map A) <<Contents | Index>>

Recall: the +*- symbols on the LHS of an equation (inside <<...>>) are syntactic characters. On the RHS they are operations on integers.

Notice that the definition of A is compositional. Every syntactic form is defined in terms of its components.

Example 5

What is A<<x+1>> s when s x=3.
Net
1. A<<x+1>> s = A<<x>> s+A<<1>> s
2. = s x +A<<1>> s
3. = s x +N<<1>>
4. = s x + 1
5. = 3 + 1
6. = 4

(End of Net)
Note that the above is in rigorous detail... with practice you take many steps at one time... just as in algebra.

Notice that we did not have to define anything but s x to work out the value of x+1. Clearly this is an important property of the meaning of an expression in a clear programming language: It's value should not depend on variables that are not in the expression.

Example 6

Consider two approaches to adding a unary negation operator to the While language. If we define, for all expressins a:
3. A<< - a >> s = A<< 0 - a >> s

The above is not compositional (and proofs become harder as a result). Instead we prefer to write

4. A<< - a >> s = 0 - A<< a >> s

Of course they both lead to the same values under all circumstances in this case... but in general departing from compositional definitions can lead to meaningless (or difficult to handle) definitions.

Exercise 7

Prove that equations do give a well defined function A:Aexp->(State- >Int ).

In other words show that for each a:Aexp, s:State, there is only one i:Int with A<<a>> s=i. Use structural induction to do this.

Hint: How did we do the second step for N?

Equivalence of Arithmetic Expressions

Two arithmetic expressions b1,b2:Aexp are equivalent if they have the same semantics:

5. For a1,a2:Aexp, a1 equivalent a2::= (A<<a1>> =A<<a2>> ).

Note. Here we compare two functions State->Int for equality. Two functions are equal when they have the same domain and co-domain and the equal values for all arguments. So in this case, two expressions are equivalent if they always have the same values in every state:

6. (above) |- For a1,a2:Aexp, a1 equivalent a2 iff (for all s:State( A<<a1>>s=A<<a2>> s )).

. . . . . . . . . ( end of section Semantics of Expressions) <<Contents | Index>>

Semantics of Boolean Expressions

Here we define B:Bexp->(State->{tt,ff}) where tt and ff stand for true or false respectively. Notice that all that matters is that these are two different things. Think of them as enumerated data types in Pascal, Ada, C, or ML. However we don't go as far as C and muddle up tt and ff with 0 and 1!

Definition of Semantic Map B

1. B::Bexp->(State ->Int ) = following,
Net
For b1,b2,b:Bexp, s:State, x:Var, a1,a2:Aexp,
1. B<<true>> s = tt,
2. B<<false>> s=ff,
3. if A<<a1>> s=A<<a2>> s then B<<a1=a2>> s=tt,
4. if A<<a1>> s!=A<<a2>> s then B<<a1=a2>> s=ff,
5. if A<<a1>> s<=A<<a2>> s then B<<a1<=a2>> s=tt,
6. if A<<a1>> s>A<<a2>> s then B<<a1<=a2>> s=ff,
7. if B<<b>> s=tt then B<<~b>> s=ff,
8. if B<<b>> s=ff then B<<~b>> s=tt,
9. if B<<b1>> s=tt and B<<b2>> s=tt then B<<b1 /\ b2>> s=tt,
10. if B<<b1>> s=ff or B<<b2>> s=ff then B<<b1 /\ b2>> s=ff.

(End of Net)

. . . . . . . . . ( end of section Definition of Semantic Map B) <<Contents | Index>>

Exercise 8

Assume that s x = 3, determine B<< ~(x=1) >> s.

Exercise 9

Prove that the Definition of Semantic Map B above defines a well defined function: Bexp->(State->{tt,ff}).

Equivalence of Boolean Expressions

As above two boolean expressions b1,b2:Bexp are equivalent if they always have the same values in every possible state:

For b1,b2:Bexp, b1 equivalent b2 iff (for all s:State( B<<b1>>s=B<<b2>> s )).

Exercise 10

Extend Bexp

Suppose we extended Bexp to Bexpp where the abstract syntax of Bexpp is
1. Bexpp::=Bexp | Aexp != Aexp | Aexp >= Aexp | Aexp < Aexp | Aexp > Aexp | Bexpp \/ Bexpp | Bexpp => Bexpp | Bexpp <=> Bexpp.

( b1=>b2 should stand for b1 implies b2 in the sense of symbolic logic, and

2. b1 <=> b2 is the "iff" or "logical equivalence" operator).

Give compositional definitions for the semantics of Bexpp.

Same power

Prove that for every b1:Bexpp, there is at least one b2:Bexp such that b1 equivalent b2.

. . . . . . . . . ( end of section Exercise 10) <<Contents | Index>>

. . . . . . . . . ( end of section Semantics of Boolean Expressions) <<Contents | Index>>

Properties of A and B

There are two key properties of expressions that we will need later. First the value of an expression does not depend on the variables that are not in it. Second a change in a variable that occurs in an expression is reflected by a similar change inside an expression where it occurs. To express these we need some more formally defined ideas: Free Variables and Substitution.

Free Variables

Associated with any expression are the variables that appear in that expression. These are called free variables (there are also bound variables in more complex expressions, but they do not appear in our Aexp and Bexp). We formulate this as the map FV below:
1. FV::Aexp-> @(Var). -- FV maps each Aexp into a set of Var.
Net
For n:Num, x:Var, a1,a2:Aexp, op:{"+","-","*"},
1. FV(n) = {}, -- a number has no free variables -- the empty set.
2. FV(x) = {x}, -- a variable is a free variable
3. FV( a1 op a2) = FV(a1) | FV(a2). -- Union of two sets.

(End of Net)
For example
2. FV(x+1)={x} and FV(x+x*y)={x,y}.

Quick Exercise

Is the above definition compositional? Why/Why not?

Lemma 11

Let s1 and s2 be two states and a an arithmetic expression.

For two States s1, and s2, if for all x:FV(a), s1 x = s2 x then A<<a>> s1 = A<<a>> s2.

Extra Exercise

Define FV for Boolean Expressions

Exercise 12 (Essential)

Using your definitions show that this is true. Let s1 and s2 be two states and b an Boolean expression.

If for all x:FV(b), s1 x = s2 x, then B<<b>> s1 = B<<b>> s2.

Substitution

.Dangerous_Bend There is a serious abuse of notation coming up. Fasten seat belts, put your analyst on danger money, and relax.

We (like all good books on logic, meta-mathematics, symbolic processing, AI and LISP) need to define and study the effect of changing an expression a by taking a variable y in a and replacing it by another expression a0.

a[y+>a0]::Aexp=Result of replacing all y in a by a0.
Net

For n:Num, x:Var-{y}, a0,a1,a2:Aexp, op:{"+","-","*"},
1. n[y+>a0]= n, a number is not changed by substituting a variable.
2. y[y+>a0]= (a0), the variable is replaced by the expression.
3. x[y+>a0]= x, other variables are not changed.
4. ( a1 op a2)[y+>a0] = (a1)[y+>a0] op (a2)[y+>a0], change each part of the expression in parallel.

(End of Net)
For example
3. (x+1)[x+>3] = (x)[x+>3] + (1)[x+>3]
4. =3+1
5. (NOT 4!),

But for all states s: A<<(x+1)[x+>3]>> s = 4.

Indeed: substitution and assignment can have similar effects

6. A<<x+1>> ( s [x+>3] ) = 4

Another example

7. (x + y * x)[x+>(y-5)] = (y-5)+y*(y-5).

(Again no simplification!)

Quick Substitution Exercises

S1

Let s1=[x+>1, y+>2, z+>3] and s2=[x+>1, y+>4, z+>3], show that
1. s1[y+>4] = s2. Hint: When are two functions equal?

S2

Let s1=[x+>1, y+>2, z+>3] and s2=[x+>5, y+>2, z+>3], show that
2. s1[x+>(s1 y + s1 z)] = s2.

S3

Let s1=[x+>1, y+>2, z+>3], s2=[x+>1, y+>2, z+>4], and
3. s1[___+> s __+2] = s2. Fill in the blanks (_) above with variables(not numbers).

. . . . . . . . . ( end of section Quick Substitution Exercises) <<Contents | Index>>

Exercise 13

Prove that for all s1,s2,a,y,a0,...
8. A<< a[y+>a0] >> s1 = A<<a>> s2 where s2 = s1[y+>A<<a0>> s1].

Exercise 14

Define substitution for boolean expressions (Bexp): For b:Bexp, y:Var, a:Aexp, b[y+>a]::Bexp=result of replacing y by a0 throughout. Prove that your definition satisfies
9. B<< b[y+>a] >> s = B<<b>> ( s[y+>A<<a0>> ] ) for all states s.

. . . . . . . . . ( end of section Properties of A and B) <<Contents | Index>>

Semantics of Statements

Here is a simple -- non-compositional -- semantics defining (?) the meaning of statements is Stm, where

1. Stm= Var=Aexp | skip | Stm; Stm | if Bexp then Stm else Stm | while Bexp do S.

We do this by defining a map D that maps each Stm in to a change of State:

2. D <<S >> s =the next state after s if and only if the statement S terminates when started in state s.

This implies that D<<S>> is a partial map, indicated with a '<>->' arrow.

3. D::Stm->(State <>-> State), where following
Net
1. For S, S1, S2:Stm, s:State, b: Bexp.
2. D <<x:=a >> s = (s[x+> A <<a >> s]), Take the state and change x in the state.
3. D <<skip >> s = s.
4. D <<S1;S2>> s = D <<S2>>( D <<S1>> s).
5. D <<if b then S1 else S2>> s = if( B <<b >>s , D <<S1>>s , D <<S2>>s ).
6. D << while b do S >> s = if( B << b >>s , D<<while b do S>>( D << S >>s) , s).

(End of Net)

The above depends on the following definition:

4. if(p, a, b)::= the conditional function, if p is true then a else b.

Notes on D

The definition of D <<x:=a >> implies that if x=y then D <<x:=a >> s y = A <<a >> s but if not x=y then D <<x:=a >> s y = s y. Similarly for all variables x, D <<skip >> s x = s x.

The ";" rule defines a unique successor state if and only if both component are defined. Also note that the order of S1 and S2 are reversed on opposite sides of the definition.

The rule for while b do S is a natural rule and can be used to derive the meaning of a program that terminates. For example consider

5. factorial = "y:=1; while ~(x=1) do ( y:=y*x; x:=x-1)".

Let s = [x+>3], then
Net

1. D<<factorial>> s
2. = D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> o D<<y:=1> [x+>3]
3. = D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> [x+>3, y+>1]
4. = if( B<<~(x=1)>> [x+>3, y+>1], D<<while ~(x=1) do ( y:=y*x; x:=x-1)>>(D<<y:=y*x; x:=x-1>> [x+>3, y+>1]), [x+>3, y+>1])
5. = if( true, D<<while ~(x=1) do ( y:=y*x; x:=x-1)>>(D<<y:=y*x; x:=x-1>> [x+>3, y+>1]), [x+>3, y+>1])
6. = D<<while ~(x=1) do ( y:=y*x; x:=x-1)>>(D<<y:=y*x; x:=x-1>> [x+>3, y+>1])
7. = D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> [x+>2, y+>3]
8. = if( B<<~(x=1)[x+>2, y+>3], D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> (D<<y:=y*x; x:=x-1>>[x+>2, y+>3]),[x+>2, y+>3])
9. = if( true , D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> (D<<y:=y*x; x:=x-1>>[x+>2, y+>3]),[x+>2, y+>3])
10. = D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> (D<<y:=y*x; x:=x-1>>[x+>2, y+>3])
11. = D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> [x+>1, y+>6]
12. = if( B<<~(x=1)[x+>1, y+>6], D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> (D<<y:=y*x; x:=x-1>> [x+>1, y+>6]), [x+>1, y+>6])
13. = if( false, D<<while ~(x=1) do ( y:=y*x; x:=x-1)>> (D<<y:=y*x; x:=x-1>> [x+>1, y+>6]), [x+>1, y+>6])
14. = [x+>1, y+>6].

(End of Net)

The above definition can be used but is not compositional. The equation for while defines the meaning D << while b do S >> in terms of D << while b do S >>. So, if there is a non-terminating loop, then semantic equations, themselves, tend to not terminate!

If one tried the deraivation above with s = [x+> (-3)] for example, we keep generating new formulae, for ever. As a result it is not clear if a program containing a while is well-defined or not. It took Scott and Strachey's theory to fix this problem. They invented their theory of Domains, complete partial orders, and lattices etc. so that we would have a compositional semantics. They develope a abstract and elegant theory of the kinds of functions that are the meanings of programs. For this more complex Direct Denotational Semantics see [ denotational.html ] and [ fixed.html ] (under development).

. . . . . . . . . ( end of section Semantics of Statements) <<Contents | Index>>

. . . . . . . . . ( end of section The Language While) <<Contents | Index>>

Glossary

1. BNF::="Backus-Naur Form", for syntax and grammar, developed by Backus and Naur.
2. EBNF::="Extended " BNF.
3. HTML::= "HyperText Markup Language", used on the WWW.
5. Java::="An " OO " Language from Sun".
6. LISP::= "LISt Processing Language".
7. LRM::="Language Reference Manual".
8. OO::="Object-Oriented".
9. Prolog::="Programming in Logic".
10. TBA::="To Be Announced".
11. UML::="Unified Modeling Language".
12. URL::=Universal_Resource_Locator,
13. Universal_Resource_Locator::syntax= protocol ":" location, where
Net
1. protocol::= "http" | "ftp" | "mailto" | ... ,
2. location::= O( "//" host) O(pathname).

(End of Net)
14. WWW::= See http://www.csci.csusb.edu/dick/cs620/, index to web site for this class.
15. XBNF::="eXtreme" BNF, developed by the teacher from EBNF, designed to ASCII input of syntax, semantics, and other formal specifications.