Select this to skip to main content [CSUSB] >> [CNS] >> [Comp Sci Dept] >> [R J Botting] >> [CSci620] >> intro [Source]
[Index] [Schedule] [Syllabi] [Text] [Labs] [Projects] [Resources] [Search] [Grading]
Fri Apr 2 15:56:21 PST 2004


    Introduction to Semantics of Programming Languages


      Giving a formal or mathematical version of documentation reveals obscurities, ambiguities, complexities, and impossibilities in apparently clear documents. The formal documentation forms a reliable and clear basis for implementation and verification of the object documented. Mathematical documentation is also the basis for the analysis of the thing being documented and predicting its properties.

      These uses of general formal specification have proved true when used to define the syntax of programming languages. All programming language syntax are described by:

      However programming language semantics is not defined like this, research has not demonstrated that there is a uniquely better notation (like EBNF) and theory(context free grammars). Semantics is still a challenging area. It demands more complicated mathematical machinery than syntax. Indeed, of all areas of formal specification, programming languages need the deepest mathematics.

      In Semantics we take the syntax as given and attempt to write down rules that let us assign a "meaning" to syntactically correct pieces of a language. Typically we describe the syntax in a BNF-like form. The goal is then to use this to provide the structure the semantics as well. For example if we define binary numbers by this XBNF grammar

    1. Num::= "0" | "1" | Num "0" | Num "1"

      We would probably define the meaning as a map N from Num into the integers including 0:

    2. N::Num->{0,1,2,3...},
    3. N("0")=0,
    4. N("1")=1,
    5. N(n "0")= 2* N(n),
    6. N(n "1")= 2* N(n) +1. Notice that we have one semantic rule for each alternative syntax so we can be sure that we have defined a meaning for all numbers constructed according to the syntax of Num. As long as we trust the recursive rules to terminate.

      So for example:

    7. N("0101")=
    8. 2*N("010")+1=
    9. 2*(2*N("01"))+1=
    10. 2*(2*(2*N("0")+1)+1=
    11. 2*(2*(2*0)+1)+1= 2*(2*0+1)+1 = 2*(0+1)+1 = 2*1+1 = 3.

      Here are the main ways of defining semantics:

      Types of Semantics

        Denotational semantics

        The meanings assigned to constructs are modeled as mathematical objects that lie in a particular kind of "space". The objects represent the effect of

        Axiomatic Semantics

        Here specific properties of the effect of executing the constructs are expressed as assertions. This ignores even more details. The traditional form is to use Hoare Triples: {Before}Construct{After}. The Before condition is called a precondition and the After is a post-condition. Dijkstra has pioneered a newly popular Calculational approach where for each Construct and possible After you want the weakest (most general, loosest) precondition is defined.

        Operational Semantics

          The meaning of a construct is specified by the computation it induces when it is executed on a machine. It focusses on how the computation is produced. However this is (nowadays) in abstract and not on any real of theoretical machine.

          Structural Operational semantics

          This describes the meaning in terms of the step by step changes the initial and final states of the computation.

          Natural Operational semantics

          This is more abstract and in terms of the overall effect rather than the single steps.

        . . . . . . . . . ( end of section Operational Semantics) <<Contents | Index>> executing the construct however.

      . . . . . . . . . ( end of section Types of Semantics) <<Contents | Index>>

      Assignment and Sequence


      1. sequence::=statement | statement ";" sequence.
      2. statement::=assignment.
      3. assignment::= variable ":=" expression.
      4. expression::= variable | constant.

        Operational Semantics

          Informally we have:
          1. The state of the program is described by a function that records the value currently assigned to each variable.
          2. To execute a sequence, execute them one at a time from left to right.
          3. To execute an assignment, determine the value of the expression and attach it to the variable.
          4. The value of a constant is itself and the value of a variable is the value assigned to that variable

          (End of Net)

          Structural Operational semantics

          We use the following formula to express that S changes state s to get s':
        1. <S,s>=>s'.


        2. <S1,s1>=><S2,s2>

          shows that the first step of executing S1 in state s1 is to change state s2 and continue with S2 next.

          To reason out the effect of

        3. <S1;S2;S3..., s> For example:
        4. < z:=x; x:=y; y:=z, [ x+>5 | y+>7 | z+>0 ]>
        5. => < x:=y; y:=z, [ x+>5 | y+>7 | z+>5 ]>
        6. => < y:=z, [ x+>7 | y+>7 | z+>5 ]>
        7. => [ x+>7 | y+>5 | z+>5 ] This is often called "small-step" semantics. As you can see we, step by step, derive that
        8. < z:=x; x:=y; y:=z, [ x+>5 | y+>7 | z+>0 ]> =>^3 [ x+>7 | y+>5 | z+>5 ].

          To define these properly we will write down "axioms" like this: For v1,v2:variables, s:Variables->Values, < v1:=v2, s > => s[v1+> s v2].

          To define the idea of "execute, one at a time..." we need a derivation rule like this

        9. For S1:Statement,S2:Sequence, s1,s2,s3:Variables->Values,
        10. if <S1,s1>=>s2
        11. then <S1;S2, s1>=><S2, s2>. (meaning:"If S1 turns s1 into s2 in one step then S1;S2 applies S1 giving s2 and then executes S2 in that state. We will see the rest of these rules in later.

          Natural Operational semantics

          The so called "natural" semantics are natural to write down, not to use! The make us almost analyze the execution of the program backward from what we know to be the result.

          The notation is

        12. <S,s>->s' which mens that if S is started in state s then it terminates with a state of s'. So if we know that
        13. < z:=x; x:=y; y:=z, [ x+>5 | y+>7 | z+>0 ]> -> [ x+>7 | y+>5 | z+>5 ] then the rules (later) show us that the last step was y:=z and
        14. < z:=x; x:=y; [ x+>5 | y+>7 | z+>0 ]> -> [ x+>7 | y+>7 | z+>5 ]
        15. and
        16. < y:=z, [ x+>7 | y+>7 | z+>5 ]> -> [ x+>7 | y+>5 | z+>5 ] We can repeat the analysis another step... However we present the result of the analysis as a "Derivation Tree":
        17. <z:=x, s570> -> s575> <x:=y, s575> -> s775
        18. -------------------------------------------
        19. <z:=x; x:=y, s570> -> s775 <y:=z,s775>->s755
        20. -----------------------------------------------------------
        21. < z:=x; x:=y; y:=z, s570> -> s755 Note, that you need to abbreviate the state descriptions to make this fit on a page.

          Here each horizontal line is the result of applying one of the rules that appear as part of the "natural semantics" of sequences:

        22. For S1:Statement,S2:Sequence, s1,s2,s3:Variables->Values,
        23. if <S1,s1> -> s2 and <S2,s2> -> s3
        24. then <S1;S2, s1> -> s3.

        . . . . . . . . . ( end of section Operational Semantics) <<Contents | Index>>

        Denotational semantics

        In the denotational semantics of this simple language we define a function, piece by piece that describes the effect of S as a map
      5. D[[ s ]] : (Variables->Values)->(Variables->Values). So
      6. D[[ S ]] s = s' is true precisely when S terminates and changes state s into state s'.

        Typical denotational rules are D[[ v1 := v2 ]] s ::= s[v1+> s v2] and D[[S1; S2]] ::= D[[ S2 ]] o D[[ S1 ]] where 'o' indicates function composition:

      7. (f o g)x = f(g(x)). Notice that these are equations, rather than derivation (if_then) rules. In our previous example

      8. D[[z:=x; x:=y; y:=z]] [ x+>5 | y+>7 | z+>0 ]
      9. =D[[x:=y; y:=z]] (D[[z:=x]] [ x+>5 | y+>7 | z+>0 ])
      10. = D[[ x:=y; y:=z ]] [ x+>5 | y+>7 | z+>5 ]>
      11. = D[[ y:=z ]] D[[ x:=y]] [ x+>5 | y+>7 | z+>5 ]>
      12. = D[[ y:=z]] [ x+>7 | y+>7 | z+>5 ]>
      13. = [ x+>7 | y+>5 | z+>5 ] There is a certain elegance about this approach... however it means that when we model iteration we have to figure out the meaning of iterating functions before we can write down the equations.

        Axiomatic Semantics

        Axiomatic semantic was a very early idea - Floyd's "assigning meanings to flowcharts" and Hoare's Axioms are seminal work in the theory of programming and programming languages.

        Hoare Axiomatic Semantics

        Hoare uses a notation like:
      14. {Before}Construct{After} where the Before and After are statements that can be true or false and contain variables - predicates. For example, we expect the following to be true:
      15. { x=5 } y:=x { y=5 and x=5 } Indeed we have rules like: For all predicates P1,P2,P3 and statements S1, S2,
      16. if { P1 } S1 {P2} and {P2} S2 {P3}
      17. then {P1} S1;S2 {P3}. And (perhaps) axioms like For all values u,v,w and variables x,y,z,
      18. { x=v and z=w and y=u } y:=x {x=v and y=v and z=w}.

        Using these we can prove a general result - that the code

      19. z:=x; x:=y; y:=z swaps the values of x and y. As a triple this reads:
      20. {x=n and y=m} z:=x; x:=y; y:=z { x=m and y=n}.

        Proof 1

      21. {x=n and y=m} z:=x {x=n and y=m and z=n} -- Axiom
      22. {x=n and y=m and z=n} x:=y {x=m and y=m and z=n} -- Axiom
      23. {x=n and y=m} z:=x; x:=y {x=m and y=m and z=n} -- rule
      24. {x=m and y=m and z=n} y:=z {x=m and y=n and z=n} -- Axiom
      25. {x=n and y=m} z:=x; x:=y; y:=z {x=m and y=n and z=n} -- rule
      26. {x=n and y=m} z:=x; x:=y; y:=z {x=m and y=n} -- simplifying

        Weakest PreCondition Semantics

        This is (like denotational semantics) an attempt to develop a calculus for handling programming. The key idea is of a "Predicate Transformer" - a function that operates on Predicates to produce predicates. The most important of these is the "weakest precondition" transformer of a statement which tells you the least conditions needed to get a particular result. For example, To be sure that y=n after y:=x we must have that x=n. Further, i x=n then after y:=x we will have y=n. So
      27. wp[ y:=x ] (y=n) iff (x=n). A typical rule is:
      28. if wp[S1]P2 iff P1 and wp[S2]P3 iff P2
      29. then wp[S1;S2]P3 iff P1. (I use 'iff' between equal predicates, so we don't get it confused with '=' between variables and values etc)

        Using these you can show that if we want x=5 after z:=x; x:=y; y:=z then we must have y=5 first, and if y=5 first then x=5 after the sequence.

        Proof 2

      30. wp[y:=z](x=5) iff (x=5) -- axiom
      31. wp[x:=y](x=5) iff (y=5) -- axiom
      32. wp[z:=x](y=5) iff (y=5) -- axiom
      33. wp[x:=y;y:=z](x=5) iff (y=5) -- rule
      34. wp[z:=x; x:=y;y:=z](x=5) iff (y=5) -- rule

      . . . . . . . . . ( end of section Assignment and Sequence) <<Contents | Index>>


      Notice that these methods are not really rivals. They are different techniques for different purposes and (to some extent) apply well to different languages.

      To show this we apply each of them to the same small language (called While) and explore the ease with which different extensions can be specified. Further properties of each kind of semantics for While will be explored and the relationships between the techniques will be developed.

      Next -- A very simple language

      [ while.html ]

    . . . . . . . . . ( end of section Introduction to Semantics of Programming Languages) <<Contents | Index>>


  1. BNF::="Backus-Naur Form", for syntax and grammar, developed by Backus and Naur.
  2. EBNF::="Extended " BNF.
  3. HTML::= "HyperText Markup Language", used on the WWW.
  4. HTML_page::syntax= "<HTML>" head body.
  5. Java::="An " OO " Language from Sun".
  6. LISP::= "LISt Processing Language".
  7. LRM::="Language Reference Manual".
  8. OO::="Object-Oriented".
  9. Prolog::="Programming in Logic".
  10. TBA::="To Be Announced".
  11. UML::="Unified Modeling Language".
  12. URL::=Universal_Resource_Locator,
  13. Universal_Resource_Locator::syntax= protocol ":" location, where
    1. protocol::= "http" | "ftp" | "mailto" | ... ,
    2. location::= O( "//" host) O(pathname).

    (End of Net)
  14. WWW::= See, index to web site for this class.
  15. XBNF::="eXtreme" BNF, developed by the teacher from EBNF, designed to ASCII input of suntax, semantics, and other formal specifications.

Formulae and Definitions in Alphabetical Order