Skip to main contentCal State San Bernardino
>> [CNS] >> [Comp Sci Dept] >> [R J Botting] >> [papers] >> rjb93a.xbnf
[Index] [Contents] [Source] [Notation] [Copyright] [Comment] [Search ]
Wed Dec 20 09:09:28 PST 2006

This part of my site contains partly baked ideas (PBI). It has drafts for publications, essays, plus illustrations and visual aids for presentations and seminars.

Disclaimer. CSUSB and the CS Dept have no responsibility for the content of this page.

Copyright. Richard J. Botting ( Wed Dec 20 09:09:28 PST 2006 ). Permission is granted to quote and use this document as long as the source is acknowledged.


    How Far Can EBNF Stretch?

      .Author Dr. Richard J. Botting <>
      Computer Science Dept, CSUSB
      .Phone (909)-880-5327 .EMail .URL .Presented May 14th 1993



        Extensions to Backus Naur Form have been in use for nearly 30 years. EBNF is one of the few forms of documentation that is both formal and practical. It can be extended much further and becomes a paradigm for the formal documentation of requirements, designs, and specifications of all types of software.



          BNF became Extended BNF and is in all programming language standards and manuals. It is a most successful modern formal notation. This presentation examines how EBNF can be extended to help in other parts of software engineering. I start by listing similar forms that are already in use in non-linguistic domains.

          I define EBNF formally and then to explore the consequences of making small changes to the syntax and semantics of EBNF. This gives rise to a version called XBNF that has proved useful in teaching a programming languages course.

          We can define terms of other types: Sets, functions, relations, types, classes, ... Definitions become a way of naming. Documentation becomes a "first class object" for its users. The theory of the resulting system seems to need recently developed theories of types, semantics, etc[Example: Leeuwen

        1. 90].

          Relation to other work

          This is part of an experiment on inexpensive formal documentation. The working name for the language is MATHS [C Sci Seminar CSUSB 1992]. I have a graphical version called TEMPO [Faculty Research Seminar CSUSB 1992].

        . . . . . . . . . ( end of section Introduction) <<Contents | End>>

        Grammars & Dictionaries


          Bugs occur when ideas are not clearly understood - so good software depends on defining terms correctly and exactly. All projects need a collection of definitions. Call a set of definitions a dictionary . A BNF grammar is such a dictionary. BNF is a syntactically sugared context free grammar. EBNF has the same expressive power as BNF, but is easier to write because it permits regular expressions on the right hand side of a definition: BNF <number>::=<digit>|<number><digit> [Naur et al 64] EBNF number::= digit {digit} [ANSI 83 ADA] The '{...}' indicate 'zero or more of. In EBNF a series of non_terminal are defined in terms of a regular expression that represents a set of strings (a language). However, EBNF-like dictionaries turn up in all parts of software engineering theory and practice, see Table 1 below.

          Table 1. Applications

           Name	Elements	Defined_terms	Some Sources
           Grammar	Tokens|Lexemes	Syntactic units	Naur et al 64,...Ada 93
          .AS_is Dictionary Strings Tokens|lexemes Bell Labs 83(lex)
           Protocol	Messages	Communications protocols
          >As_is Zave 85, Naik & Sarikaya 92
              	Methods 	Objects		Botting 86a
           RFCs	Lines		Email Messages	Internet RFCs 80..93
           Dictiona	Data types 	Data files	De Marco 80, JSP
           StructureData types	I/O Patterns	Warnier/Orr, JSP,
           Path expressions 	Subprogram calls	Paths, S/SL
           	Entries names 	Ada task structureHemendinger 90
           Dynamic Dictionary	Events	Significant patternsKleene 1940s,DAD
           	Commands	System BehaviorsBelli & Grosspietsch 91
           ELH	Events	Life HistoriesRobinson 79, JSD,SSADM
           Logic Grammar	Simple Terms	PredicatesAbramson & Dahl 89
           Program Structure	C Function Calls	C Function DefinitionsBaecker & Marcus 90
           Language Theory	a..z	A..ZEilenberg 74,...THEORY


          The theory of EBNF has been the same for 30 years and is based on defining "string" and "sets of strings." Strings come from an empty string by concatenating symbols. I will show concatenation of strings by an exclamation mark (!). It is assumed to be an associative operation with a unit - the empty string "". Each string, by definition, has a unique sequence of atomic symbols, which when concatenated generate the original string. Temporarily we symbolize strings by lower case variables and sets of strings by upper case variables. Example strings will shown in the C notation (see later if necessary). We next define operations of union(|), and concatenation () on sets of strings by:
        1. A B::={c || c=a ! b and a in A and b in B}, -- the set of all c where c=a!b and a is in a and b is in B.
        2. A | B::={c || c in A or c in B}, A string is treated as a singleton set when it is in a context that needs a set of strings:
        3. For string c, c::sets_of_strings={c}. In this section I stands for the set of the empty string:
        4. I::sets_of_strings={""}. The definition of (#) takes more machinery. We could write,
        5. #A = I | A | A A | A A A | ... . but this formula is not formal enough to be used by a proof checked since ... is not defined. We could define the powers of a set (A^n) and then write an infinite union - but first we would have to define these ideas themselves. So we start from the property
        6. #A = I | A #A and define #A as the smallest solution to the equation X = I | A X:
        7. #A = the smallest { X || X = I | A X }.` Further Kleene showed that by repeatedly applying the operation that replaces X by ( I | A X) we get closer and closer to #A in the obvious sense. So (1) #A is the smallest value of X solving:
        8. X::=I | A X, (2) Repeated substitution generates successive approximations to #A.

          These give a paradigm for defining the meaning of any grammar. For instance, the grammar { L::= a L b | "". } has one definitionL::= a L b | ""., one defined_term "L" and two elements: "a", "b". We define the meaning of "L" to be M as the smallest set of strings such that the following equation is true:

        9. M = a M b | I. So
        10. M= {c || for some x in M (c =a ! x ! b)} | I. The above gives a recursive process testing whether a string lies in M. It does not tell us how to find elements in M. By Kleene's work M is also the smallest fixed-point of the mapping g, where
        11. g(X) = a X b | I. So to find M we use an iterative process to generate the Kleene Sequence:
        12. S(0) = {},
        13. for all natural_numbers n, S(n+1) = g( S(n) ). The series gives successive approximations to M:
        14. S(0)={},
        15. S(1)= I | a{}b = I ,
        16. S(2)= I | a I b = {"","ab"},
        17. S(3)= I | a{"", "ab"}b = {"", "ab", "aabb"},
        18. S(4)= I | a{"", "ab", "aabb"}b = {"", "ab", "aabb", "aaabbb"},
        19. S(5)= I | a{"", "ab", "aabb", "aaabbb"}b = {"", "ab", "aabb", "aaabbb", "aaaabbbb"},
        20. ... It is intuitively obvious that these sets are tending toward a limit of
        21. M = { a^n ! b^n || n=0,1,2,... } Indeed M is a fixed point of g since M = g(M). Any proper subset of M is not a fixed point. The reason that the Ss seem to get closer to M is that as i tends to infinity we have to look at longer and longer strings to find a string in M that is not in S(i). Using a limit to define the meaning of grammar has been around in the literature for a long time (a bibliography is in section 6.7 of chapter 3 of my unpublished monograph). The fixed point definition came later and is similar to Scott's denotational semantics [ACM 86]. Fixed points, iteration, and topological spaces appear to be a vital part of the theory of software.


          XBNF is EBNF with a notation which can be used with other mathematical concepts and so more power. XBNF uses white space, bars ("|") and the symbol
        22. "::=" as in BNF. The "<" and ">" of the original BNF are used as relations and as parts of arrows and crosses. Parentheses "()" are used as they are in algebra. It adds "#" to mean any number of, including zero thus letting the brace symbols of EBNF be used for mathematical expressions of sets. The Ada EBNF symbols "[]" are used to show subscripts and so are not used in XBNF. XBNF also has a standard dictionary that defines some simple functions to be used: Ada EBNF { x } is #( x ) in XBNF and x #(x) is written N( x ) . [x ] in Ada EBNF is O(x ) (optional x ) . Here is a comparison of BNF, Ada EBNF and MATHS: BNF <number>::=<digit>|<number><digit> [Naur et al 64] EBNF number::= digit {digit} [ANSI 83 ADA] XBNF number::= N(digit), where N(x)=x #(x ). [Botting 9x] We take over the notation from section 2.3 above and add intersection (&) and complementation (~) by:
        23. A & B::={c || cA and c B},
        24. A ~ B::={c || c A and not ( c B) }.

          XBNF in XBNF

          This section is an XBNF dictionary describing XBNF:
        25. XBNF_dictionary::= #( XBNF_definition | comment ). A dictionary has any number of definitions and comments.
        26. XBNF_definition::= defined_term "::=" XBNF_expression.
        27. A definition has a "::=" between the term being defined and the expression defining it.
        28. XBNF_expression::=set_expression. In XBNF we are interested in sets of sequences:
        29. set_expression::= item | union | intersection | sequence | phase | element. A set_expression is either an item, or a union, or, ...
        30. item::= element | defined_term | "(" set_expression ")". An item is either an element (defined below), or a defined term, or any set expression in parentheses,
        31. union::= alternative #( "|" alternative ). A union has at least one alternative. Alternatives in a union are separated by vertical bars.
        32. alternative::= complementary_form | sequence. Each alternative is either a complementary_form or a sequence.
        33. sequence::= #phase. A sequence is a number of phases. The "number sign" (ASCII code hex(23), hash, sharp) shows the occurrence of any number of the next item - including none. A sequence can therefore be empty, one phase, or two phases, etc..
        34. phase::= ("#" | ) item. A phase is part of a sequence where an item occurs once, or the same item can occur any number of times including no times at all.
        35. intersection::= item #("&" item ), The ampersand character("&") shows that all the descriptions (items) must hold at once.
        36. complementary_form::= item "~" item, A complementary_form - A~B say - describes the set of As that are not Bs. Here is an example:
        37. comment::= (#character )~definition. Any number characters (in a dictionary ) that are not a definition form a comment. This paragraph is a comment in this XBNF dictionary.

          In a lexical dictionary the following definition of element is used:

        38. element::=character_string | name_of_non_printing_character | function | natural_language_string,
        39. character_string::=quote #( char ~ (quote | backslash) | backslash char) quote,
        40. quote::="\"",
        41. backslash::="\\",
        42. name_of_unprintable_character::=capital_letter #( capital_letter | digit ). The names of unprintable characters of ASCII are well known (NUL,..., SP, CR, LF, ...DEL). In MATHS the character names used in Ada are preferred [silicon:/u/faculty/dick/cs320 /ada/lrm/* and silicon:/u/faculty/dick/cs320/math/comp.ascii].
        43. function::=identifier "(" List( arguments) ")", Simple functions defined by non-recursive definitions do not change the expressive power but makes a practical difference. The following are a standard functions and can used in any lexical or syntactic dictionary:
        44. For all sets_of_strings x,
        45. List(x)= x #( comma x),
        46. non(x)= #char ~ x,
        47. N(x)= x #(x),
        48. O(x)= (x|). Introducing the ability to define and use syntactic functions allows context sensitive features to be handled elegantly and efficiently... See below.

          Another standard kind of element is the identifier. An identifier does not start or end with an underscore - it is made of letters, digits and break characters, and must always start and end with either a letter or digit:

        49. identifier::= ( (letter | digit ) #id_character )
        50. & ( #id_character (letter | digit) ),
        51. id_character::=(letter | digit | underscore). Some identifiers are defined_terms:
        52. defined_term::= natural_identifier & defined |... . When the defined_terms are identifiers and the expressions have no intersections or complements then the dictionary is context free. The three dots (...) indicate that there a defined term can take other forma. A natural identifier is made up of words :
        53. natural_identifier::= word #(underline word)
        54. & (declared_identifier | defined_identifier | unknown_identifier). Words are either numbers or are made of letters. When made of letters they must be in a dictionary - either of a natural language or a special dictionary for the project,
        55. word::=number | ( (letter #letter) & in a dictionary ),
        56. number::=digit #digit,
        57. digit::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9". Rules for creating and using identifiers in general documentation are discussed in my monograph. Meanwhile note that it is common to have several names with one meaning - typically there is a mathematical variable or symbol (like "a", "+", "\λ", ...) which is used in formulae plus a longer natural_identifier .

          Often, a language is defined by two dictionaries - a lexicon that gives names to strings of symbols, and a syntactic dictionary or grammar. The grammar describes the structure of the language in terms of the tokens identified in the lexicon. In this presentation I combined both lexical and syntactic "shells" into a single presentation. Normally they would be split up into separate pieces of documentation - to do this I have experimented with a formalism that treats a dictionary as a module. This is described later. Meanwhile we will assume that the terms defined in the standard lexicon for the ASCII code can be used in future examples: thus terms like comma, r_paren, etc will be used in later syntax definitions. I would suggest that we extend EBNF to also be a glossary. A glossary definition is a definition in the form:

        58. term::=informal description. Syntactically define a glossary as a collection of these:
        59. glossary::=#( glossary_definition | comment )
        60. glossary_definition::= defined_term "::=" natural_language_string punctuation. So in a general piece of documentation a term may have two different definitions:
        61. natural_language_string::= back_quote #(non(back_quote)) back_quote,
        62. natural_language_string::=`A description in a natural language of the meaning and/or purpose of something`,
        63. back_quote::="`".

          A similar three level approach is useful in data processing applications - the analyst preparing a data dictionary that is equivalent to a lexical description of the data, the program designer working out the structures caused by logical and physical constraints, and a team thrashing out the glossary definition. It is possible and useful however to keep all definitions of a single term together rather than in separate files of sections of a document.

          Such a three way approach has been tried out in class. Any user with ~dick/bin in their PATH can query a data base of half-a-dozen language specifications written in XBNF:

        64. define language term -- print out relevant definitions of term
        65. lookup language lexeme -- print out definitions in which lexeme is used. I have invented a notation that allows dictionaries, grammars and glossaries to refer to each other. My define/lookup commands take account of this feature. The effect is that the C++ syntax file inherits definitions from the C syntax plus a glossary of object oriented terms. The syntax inherits the definitions of the ASCII characters and the functions and notation in a meta- definition of EBNF. define always searches a general glossary of computer science. This is all transparent to the user: Example: define c++ class [../c++/syntax] class::=....
        66. ... [./c/syntax] storage_class_specifier::=... . [../glossary.objects] class::=`A set of objects classified by some property or other. See classification`. Notice that the program extracts likely definitions (including mis-spelt terms) and relies on human intelligence to select the best match. I believe is humans being intelligent and machines doing the grunt work!

          Note on Precedence (Optional)

          Because union is defined as made of alternatives which, in turn, contain sequences we have:
        67. |-( a b | c d) = ( (a b) | (c d) ), The number sign ("#") always applies to the immediately following item and means "a number of (including none)". Similarly, (a | #b) indicates either an a or any number of bs, as does (#b|a). This, plus the precedence of spaces over vertical bars means that a #b c | #b describes a string with a single a followed by many bs and one c or else many bs. Parentheses (()) are put around unions within sequences, and iterations.
        68. (a | b) (c | d) = (a c | a d | b c | b d) reads "a sequence of a or b, end of sequence and then a sequence of c or d, end of sequence," Or informally
          "the first item is either an a or a b followed by either a c or a d." Similarly
          #(a | b) =any number of a s and b s in some order.
          #(a b) =any number of pairs of one a followed by one b. In #(a b) each a must be followed by a b. Thus #(a b) is proper subset of #(a | b):
        69. |-#(a b) =>> #(a | b).

          Effect of Intersections

          The theory of defining a language as the intersection of a finite number of context free languages was developed by Liu and Weiner in 1973. For practical examples see: the Turing project [Holt & Cordy 88]. , S/SL, Jackson's work, etc.[Hehner and Silverberg 83]. Warnier's notation permits intersections but they are only used to derive conditions in programs. Intersections let us define languages that are not context free:
        70. L::=P&Q,
        71. P::= #a B, B::= O(b B c),
        72. Q::=A #c, A::=O(a A b). Therefore
        73. |-B={b^m c^m || n=0..},
        74. |-P={a^n b^m c^m || n, m=0..},
        75. |-A={a^n b^n || n=0..},
        76. |-Q={a^n b^n c^m || n, m=0..},
        77. |-L={a^n b^n c^n || n=0..}. The last result shows that a dictionary with an intersection can describe a non-context free language. Since a complement of a union is the intersection of the complements the same effect can be got by permitting complementation and NO intersection. However this seems to be a useless restriction.

          Defining a language as the intersection of several grammatical constraints is neither new nor inefficient - Constrained Expressions are expressive and based on simple regular expressions [Dillon et al 86]. Hehner and Silverberg[83] use the term Communicating Grammars for a set of grammars with some terminals and non-terminals occurring in both. The model here is simpler - I propose the terms concurrent context free grammars (CCFG) for the dictionary model proposed here, and concurrent context free languages(CCFLs) for the family of languages defined by CCFGs. Hemendinger notes that Ada tasks have communication patterns that are CCFLs[Hemendinger 90]. Dictionaries with intersections define languages that are recognized by networks of parallel CFG acceptors. Psychologists use a network of "daemons" for modelling perceptual processes [ Human Information Processing, Lindsay & Norman, Pub Academic Press, 1970]. Finally: the hardest part of Jackson Structured Programming [JSP] is the use of semi-co-routines (in COBOL!) and to handle non-context- free problems by quasi-concurrency[Jackson 1975...].

          Effect of Functions

          Using functions in a grammar has the practical effect of letting us name and use short hand formulae for common constructions such as lists. However they permit the definition of context dependencies in a natural way. First we need a way to define a function and a way to express its image. The natural way to define simple functions is too allow the use a symbol like (_) as the place where the argument of the function must be written:
        78. List::= (_) #( comma (_) ), where List maps a set of strings into another set of strings.. The image of a function F can be written post(F) - the set produced after F is applied to all the elements in it domain
        79. post(F)::= {F(x) || for some x }. For example, if F is a map from the Positive_Integers into sets of strings defined by:
        80. F::=a^(_)b^(_)c^(_), then
        81. post(F) ={a^n b^n c^n || n=0..}. This technique leads to a way to express the context dependencies found programming languages. For example take the labelled loop in Ada. Suppose that loop_with_label is a function the maps identifiers into valid Ada loops:
        82. loop_with_label::= ( (_) ":" "loop" #( statement | "exit" (_) "when" condition ";" ) "end loop" (_) ";". (for example
        83. "SUM: loop x:=x+1/i; exit SUM when x>100; i:=i+1; end loop SUM"
        84. is a string in loop_with_label(SUM). ) To be made precise the statement in the Ada grammar would also need to an argument representing the set of valid labels in internal loops.

          Now, this is a context dependent set of strings and yet the definition shows us a simple and efficient way to parse (and check the syntax) loops: First, use CFG a parser for

        85. loop::= ( label ":" "loop" #( statement | "exit" label "when" condition ";" ) "end loop" label ";". and check for the reoccurrence of label . Clearly functions get us to the level of attribute grammars.


          In XBNF '@' means 'subsets of' so we can define a complicated feature of a language in stages, first declare that it is a subset of a context free language and later provide the specific rules.
        86. Ada_loop::@ (label ":" LOOP"
        87. #(statement|"EXIT"label"WHEN"condition";)
        88. "END LOOP" label ";"
        89. ).

        90. Here the "::" is split of from the "=" because it declares a new identifier but does not define a particular value. This let's us take complex specifications and requirements one simple assumption or condition at a time.

          Combining intersection with functions is interesting. Suppose that we have already defined declaration(V), usage(V), and no_ref(V) as the strings in a language which (1) declare or introduce V, (2) make use of V, and (3) don't refer to V respectively then:

        91. valid_wrt::=#no_ref(_) declaration(_) #(usage(_) | no_ref(_) ), describes all sequences of components where the argument is declared before it is used - and is NOT re-declared. If we now define
        92. valid::= &valid_wrt, where
        93. &F = { y || for all y (y in F(x) },
        94. then
        95. block::="begin" ( (statement # (";" statement )) & valid ) "end", defines a block with all its variables declared exactly once before they are used. This is a common context sensitivity in programming languages. Finally the '&'s in the definition imply that the parser would be structured as a number of parallel parsers, all with the same structure. This form concurrency is very easy to handle in both theory [Hoare 79 & 85] and in practice [Jackson 75, Andre Banatre & Routeau 81, Cameron 89].

          Definitions that include parameters are useful for defining and specifying maps and relations. Sets of definitions that include parameters are called Logical Grammars [ Abramson & Dahl 89 ]. A more general form of dictionary is the W-Grammar, or Two-level Grammar or Van-Wijngarten Grammar that contributed to the demise of Algol 68 yet is alive as a tool for documenting data structures[Gonnet & Tompa, Gonnet & Baeza-Yates 91]. The effect is that of a logic programming language [Ballance et al 90].

        . . . . . . . . . ( end of section Grammars and Dictionaries) <<Contents | End>>


          Dictionaries in Abstract

          Abstract models are easier to re-use than concrete ones. Any system that fits the theory of strings plus regular sets(A semiring with Kleene closure ) can use a similar kind of dictionary to those described above. This part of my presentation is a quick tour of such areas.


          Abstract models are easier to re-use than concrete ones. Any system that fits the theory of strings (a monoid) plus regular sets(A boolean algebra) with kleene closure can use dictionaries. The grammar can be put to work to describe files in terms of data in stead defining languages in terms of characters. The equivalence of a lexical dictionary is the analysts "data dictionary". The counterpart of the syntactic grammar is a Jackson or Warnier "data structure". All we need is an empty object and a concatenation operator. For example if T stood for a set of records and #T for files, then the empty string models the empty file (UNIX /dev/null) and concatenation models writing one file after another (UNIX: cat a b >c).


        1. magnetic_tape::=label #file end_of_tape_marker,
        2. file::=header #data_block sentinel.

          These define the magnetic tape's physical structure. If we were designing a program to process a file of student records we might describe a file sorted by sections as follows:

        3. file::=#section,
        4. section::=class_header #student,
        5. student::=A_student | student~A_student.

          Strictly speaking a definition of form A::= B | A~B is a theorem `|- A = B | A~B`.

          If the domain has the words "file", "section" and "student" we use a similar structure, even if the class header is absent and a student record identifies the student's class instead. The result is called a logical structure.

        6. file::=#section,
        7. section::=#student,
        8. student::=A_student | student~A_student.

          If the problem is to summarize the data on each section then we might show the output as:

        9. output::= header_page #section_summary college_summary, and note that each section corresponds to exactly one section_summary and the input corresponds to the output. It is easy to combine these into one abstract structure:
        10. consume_file_and_produce_output::= produce_header_page # consume_section_and_produce_section_summary produce_college_summary.
        11. consume_section_and_produce_section_summary::= # consume_student
        12. consume_student::=consume_A_student | consume_non_A_student. The structure above becomes a program when operations and conditions are added (see my monograph or any good JSP book).

          A data dictionary is free of semantic and pragmatic structures because a file can be used for many purposes and hence is parsed into different structures. Typically:

        13. free_file::=# record,
        14. record::=type1 | type2 | type3 | ...,

          Now if a,b,c,... are elements then

        15. F::=#(a | b | c | ... ),

          is the free semigroup or unstructured structure. In categorical terminology this is a free, initial, or universal repelling object in the category of semigroups with units. Free dictionaries are the weakest useful dictionaries. In consequence a design that has two or processes that communicate by consuming and producing strings(files) can interpret or structure the files as they wish. This is the powerful form of information hiding that is implicit in Jackson's work.

          Data dictionaries can show a non-programmer what a program is expected to do in their own terminology not computer jargon. The canonical grammar/dictionary /glossary shows the user or client that we understand all their terminology. Canonical diagrams and dictionaries show how the user's data is handled in terms that the users and clients understand. The test of a canonical dictionary is that clients and users certify it as complete and accurate. Canonical dictionaries are the strongest useful dictionaries.


        16. Functions (in the pure mathematical sense) are a part of most software systems. Because they are there we need some simple conventions that let them be defined very easily. The result is close to the MIRANDA language developed in the United Kingdom[ ??] . The key idea is a simplification of the λ notation. In the λ notation a variable is introduced to indicate the value to which the function is to be applied. For example if we use the key- word 'map' to indicate such an expression we might write:
        17. double::= map[x] (x+x).
        18. square::= map[x] (x*x).
        19. cube::=map[x] (x*square(x)).
        20. .... These imply
        21. cube(double(square(2)))=cube(double(4))=cube(8)=8*square(8)=8*(8*8)=8*6 4=512. However, simple maps can be written with a constant blank space symbol (_) instead of the variable x:
        22. double::= (_)+(_).
        23. square::= (_)*(_)
        24. cube::=(_)*square((_)). As an immediate application are the STANDARD definitions of the syntax functions mentioned in the previous section:
        25. #::=( (_) #((_)) |).
        26. L::= (_) #( comma (_)), L::=List of (_).
        27. P::= l_paren L((_)) r_paren, P::=Paranthesized list of (_).
        28. non::= char ~ (_).
        29. N::= (_) #((_)), N::=Many (_).
        30. O::= ((_)|), O::=Optional (_).

          So if we wrote

        31. some_map::=(_), then some_map(x) = x. So, (_) must represent the identity map. Extend this line of thought further. Putting any function in an expression where a value is expected and we get an expression for a function instead of an expression of a value. So if g maps into the domain of f then
        32. f(g) = map[x](f(g(x)). If 1st, 2nd, 3rd, 4th, ... ... are maps from n-tpls into their components then
        33. power::= (1st)^(2nd), implies that
        34. power(3,2) = (1st(3,2))^(2nd(3,2)) = 3^2 = 9.

          The result is a notation like the command line arguments in a C program, a UNIX shell script, ICL George 3 Macros, ... The same simple approach, starting with simple operations like successor (succ), predecessor(pred), and a constant zero(0) lets us develop the Peano arithmetic in EBNF_like definitions:

        35. +::= (2nd)=0; (1st) |(2nd)<>0; succ((1st))+pred((2nd)).
        36. *::= (2nd)=0; 0 | (2nd)=1; (1st) |(2nd)>1; (1st)+((1st)*pred(2nd)).
        37. ^::= (2nd)=0; 1 | (2nd)=1; (1st) |(2nd)>1; (1st)*((1st)^pred(2nd)).

          Here the semicolon acts like "if_then_": (;)=if (1st) then (2nd) but is actually the relational product described next.

          I have shown that BNF definitions can define functions in a natural way.

          Further if 1st and rest map a string into two parts so that x=1st(x)!rest(x), then

        38. reverse::=""+>"" | (strings~""); ( reverse((rest)) ! (1st) ).

          Binary Relations

          A relation R in @(T1,T2) can be defined in terms of a well formed formula W(x,y) by
        39. R::@(T1,T2)= W((1st), (2nd)). or any of the following forms
        40. For x:T1, y:T2, x R y::@= W(x,y). Given two binary relations R and S then For R: @(T1,T2), S:@(T2,T3),
        41. R; S::= for some z:T2 ((1st) R z and z S (2nd)),
        42. R; S::= The relation between x and y such that for some z, x R z S y,

        43. R | S::= (1st)R (2nd) or (1st) S (2nd),
        44. R | S::= `The relation between x and y such that either x R y or x S y`,

        45. R & S::= (1st)R (2nd) and (1st) S (2nd),
        46. R & S::= The relation between x and y such that both x R y and x S y.

        47. R ~ S::= (1st)R (2nd) and not (1st) S (2nd).
        48. R ~ S::= `The relation between x and y such that both x R y and not x S y`.

          /R is the converse relation to R. For example (_^2) is the 'square' function and /(_^2) is the square root relation, Similarly sin is the sin function and /sin includes the arcsin function. Further we want /parent_of=child_of. Therefore define:

        49. For R:@(T1,T2), /R::@(T2,T1)= (2nd) R (1st).

          In context, a set can also be used as a relation:

        50. For A:@T, A::@(T,T)=(1st) in A and (1st)=(2nd).

          Complex situations can be defined quickly using these notations in and EBNF- like way: Suppose that we have sets male and female, and a relation parent:

        51. parent::=(1st)is a parent of(2nd).
        52. ...
        53. child_of::=/parent.
        54. father::=male;parent.
        55. mother::=female;parent.
        56. son::=male;child_of.
        57. daughter::=female;child_of.
        58. grandfather::=father;parent.
        59. ...
        60. granddaughter::=daughter;child_of.
        61. grandson::=son;child_of.
        62. ...
        63. sibling::= (/parent;parent) ~ (=). -- I am not my own sibling!
        64. sister::=female;sibling.
        65. ...
        66. uncle::=brother;parent.
        67. ...
        68. niece::=daughter;sibling.
        69. first_cousin::=/parent;sibling;parent.
        70. second_cousin::=/parent;/parent;sibling;parent;parent.
        71. lineal_descendant::= child_of | lineal_descendent; child_of,
        72. ancestor::=parent | parent; ancestor
        73. cousin::=/parent;sibling;parent | /parent;cousin;parent,
        74. first_cousin_once_removed::=/parent;first_cousin | first_cousin;parent.
        75. [From Logic.Natural.KINSHIP]
        76. ...

          The last four definitions above are like productions in context free grammar. The definitions of ancestor and lineal_descendant illustrate the transitive closure or ancestral relation. It is so common that a special notation is defined:

        77. do(R)::= `The relation between x and y such that x=y, or x R y or x R;R y, or...`.

          Example:|-ancestor=parent; do(parent) and

        78. lineal_descendant=child_of; do(child_of).

          The defining property of do is that if something is invariant (or fixed, or closed) under R then it will not be changed by any number of Rs. This was originally in Whitehead and Russell's Principia the same form is in Floyd's loop invariant technique and reminiscent of Kleene closure in grammar theory. This definition relies on using higher order quantifiers[cf Guerevich 87].

        79. inv(R)::= the invariants of R
        80. inv::=the sets X such that if x in X and x (_) y then y in X. Example: families::=inv(parent).
        81. For R:@(X,X), F:@X, if F.R in @F then F in inv(R).
        82. For R:@(X,Y), F:inv(R), in @F.
        83. Example: in @families.

          Is a subset of">

          Note Is a subset of

          Normally X in @Y is written X==>Y and means that X is a subset of Y
        84. or is Y itself.
        85. For X,Y:Sets, X==>Y iff for all x:X ( x in Y). [ logic_30_Sets.html ]

          Life Histories

          EBNF is a usable context free grammar that uses notation equivalent to Kleene's Regular Expressions. Kleene developed Regular expressions as a way for documenting Events in Neural Networks. They were put to good use in both automata theory and grammar theory in 1950's and 1960's. What is not well known is that they have also been in use (from about 1975 onward) to document that possible patterns of events that effect an entity in a system [Kleene,DAD, Robinson 79, JSD,SSADM]. Since 1980 the behavior of communicating sequential processes has been a hot topic for research [ Hoare, Milner,....] . The formulations always seem to involve an XBNF like component to describe the behavior of objects and entities.

          Here is simple example: A button for an elevator can be pushed many times before an elevator finally services it:

        86. button::= # request. request::= pushed #pushed serviced.

          Another example: A stack accepts items that are pushed onto it and requests for items to be popped off it. When unempty it makes sense to request a copy of the top item one or more times.

        87. stack::= empty | push non_empty_stack pop.
        88. non_empty_stack::= #top | push non_empty_stack pop.

          Jackson has shown how such formula lead directly to the designs of software that behaves as specified[JSD, SSADM,...].


          Relations behave like the operations and conditions in a programming language because each operation in a program relates the current state to the set of possible future states. If there was always a single next state then the relation is deterministic, and by composing deterministic steps carefully we get an algorithm. In turn an algorithm can be coded for input to a suitable translator giving an executable program. Thus a relation describes a set of acceptable behaviors and the implementor selects a particular algorithm [compare with Kowalski, Hoare 87].

          MATHS relations can be written in a form that makes them look like statements in a programming language. Some syntactic sugar is defined for those who want to write while/for/loops/. . . . Here is an introduction. Formal treatment comes later when we have the formulae to handle it [Chapters 5].

          The key definition is that of a dynamic predicate. A predicate is a proposition that has free variables. A dynamic predicate has free variables with a prime (') attached to them. Here is an example:

        89. x'=x+y.

          After the statement, the value of x is equal to the sum of the old values of x and y. y is unchanged. It could be coded as x:=x+y in Ada/Pascal/... The rule is that "a prime means the new and no prime means the old" [Ross Ashby 56, Part II of Lam & Shankar 90, compare Hehner 84a & 84b, Guttag & Horning 91]. Formally x'=x+y defines a relation on a structured set with variables x and y.

        90. x'=1

          guarantees that x is 1 after the operation but no other variables can change during the operation.

        91. x'=1-x

          means that the value of x will be complemented - its new value will be the result of subtracting its old value from one while all other variables stay the same.

        92. x=1

          has no primed variable. It means that x has value one, and does not change - all variables are left untouched. It is a relation that holds between identical states when and only when x has value 1. It filters out all states when x is not 1. It allows computation under a condition. We call it a "Condition" therefore. It can be shown that it is a condition in the sense of Arbib and Manes [87].

          Here is a more general statement of the MATHS interpretation of a predicate that contains variables with primes. Consider a simple state space S which declares m variables x1, x2,...x[m] of types X1, X2, ...X[M]:

        93. S::=${x1::X1,x2::X2,...,x[m]::X[m]}.

          Let P(y1',y2',y3',...y[n]',x1,x2,x3,...x[m]) be a predicate with n distinct primed variables(y1, y2,...y[n]) which also may also appear without primes as xs. For example in x'=x+y, x1=x, x2=y, and y1=x.

          The relation defined by P(y1', y2', y3', ...y[n]', x1, x2, x3, ...x[m]) is found by:

        94. (1) Replacing primed variables (y1,...) by values in the new state(s2), and
        95. (2) Replacing non_primed variables(x1,...) by values in the old state(s1), and
        96. (3) Adding the requirement that the static (unprimed) variables do not change. So:
        97. P(y1',y2',y3',...y[n]',x1,x2,x3,...x[m])::=
        98. rel [s1:S, s2:S](
        99. P(s2.y1, s2.y2, ..., s2.y[n], s1.x1, s1.x2, ...,s1.x[m]
        100. and for all i:1..m where x[i] not in {y1, y2, ..., y[n]},
        101. s2.x[i]=s1.x[i]
        102. ).

          A more rigorous definition will be worked out in my monograph meanwhile here are some examples. To specify a process that solves two simultaneous linear equations we write

        103. a*x'+b*y'=c and d*x'+e*y'=f.

          To specify a process that solves a quadratic equation we write

        104. a*x'^2+b*x'+c=0.

          On the other hand

        105. a'*x^2+b'*x+c'=0,

          specifies that x is left alone and a, b, and c are adjusted to fit the equation.

          Program Structures

          Consider an expression that uses (do), (|), (;) from the calculus of relations to structure dynamic predicates, then it specifies, at a high level a program. For example:
        106. SUM::=(n'=1 and s'=0; do(n<>101; s'=s+n and n'=n+1); n=101).

          The definition for SUM looks like a program but is an expression in the calculus of relations and so has provable properties. First since the (+) is defined to be an operator on numbers, so the state space is ${n,s::Numbers}. Next define names for the parts of SUM:

        107. S::=(n'=1 and s'=0).
        108. L::=(s'=s+n and n'=n+1).
        109. E::=(n=101). W::=not E.
        110. |-SUM= (S; do(W; L); E).

          Now after E we know that n=101 (and s has its previous value), since

        111. post(E)=img(E)={ x :X || x.n=101}.

          Similarly after S we know that n=1 and s=0. Further not E by definition will not change the values of s and n. Next we look for something that does not change in L. Both s and n change - in parallel:

        112. |-L = ( (s',n')=(s+n,n+1) ). With a little creativity or experience we discover the following invariant of L:
        113. F::= ( s=1+2+3+4+...+(n-1)).
        114. |-F; L ==> F,
        115. |-F in inv(L).

          Similarly|-F in inv(W) and inv(E),
          so|-F in inv(W;L).
          Thus|-(F; do(W; L)) ==> F.

          Now|-S ==> F and F;E==>F.

        116. so
        117. |-SUM = (S; do(W; L); E) ==> F and n'=101

          Hence, after SUM, s= 1+2+3+4+...+(n-1) where n=101 = 5050.

          Notice that MATHS is not a programming language. So

        118. x (do(W; L)) y is not the same as
        119. x (do(W; L); E) y. in the first case y.n can have many values. In the second y.n will be 101. Similarly, when
        120. x ( i=1; i'=2 | i'=3 ) y

          permits y.i=2 even if i<>1, but if x.i=1 then y.i MUST be 2. Whatever happens y.i is either 2 or 3. Such non-determinism seems to be essential for a simple model of processes[Dijkstra 76, Hoare 78, Hoare 85, Gabrielian & Franklin 91].

          MATHS includes structured programs because For R:@(S,S),

        121. no(R)::= ((1st)=(2nd) and for no y((1st) R y) ::=The relation between x and x such that for no y does x R y
        122. For A:@S, R:@(S,S), while(A)(R)::=(do(A; R); no(A)),
        123. For I,J,K:@(S,S),R:@(S,S), for(I, J, K)(R)::=(I; while(J) ( (R); K ) ),
        124. For A:@S, R,T:@(S,S), (if A then R else T fi)::=(A; R | no A; T).

          We can, for example, when n<=m define:

        125. SUM(f, i, n, m, s)::=for(i'=n, i<=m, i'=i+1)(s'=s+f(i))
        126. and show
        127. SUM(f, i, n, m+1, s) = SUM(f, i, n, m, t) and s=t+f(m+1).

          Using Chapter 4.Section 8,

        128. SUM(f, i, n, m, s)=for(i'=n, i<=m, i:+1 )( s:+f(i) )

          MATHS even has an extension of Dijkstra's Guarded Command. Any group of MATHS users can document and then use its own set of abbreviations. For example:

        129. ADA_PDL::=
          1. For A,B,C:@(S,S), loop A; exit when B; C; end loop::= (do ( A; no(B); C); (A; B) ).
          2. ...


          MATHS can express mixtures of conditions and actions in one relation. A well known enigmatic program is often written:

        130. Enigma::= ( while(n>1) (
        131. if n in even then n'=n/2
        132. else n'=3n+1
        133. fi
        134. ) ).

          In MATHS it can be transformed to read:

        135. |-Enigma=while(n>1)( 2*n'=n | 2*n'=3*n+1 ).

          MATHS makes the derivation of programs more logical: (1) state the properties of the changed variables as a set of equations and then (2) solve for the primed variables. Similarly, (1) give non-deterministic relationships and reduces them to a form that is compatible with a declarative programming system such as Prolog [Compare Lamport 87].

          Finally, MATHS relational expressions follow the calculus of relations[Whitehead & Russell]which can be used to simplify sequences into disjunctive normal form, like the following:

        136. R = A1 and A2 and A3 and... or B1 and B2 and... or... Typical rules are
        137. |-R;(S|T) = (R;S) | (R;T)
        138. |-(P(x, y); Q(x, y, y')) = (P(x, y) and Q(x, y, y'))
        139. |-(Q(x, y); x'=e(x, y) ) = (x'=e(x, y) and Q(x , y))
        140. |-(x'=e(x, y); Q(x, y, y')) = (x'=e(x, y) and Q(e(x, y), y, y'))


          There is no advantage in using mathematics only as a program design language. The big pay-offs will occur by introducing formal methods where the errors are made - early in the process. Most errors are made in the analysis of requirements - misunderstanding, ignorance, ambiguity, and jumping to conclusions. Mathematics is best used recording ideas and defining terms throughout a project. For example it is not useful to a client to just quote specification and algorithm:

        141. euclid is a program that calculates the greatest common denominator(g):
        142. euclid(x,y,g)::=(x=y=g' | x<y; euclid(y,x,g') | x>y; euclid(x-y,y,g') ).
        143. [transcribed from a Prolog text - Schnupp & Bernhard 87, p36]

          Shorter forms are worse [Knuth et al 89]:

        144. gcd::=map[x,y] (x=y+>x | x<y +> gcd(x,y) | x>y +> gcd(x-y, y)). The better way is to define why and then derive what to do from the Why. Z lets you gather a set of declarations and axioms into a schema. I have experimented with using an EBNF like definition to name a set of declarations, definitions, comments, and assertions. The documentation is put between matching "Net{" and "}" strings. The result can be named and so reused:
        145. GCD::=
            Theory of the greatest common denominator of a set of positive integers.
          1. Uses Number_Theory.
          2. |-Nat::=Natural. |-Nat0::=Positive. |-Nat::=Nat|{0}. -- Definitions in Number_Theory |-For x,y :Nat0, x divides y ::= for some f:Nat0(x*f=y),
          3. factors(y)::= {f:Nat0 || f divides y},
          4. multiples(x)::={y:Nat0 || x divides y}. -- Definitions in Number_Theory
          5. ...
          6. |-(1) for x,y,z,a,b:Nat,
          7. if x and y divides z then a*x+b*y divides z.
          8. ... (def gcd) gcd::nAT><Nat->Nat=greatest(factors(1st) & factors(2nd)),
          9. ...
          10. |-for x,y, gcd(x,y)=gcd(y,x) { ...(&) in commutative(@Nat0) ...}.
          11. |-for x,y, if x>y then gcd(x,y)=gcd(x-y,y).
          12. ...
          13. euclid::= (do( x<y; x'=y and y'=x | x>y; x'=x-y ); x=y=g').

          14. |-for x,y,g:Nat, (x=>x,y=>y,g=>g) euclid (x=>x', y=>y', g=>gcd(x,y)).
          15. ... }=::GCD

            GCD can then be referred to by Uses GCD in other pieces of documentation.

            An abstract specification permits innovative but correct implementations to be invented. For example, if one has to evaluate a many GCD's then Euclid's algorithm is clearly inefficient, and remains so when coded iteratively. For example, when gathering statistics on the GCD's of all pairs of numbers in the range 1..2^p as p+>oo I found that the iterative version of Euclid's algorithm recalculates the same values many times. Backtracking to the definition in above (GCD(def gcd) ) we have: (GCD(def gcd) )|- For x,y, gcd(x,y)::=greatest(factors(x) & factors(y)).

            Therefore: gcd in (Nat >< Nat)->Nat. Now ((Nat >< Nat)->Nat) is monomorphic to @(Nat, Nat, Nat). So gcd can be a table, a data base, a data structure, or even a hybrid of data and algorithm:

          16. With T::@(Nat, Nat, Nat) define
          17. euclid_plus::= ( x0'=x; y0'=y;
          18. do( no (x, y, g') in T;
          19. ( x<y; x'=y and y'=x
          20. | x>y; x'=x-y
          21. )
          22. );
          23. ( x=y; g'=x; T:|{(x0, y0, g)} --(put gcd in T)
          24. | (x,y,g') in T --(find gcd in T)
          25. ) ) The form of euclid_plus is
          26. LOOP::=do ( no A ; ( x<y; b | x>y; c) ); ( A | x=y; d ) where A=((x,y,g') in T), and b, c, d are all total functions. So
          27. |-LOOP=do ( no A ; (x<>y); ( x<y; b | x>y; c) ); ( A | x=y; d )
          28. =do ( (x<>y) and no A ; ( x<y; b | x>y; c) ); ( A | x=y; d ). So LOOP is a Zahn loop [Zahn 69]. So if`T ==>{ (x, y, gcd(x,y) ) || x,y::Nat}` is true before each call then it is also true after the call. It is an invariant of Euclid_plus. Consider T={} before euclid_plus:
          29. T={}; euclid_plus =( x0'=x; y0'=y; do( x<y; x'=y and y'=x | x>y; x'=x- y ); x=y; g'=x; T:|{(x0, y0, g)} )) and so ends up with (x0,y0,gcd(x0,y0)) being placed in T. By induction, euclid_plus works when T is empty initially and encapsulated. When Euclid_plus is coded, a representation of T will be chosen and a limit will be put on the size of T. However Euclid_plus works if `T:|{(x0, y0, g)} is changed to T'==>T|{(x0, y0, g)}`. Code is often a finite approximation to a design. Other ADTs, classes, and objects have defined.

            Requirements and Domain Analysis

            XBNF can be extended or embedded into an ASCII based language that I have called MATHS. MATHS lets terms of any type to be defined - bits, numbers, functions, specifications of programs, ... . To make this work we need an extended theory of the meaning of a set of definitions. MATHS assumes that each definition adds a variable and also adds an axiom:

          30. t::= e means
          31. t::type(e). t=e. MATHS use the Pascal/Ada syntax:
          32. term "::" Type. Therefore we will need develop a meaning for such declarations.

            In general I have been studying packages of declarations and well_formed_formula. Syntactically:

          33. package::="Net{" #( #(label) (declaration | definition | well_formed_formula| ...|comment ) ) "}" For example:
              0::S. +::S><S->S. For x,y,z:S(x+(y+z)=(x+y)+z). For all x:S( x+0=0+x=x). }

              Nets are collections of assumptions not programs. Associated with a set of declarations is an abstract space. A declaration adds a dimension to the space and an equality removes a dimension. In consequence a definition that does not involve recursion leaves the system unchanged. Some recursive forms invalidate their universe but others(for example LISP) has a BCNF which is not recursive form.

              MATHS borrows an idea from the Ada package and the Z schema: networks can be named.. The original BNF notation is appropriate:

            1. NAME::=package. For example:
            2. MONOID::=
              1. 0::S.
              2. +::S><S->S.
              3. (assoc): For x,y,z:S(x+(y+z)=(x+y)+z).
              4. (zero): For all x:S( x+0=0+x=x).
              5. }.

                Nets are not interpreted as logic program. In many ways they can thought to define algebraic systems rather than a program,... indeed in most of these system the Unification Algorithm is not a valid deduction. MATHS provides the ability to show that the uncertain statements must follow from less doubtful ones. A special sign (|-) shows that the formula can be derived from previous formulae. A complete inference engine or automatic theorem prover will be an uneconomic investment since the engineer can work interactively with a simpler proof checker. A proof typically has a Natural Deduction form like this:

              6. MONOID.
              7. |-For all z:S, if for all x:S(x+z=x) then z=0
              8. Let{ (1) z:S. (2) for all x:S(x+z=x).
              9. (2, x=>0)|-(3) 0+z=0,
              10. (zero)|-4) 0+z=z
              11. (3, 4, Euclid)|-z=0.
              12. }. I have developed a calculus for constructing packages of documentation. They can be used in many ways - sets of object, proofs, data description, logistic systems, notes, ...

                Thus EBNF has become a way to document mathematical and logical systems:

              13. PEANO::=
                1. Natural_numbers::Sets,
                2. succ::Natural_numbers->Natural_Numbers,
                3. 0::Natural_numbers.
                4. pred::(Natural_numbers~{0})->Natural_Numbers,

                5. (Axiom 1): pred(succ)=(_).
                6. (Axiom 2): For all n:Natural_numbers~{0} ( succ(pred(n)) = n ).
                7. (Axiom 3): For no n:Natural_numbers ( succ(n)=0 ).
                8. ...
                9. 1::= succ(0).
                10. <::= ((1st)=0) & ~( (2nd)= 0) | pred(1st)<pred(2nd).
                11. |-ORDER( Set=> Natural_numbers, Relation=>(<) ).
                12. >::= (2nd)<(1st).
                13. <=::= <|=.
                14. >=::= >|=.
                15. ...
                16. (Axiom 4): For all S:@Natural_numbers, one n:S, all m:S ( n <= m).
                17. ...
                18. +::= (2nd)=0; (1st) | ~((2nd)=0); succ((1st))+pred((2nd)).
                19. *::= (2nd)=0; 0 | (2nd)=1; (1st) | (2nd)>1; (1st)+((1st)*pred(2nd)).
                20. ^::= (2nd)=0; 1 | (2nd)=1; (1st) | (2nd)>1; (1st)*((1st)^pred(2nd)).
                21. ...
                Once named and filed these systems can be referred to in other documents. For example, a group inherits the structure of a semigroup or monoid:
              14. GROUP::=
                1. (monoid): MONOID.
                2. m::=Inversion operator::S->S.
                3. (inverse): For all x:S( x+m(x)=0 ).


                The effect is to formalize what mathematicians have always done and simultaneously introduce inheritance into computer documentation. I have converted most of my notes on algebra and topology into this form on a 7K floppy disk.

                Here is an example of a practical application based on a recent publication: [Snyder 93, Alan Snyder<>, The Essence of Objects: Concepts and Terms, IEEE Software V10n1(Jan 93)pp31-42]

              15. OBJECTS_0::=
                  All objects embody an abstraction. Objects provide services. Clients issue requests. Objects are encapsulated. Requests can identify objects. New objects can be created. Operations can be generic. Objects can be classified in terms of their services. Objects can have common implementations. Objects can share partial implementations.


                We proceed by abstracting noun phrases from these sentences and replacing them by sets and relations:

              16. OBJECTS_1::=
                  Objects, Abstractions, Services:: Sets.
                1. embody::Objects>->Abstractions.
                2. provides::@(Objects, Services),
                  provides=(1st) provides (2nd). ...


              . . . . . . . . . ( end of section Applications) <<Contents | End>>


              By extending EBNF to XBNF and then to MATHS we get a powerful way to document the domain in which software is or will be operating. "An application domain system model of the equipment and its intended environment of use is defined. This is a dynamic system model, and defines the overall states of the system as functions of time. Requirements are constraints on this model" [Hansen Ravn & Rischel 91 ]. Simple mathematics - sets, mappings, relations, definitions, assertions, etc.-

            3. can define important ideas. Some details are in the Annex.

              MATHS is more abstract than any other computer language. It is designed to be too powerful for any conceivable computer to use as a programming language because the language used at the sharp end of a project must be able to define problems well enough so that we can recognize that they are (1) hard, or (2) not computable. Problems of efficiency, computability, data vs algorithms, etc. are handled as part of the implementation. The MATHS description of the problem and the specification of a requirement should be independent of all implementations and technology. Ignoring efficiency etc. is difficult at first. It is essential to describe the purpose and qualities of something before choosing or testing techniques. Hence:

            4. Formulae are descriptive, clear, and correct. Efficiency comes later.
            5. Formulae specify results and properties but do not over determine an algorithm or data structure.
            6. Some formula look like programs but they often take intelligence to interpret and/or code them.
            7. Some formula can be implemented as data or program.

              MATHS & XBNF help solve several outstanding problems in the software process. First, we can show how the user's data is handled in terms that they understand. We can guarantee that this understanding is accurately modelled in the software by using rigorous design techniques. Second, Strachey pointed out the need to discover data structures before attempting functional design [Strachey 66]. The sets and maps found by formal analysis connect facts to conceptual data structures. Standard data base and data structures techniques can then implement the conceptual solution to the problem.

              Projects that use MATHS can "zero-in" on the best software. First: Define things that appear in the description of the problem and (1) are outside the software, (2) interact with the software, and (3) are individually identifiable by the software. Second: Define events, entities, identifiers, relationships, attributes, facts, and patterns. These steps are an exercise in stating the obvious. Facts are recorded and verified by experts and by observation. Simple facts lead to validation of algorithms, data structures, and data bases.

              If a system interacts with another system then it will have an internal model of the external system. An abstract model of the external system is a model for the internal system and so a structure for a design. Computer Science already teaches the techniques to implement the designs plus the calculus of sets, maps, and relations. So software design can be based on science. There is also a natural graphic representation (TEMPO). This is a basis for Software Engineering.

              Annex3 Appendix


                  Place in the Software Process
                  This summarizes the notation (MATHS) I have developed for encoding discrete mathematics in ASCII so that software engineers don't have to make an investment in new hardware and software to do formal methods. It was developed to fit with notations that are already in use and to take advantage of programming language knowhow whenever possible. A formal definition has been developed and has been submitted to a series of publishers. I keep an online summary, dictionary etc in a directory called "math" in my home directory (~dick/math or ~rbotting/math).

                  Formal Analysis proceeds by stating the obvious and so certain truths. Simple facts often delimit the set of logically correct structures. Grammars and dictionaries are used to document the data and dynamics of a domain. Relations can specify processes in a structured dictionary. Sets and maps define entity types, abstract data types, and classes of objects of the problem domain and are used to document requirements and specifications. When implemented, maps become paths through the data structures and data bases. So they define the structure shared by solution and problem.

                  Project documentation also describes the physical constraints caused by the technology used. Again there are two extremes: free and inconsistent. Some where between them is the best level. This records every assumption that can be made about the implementation technology. It is the canonical documentation of the technology being used when the software engineers are its clients and users.

                  Levels of Documentation
                  Some documentation says too little, and some says too much. Intuitively we have a category of logical systems. In this category of documentation we can distinguish two extremes. There is a free, initial, or universal repelling object in the category of documented objects. Free or initial documentation is the weakest useful documentation. At the other extreme there are final, inconsistent, or universally attractive documentation - this overdefines the situation to the point where it can not exist. In between there is the level of documentation that reflects exactly what the client understands the situation to be. I call this the canonical documentation. It forms the cannon or Bible for the project. It shows the user or client that we understand all their terminology. Canonical documentation shows the user's world in terms that the users and clients understand. The test of canonical documentation is that clients and users certify it as complete and accurate. Canonical documents are the strongest useful conceptual documentation.

                  This annex summarizes the notation I have developed to express discrete mathematics in ASCII.

                . . . . . . . . . ( end of section Introduction) <<Contents | End>>

                Propositions and Predicates

                A proposition can have two possible meanings - true and false. In MATHS the symbol @ represents the Boolean type with two standard values {true, false}. The following table defines the operators.

                P true false Syntax Type not P false true prefix @->@

                P true true false false Q true false true false Syntax Type P and Q true false false false infix(serial) @><@->@ P or Q true true true false infix(serial) @><@->@ P iff Q true false false true infix(parallel) @><@->@ if P then Q true false true true natural @><@->@ Unlike most programming languages, mathematics has two kinds of infix operator:

              1. Serial: P1 and P2 and P3 = (P1 and P2) and P3
              2. Parallel: P1 iff P2 iff P3 = (P1 iff P2) and (P2 iff P3).

                A predicate is a proposition that can have variables and quantifiers. MATHS spells quantifiers out and includes numerical quantifiers like "one", "no", "0..1" as well as "all" and "some". The syntax attempts to make the writing of formal statements as natural as possible. In the following X is a set or a type, xis a variable and W(x) a proposition containing x in a number of places. for all x:X(W(x))::=For all x of type X, W(x) is true, for x:X(W(x))::= for all x:X(W(x)), for no x:X(W(x))::=for all x:X(not W(x)), for some x:X(W(x))::=not for no x:X(W(x)). for 0 x:X(W(x))::= for no x:X(W(x) ), for 1 x:X(W(x))::= for some x:X( W(x) and for 0 y:X(x<>y and W(x)) ), for 2 x:X(W(x))::= for some x:X( W(x) and for 1 y:X(x<>y and W(y)) ). for 3 x:X(W(x))::= for some x:X( W(x) and for 2 y:X(x<>y and W(y)) ).

              3. . . . for 0..1 x:X(W(x))::=for all y,z:X( if W(y) and W(z) then y=z),
              4. . . . Equality is a special predicate that is defined for all universes of discourse. It is symbolized by "=". "<>" is (as in Pascal and BASIC) used for inequality. Equality is a serial operator on objects in each universe. A type is defined to a maximal collection of objects for which the lower predicate calculus with equality can be applied.


                For any type T there is an another type symbolized by @T with objects that are sets of objects of type T. In MATHS [unlike Pascal but as in Ada] a subset of a type is not a type, but a set. Each set is made of elements of one type and there type is determined by the type that the set does belong to. If a set A is of type @T then As elements must be of type T. It is therefore unambiguous to use sets to declare the type of a new variable [ see Brinch Hansen's critique of Pascal in the Edison Papers]. In MATHS the symbol for the type is also a symbol for the universal set of that type. It is also used in expressions to indicate the type of a value - if e can have values of different types then T(e)=(e).T is the value in T.
              5. Expression Meaning Type Example. Sets {} null or empty set Ambiguous Numbers < 1 and >2
              6. T universal set @T Numbers
              7. {e1} singleton @T (e is of type T) {1}
              8. {e1,e2,e3,...} extension @T (e1,e2,...are type T) {1,2,3}
              9. {x:T || W(x) } intention @T {n:Nat||n*n=4}={2} ${D} Set satisfying documentation D ${x,y::Real. x^2+y^2=1.0}
              10. N Set satisfying documentation Named N MONOID

              11. S1 & S2 intersection @T><@T -> @T ..n&m..=n..m
              12. S1 | S2 union @T><@T -> @T ..n-1 | m+1..
              13. S1 ~ S2 complement @T><@T -> @T = Nat ~(n..m)

                Predicates e1 in S1 membership T><@T -> @ 2 in 1..

              14. S1<> S2 set inequality @T><@T -> @ 1..<>0..
              15. S1 = S2 set equality @T><@T -> @ 1..2={1,2}
              16. S1==>S2 subset or equal @T><@T -> @ 1..2==>1..3
              17. S1=>>S2 proper subset @T><@T -> @ 1..2=>>1..3
              18. S1 >==S2 S1 partitions into S2 @T><@T -> @ {1,2,3}>==
              19. {{1,2},{3}}
              20. S1 non-empty set @ (overloaded)

                Short hand S1 are S2::=S1==>S2, x is in S2::=x in S2.

                We can use a set anywhere a type could be used, for example (when A:@T): for all x:A(W)::=for all x:T (if x in A then (W)) In English and mathematics the common idiom puts a type name before the variable - for example:

              21. For all real x>0 there are two real y's such that y^2=x, or in MATHS
              22. for all Real x where x>0, 2 Real y where y^2=x, meaning
              23. for all x:Real (if x>0 then for 2 y:Real (y^2=x) ).

                Another useful form is like There are two elevators and five floors. MATHS form is 2 Elevators and 5 Floors. The general form is when Q is any quantifier(all,some,none, 1, 2...) or number and A:@T,

              24. Q A::= for Q x:T (x in A). Thus|-for all Sets A, no A iff A={}.
              25. |-1 A iff for some a:T(A={a}). . . .

                Lists, strings and n-tupls

                MATHS has n-tupls and lists. They are modeled as maps from subsets of the integers.
              26. n..m::@Integer={i:Integer || n<=i<=m}.
              27. For T:Types, %T::= { x:I->T || for some n,m:Integer, I=n..m}, For each Type T MATHS defines maps 1st, 2nd, 3rd,, ...,,
                ... last from %T into T that have the obvious meanings; plus a 'prefix/postfix' operation(?) that generates all lists/n-tupls:
              28. For t:T, l:%T, t?l and l?t in %T.
              29. For t:T, l:%T, t=1st(t?l) and l= rest(t?l) and t=last(l?t) and l=front(l?t).
              30. |-(1st)?(rest) = (_) = (front) ? (last).
              31. head::=1st, tail::=2nd. %%T, are lists of lists not lists of elements:
              32. ((1), (1,2), (1,2,3)) is a different type of object to (1,2,3). Strings are also part of the MATHS - #T is the type of object generated by concatenating objects of type T together. They inherit the operations, properties, and predicates acting on lists. In addition
              33. for any x,y:#T, x ! y in #T and (!) is the free associative operation with a unit () on #T. So strings=#char and files=#record, etc. If T is the type of elementary objects then #T is the "free" semigroup. Unlike lists, ##T=#T. Here is a sample of the kind of definitions are possible using MATHS:
              34. For t:T, x:#T, A:@T, a:A, b:T~A, filter(A, ())::=(), filter(A, (a!x))::=a!filter(A,x), filter(A, (b!x))::=filter(A,x), q(())::=(), q((t))::=(t), q(t!x)::=q(filter(/<(t), x) ) ! filter({t}, x) ! q(filter(/>(t), x) ). The function q is from Wainright's Miranda examples[page 148 in Wainwright 92].

                Classes & Entities

                All modern formal notations from Ada to Z provide means for creating a new type of object by (1) listing components and (2)specifying constraints and operations on these components. In MATHS a collection of declarations defines a set of labeled tupls. If D is a collection of declarations, definitions, axioms, theorems and comments then Net{D} is called a set of documentation and Net{D}=${D} is the set or class of objects that satisfy D.

                The complete definition of MATHS structures will be delayed until chapter 5. Here are some examples of tupls, sets of tupls and sets of sets of tuples. We can think of these as records, files and data bases respectively. Or we can think of them as objects, classes, and architectures respectively. In C we might implement these as structs, arrays and files of code.

                Tupls look like this: (a=>xa, b=>xb, ...). A tuple (a=>xa, b=>xb, ...) has a type defined by one of the following forms:

              35. S::=${ a::A, b::B, c::C, ... }, or S::=Net{ a::A, b::B, c::C, ... }, or N::=Net{ a::A, b::B, c::C, ... }, S::=N. or ==============================


              36. S::=N.
              37. N::=
                1. b::B,
                2. c::C,
                3. ... }=::N ==============================


                4. phones::=${name::Strings, number::Strings}.
                5. |-(name=>"John Smith", number=>"714-123-5678") in phones. Each label in a MATHS structure can be used to extract components from tupls:
                6. (name=>"John Smith", number=>"714-123-5678").name="John Smith"
                7. (name=>"John Smith", number=>"714-123-5678").number="714-123-5678"

                  The dot fits with Ada, C, Pascal,... and mathematics since '.' projects a vector into a scalar. MATHS formalizes projections as functions mapping structured objects to their components. It uses both f(x) and x.f notations [Knuth 74]. So if x:X and X=${a::A,b::B,c::C,... } then a(x)=x.a= the a of x and a::X->A, b(x)=x.b= the b of x and b::X->B, c(x)=x.c= the c of x and c::X->C. Because projections are maps, they are relations and have a converse so that y./a= { x || x.a=y }. MATHS also defines projection operations by lists of element names:

                8. x.(a,b) = (x.a, x.b) in A><B,
                9. x.(a,b,c) = (x.a, x.b, x.c) in A><B><C.

                  One way to imagine a MATHS structure is to think of it as a record or object in a high level language or a labeled tupl in a RDBMS. The formal definition implies a collection of maps. Thus phones=${name,number::Strings} implies name in phones->String and number in phones->String. Formally, phones is the universal (free, initial, or universally repelling) object in the category of types with names and numbers. In other words: all types with two functions called name and number mapping objects into the Strings, is modeled by phones. We can map phones into any other type without loosing the information about names and numbers. Therefore phones encodes all sets of objects with names and numbers(details in Chapter 5).

                  Given the idea of a tupl and a structured type it is easy to allow a notation for collections of tupls that satisfy a given constraint - for example the set of phones in area code "714" might be written as ${name,number::String, number in "714" String. } In MATHS Net{a::A, b::B, ... z::Z, W(a,b,...z),...}= {(a=>u,b=>v,...):${a::A, b::B, ... z::Z) || for some u:A, v:B,... (W(u, v,...) ) }

                10. The following are not the same:
                11. {x:A || W(x) } Set of objects in A satisfying W [Chapter 4].
                12. {x in A, W(x) } Set with Boolean elements, either {true},{false}, or {true,false}. Net{x::A, W(x) } Documentation with variable x in set A and assertion W(x)[Chapter 5]. ${x::A, W(x) } Set of objects (x=>a) where a is in A and W(a) is true.

                  Given a set or class X:@T then we can talk about its subsets (@X). Finite sets of similarly structured elements are familiar to computer people - files, tables, and lists. For example: Suppose phone_data::=@PHONE where PHONE::=Net{name::Names, number::Numbers, place::{home,work} },

                13. Names::@Strings,
                14. Numbers::@Strings. So if P={ (name=>"John Doe", number=>"714-123-5678", place=>home),
                15. (name=>"John Doe", number=>"3678", place=>work),
                16. (name=>"Joan Roe", number=>"714-123-5878", place=>home),
                17. (name=>"Joan Roe", number=>"3878", place=>work)
                then P in phone_data. Further name(P) = = the names in P ={"John Doe","Joan Roe"}, and ("John Doe",work)./(name,work).P.number="3678".

                Codd has shown that any set of data can be expressed as simple structured tables like the above. Thus we can model all data as tuples. A collection of sets of tuples is a relational data base and forms a category. The objects in an object oriented design also forms a similar category [Freedman 83]. Structures form a conceptual model of the area being studied that is readily implemented using modern techniques such as data bases, logic programming, and object oriented programming[Chapter 5].

                Notice that P above also satisfies the pragmatic definition of a telephone book - we can look up numbers:- Given the items (name,place) they uniquely determine the remaining items (number):

              38. P in @PHONE (name,place)->(number).
              39. the P(name="John Doe", place=work)=(name=>"John Doe", number=>"3678", place=>work) We can define the phone books as just those sets in @PHONE that relate a name and place to an unique number:
              40. Phone_books::=@PHONE(name,place)->(number). Identifiers or keys like (name, place) appear in all computer applications.

                A special kind of relationship is one with two components - equivalent to a set of 2-tuples. They are turn up in all formal and natural settings:

              41. x above y
              42. x > y
              43. x = y
              44. x owns y
              45. x is_the_mother_of y The set of all relations between sets A and B is written @(A,B). A relation is typically defined by a formula like the following:
              46. For x:A, y:B, x R y::= W(x,y), where W(x,y) is a well formed formula that returns {true, false}. There is also a formula for R it self:
              47. |-rel[x:A,y:B](x R y) = R. We can also treat relations as sets of pairs or as mappings in context:
              48. {(x, y):A><B || x R y } , map[x:A,y:B](x R y). Sections 6.3 and 6.4 below follow the thread of binary relations to derive a model of program specification and design of surprising power and generality.

                MATHS has ways of making relationships with more than two components readable. First, we can declare n-ary relationships to be a binary relations between structures.

              49. pays::@(person, ${it::money, to::person}).
              50. gives::@(person, ${it::object, to::person}).
              51. sells::@(person, ${it::object, to::person, at::money}).
              52. buys::@(person, ${it::object, from::person, at::money}).
              53. For a,b:person, p:object, m:money,
              54. a buys(p, from=>b, at=>m)::=b sells (p, to=>a, at=>m),
              55. b sells(p, to=>a, at=>m)::= a pays (m, to=>b) and b gives (p, to=>a). The expressions then simulate normal English: a buys (p, from=>b, at=>m). There is a mild abuse of notation in that the label 'it=>' is omitted. The notation is similar to Smalltalk key-word messages: a buys: p from:b at: c or Grady Booch's Ada style. Further natural predicates make complex logical functions readable. As an example:
              56. For a,b:person, m:money, p:object, (a buys p from b price m)::=a buys(it=>p, from=>b, at=>m).
              57. |-buys =(1st) buys (2nd;it) from (2nd;from) price (2nd;at). Further work on a "natural" or "distfix" syntax is speculative.

                Standard technology and techniques can be used to implement objects, classes, sets and mappings - using a DBMS[DATA] or data structures plus OOPs[DATA & OOP]. An alternative is to treat each entity and relationship as a functor or predicate in a logic programming system[Keller 87, PROLOG].

                A.6 Subclasses and Subtypes Commonly an entity type (= class = Ada type) has different forms(=subclasses=Ada Subtypes). For example there are several species of mammal: MAMMAL(cat), MAMMAL(dog), ... . We get this effect by defining

              58. MAMMAL::=Net{ ANIMATE, species::{feline, canine, equine, ...},
              59. cat::@=species=feline, dog::@=species=canine,
              60. horse::@=species=equine,...

            A more sophisticated syntax is :
          34. MAMMAL::=Net{ ANIMATE. Either cat, dog, horse,..., or ape. }.

            The syntax "Either...,..,..,or..." is short for a net of Boolean declarations and formulae. Sometimes there are normal and abnormal forms. In MATHS it is possible to develop two pieces of documentation for each case and keep what is common to both in the original [for example see the theory of ORDERS in Appendix 1]. Another technique is to declare a Boolean variable and use it as a condition. For example, normal dogs have four legs:

          35. MAMMAL::=Net{ ... legs::@LIMB, normal::@. if normal then |legs| =4. ...}.

            Serial Data

            Programs can access data from a source, one datum at a time. They need to put items into a stream of data one at a time. Commonly the same syntax also sends streams of messages to other programs and files. MATHS's string and list operators perfectly model reading and writing serial streams of data.

            If we are interested in strings of objects all in a set A. #A is the set of finite sequences of things in A. MATHS defines the following (See Chapter 4(section 8), chapter 5, and sample at the end of book)

          36. s ! t::=the concatenation of strings s and t.
          37. s:!x = ( s'=s !x ) -- put a copy of x 's value at the end of s,
          38. x!:s = (x ! s = s') -- put a copy of x 's value in front of s [compare Lam & Shankar 90]. The attach element to string operation is symbolized "?" and the following are defined:
          39. e?:s = ( e ? s' = s ) -- e was the first item of S and is removed.
          40. x'?:s = ( x' ? s' = s )-- the first item is popped off s into x
          41. s:?x' = ( s' ? x' = s ) -- get the last x from s There is a small technicality that arises{2} - there must be an explicit "end of sequence" signal after the end of the data. MATHS uses the word 'end' as a generic "There ain't gonna be no more" message. By convention 'end' is never used as data in a string.

            Here is an example Data Directed solution to the problem of replacing double asterisks by a caret symbol[Hoare 78].

          42. star::="*", caret::="^". The sets of possible inputs and outputs are I and O respectively:
          43. I::= #( star star | char) end,
          44. O::=#( caret | char) end. Where star star corresponds to caret and I to O:

            The variables i and 'o' are used for the standard input and output of a program so:

          45. program::=(
          46. while( i.1st<>end )(
          47. if "**"!:i then o:!"^"
          48. else x'?:i; o:!x
          49. fi
          50. );
          51. o:!end; end?:i;


            The above looks deceptively like pseudocode, but may demand unusual coding techniques.

          . . . . . . . . . ( end of section Appendices) <<Contents | End>>

          Footnote {2}

          When one process generates the input to another parallel process it is not clear whether an empty input is merely as yet ungenerated data, or if the generator has terminated.

        . . . . . . . . . ( end of section Paper) <<Contents | End>>

      . . . . . . . . . ( end of section How Far Can EBNF Stretch?) <<Contents | End>>