Disclaimer. CSUSB and the CS Dept have no responsibility for the content of this page.
Copyright. Richard J. Botting ( Wed Dec 20 09:09:28 PST 2006 ). Permission is granted to quote and use this document as long as the source is acknowledged.
Abstract
Extensions to Backus Naur Form have been in use for nearly 30 years. EBNF is
one of the few forms of documentation that is both formal and practical. It
can be extended much further and becomes a paradigm for the formal
documentation of requirements, designs, and specifications of all types of
software.
I define EBNF formally and then explore the consequences of making small changes to its syntax and semantics. This gives rise to a version called XBNF that has proved useful in teaching a programming languages course.
We can define terms of other types: sets, functions, relations, types, classes, ... Definitions become a way of naming. Documentation becomes a "first class object" for its users. The theory of the resulting system seems to need recently developed theories of types, semantics, etc. [Example: Leeuwen]
Relation to other work
This is part of an experiment on inexpensive formal documentation. The
working name for the language is MATHS [C Sci Seminar CSUSB 1992]. I have a
graphical version called TEMPO [Faculty Research Seminar CSUSB 1992].
. . . . . . . . . ( end of section Introduction)
Grammars & Dictionaries
Name                 Elements            Defined_terms              Some Sources
Syntax, Grammar      Tokens|Lexemes      Syntactic units            Naur et al 64, ..., Ada 93
Lexical Dictionary   Strings             Tokens|lexemes             Bell Labs 83 (lex)
Protocol             Messages            Communications protocols   Zave 85, Naik & Sarikaya 92
                     Messages, Methods   Objects                    Botting 86a
RFCs                 Lines               Email Messages             Internet RFCs 80..93
Data Dictionary      Data types          Data files                 De Marco 80, JSP
Data Structure       Data types          I/O Patterns               Warnier/Orr, JSP
Path expressions     Subprogram calls    Paths                      S/SL
                     Entry names         Ada task structure         Hemendinger 90
Dynamic Dictionary   Events              Significant patterns       Kleene 1940s, DAD
                     Commands            System Behaviors           Belli & Grosspietsch 91
ELH                  Events              Life Histories             Robinson 79, JSD, SSADM
Logic Grammar        Simple Terms        Predicates                 Abramson & Dahl 89
Program Structure    C Function Calls    C Function Definitions     Baecker & Marcus 90
Language Theory      a..z                A..Z                       Eilenberg 74, ..., THEORY
Theory (Minimal)
The theory of EBNF has been the same for 30 years and is based on defining
"string" and "sets of strings". Strings are built up from the empty string by
concatenating symbols. I will show concatenation of strings by an exclamation
mark (!). It is assumed to be an associative operation with a unit - the empty
string "". Each string, by definition, has a unique sequence of atomic
symbols which, when concatenated, generate the original string. Temporarily we
symbolize strings by lower case variables and sets of strings by upper case
variables. Example strings will be shown in the C notation (see later if
necessary). We next define the operations of union (|) and concatenation on
sets of strings by:
  A | B = { x || x in A or x in B },
  A B = { a!b || a in A and b in B }.
These give a paradigm for defining the meaning of any grammar. For instance, the grammar { L::= a L b | "". } has one definition (L::= a L b | ""), one defined_term "L", and two elements: "a" and "b". We define the meaning of "L" to be the smallest set of strings M such that the equation M = "a" M "b" | "" is true, so M = { "", "ab", "aabb", "aaabbb", ... }.
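The following Python sketch (mine, not part of the original notation) illustrates this least-fixed-point reading: it iterates the equation for L, collecting strings up to a length bound, until the set stops growing.

def concat(A, B, limit):
    # concatenation of two sets of strings, truncated by a length bound
    return {a + b for a in A for b in B if len(a + b) <= limit}

def L_up_to(limit=8):
    # iterate M = {"a"} M {"b"} | {""} until it stops growing
    M = set()
    while True:
        new = {""} | concat({"a"}, concat(M, {"b"}, limit), limit)
        if new == M:
            return M
        M = new

print(sorted(L_up_to(), key=len))   # ['', 'ab', 'aabb', 'aaabbb', 'aaaabbbb']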
XBNF
XBNF is EBNF with a notation that can be combined with other mathematical
concepts, and so has more power. XBNF uses white space, bars ("|"), and the symbol
XBNF in XBNF
This section is an XBNF dictionary describing XBNF:
In a lexical dictionary the following definition of element is used:
Another standard kind of element is the identifier. An identifier does not start or end with an underscore - it is made of letters, digits and break characters, and must always start and end with either a letter or digit:
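As an illustration only, here is one plausible reading of that rule as a Python regular expression (assuming underscore is the only break character; the XBNF definition above is authoritative):

import re

# One reading of the rule: start and end with a letter or digit,
# anything in between may also be an underscore.
identifier = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9_]*[A-Za-z0-9])?$")

for s in ["x", "max_value", "_hidden", "trailing_"]:
    print(s, bool(identifier.match(s)))   # True True False False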
Often, a language is defined by two dictionaries - a lexicon that gives names to strings of symbols, and a syntactic dictionary or grammar. The grammar describes the structure of the language in terms of the tokens identified in the lexicon. Here I have combined both lexical and syntactic "shells" into a single presentation. Normally they would be split up into separate pieces of documentation - to do this I have experimented with a formalism that treats a dictionary as a module. This is described later. Meanwhile we will assume that the terms defined in the standard lexicon for the ASCII code can be used in future examples: thus terms like comma, r_paren, etc. will be used in later syntax definitions. I would suggest that we extend EBNF to also be a glossary. A glossary definition is a definition in the form:
A similar three level approach is useful in data processing applications - the analyst preparing a data dictionary that is equivalent to a lexical description of the data, the program designer working out the structures caused by logical and physical constraints, and a team thrashing out the glossary definitions. It is possible and useful, however, to keep all definitions of a single term together rather than in separate files or sections of a document.
Such a three way approach has been tried out in class. Any user with ~dick/bin in their PATH can query a data base of half-a-dozen language specifications written in XBNF:
Note on Precedence (Optional)
Because a union is defined as a set of alternatives which, in turn, contain
sequences, we have:
Effect of Intersections
The theory of defining a language as the intersection of a finite number of
context free languages was developed by Liu and Weiner in 1973. For practical
examples see the Turing project [Holt & Cordy 88], S/SL, Jackson's work,
etc. [Hehner and Silverberg 83]. Warnier's notation permits intersections, but
they are only used to derive conditions in programs. Intersections let us
define languages that are not context free:
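For illustration (the classic textbook example, not necessarily the one intended here): a^n b^n c^n is not context free, yet it is the intersection of two context free languages, and a recognizer can simply run the two checks together. A Python sketch:

import re

def in_L1(s):            # a^i b^j c^j  -- context free
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(2)) == len(m.group(3))

def in_L2(s):            # a^i b^i c^j  -- context free
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def in_both(s):          # a^n b^n c^n  -- not context free
    return in_L1(s) and in_L2(s)

print(in_both("aabbcc"), in_both("aabbc"))   # True False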
Defining a language as the intersection of several grammatical constraints is neither new nor inefficient - Constrained Expressions are expressive and based on simple regular expressions [Dillon et al 86]. Hehner and Silverberg [83] use the term Communicating Grammars for a set of grammars with some terminals and non-terminals occurring in both. The model here is simpler - I propose the terms concurrent context free grammars (CCFG) for the dictionary model proposed here, and concurrent context free languages (CCFLs) for the family of languages defined by CCFGs. Hemendinger notes that Ada tasks have communication patterns that are CCFLs [Hemendinger 90]. Dictionaries with intersections define languages that are recognized by networks of parallel CFG acceptors. Psychologists use a network of "daemons" for modelling perceptual processes [Human Information Processing, Lindsay & Norman, Pub Academic Press, 1970]. Finally, the hardest part of Jackson Structured Programming [JSP] is the use of semi-co-routines (in COBOL!) to handle non-context-free problems by quasi-concurrency [Jackson 1975...].
Effect of Functions
Using functions in a grammar has the practical effect of letting us name and
use shorthand formulae for common constructions such as lists. They also
permit the definition of context dependencies in a natural way. First we
need a way to define a function and a way to express its image. The natural
way to define simple functions is to allow the use of a symbol like (_) as the
place where the argument of the function must be written:
Now, this is a context dependent set of strings, and yet the definition shows us a simple and efficient way to parse (and check the syntax of) loops: first, use a CFG parser for
Note
In XBNF '@' means 'subsets of', so we can define a complicated feature of a
language in stages: first declare that it is a subset of a context free
language and later provide the specific rules.
Combining intersection with functions is interesting. Suppose that we have already defined declaration(V), usage(V), and no_ref(V) as the strings in a language which (1) declare or introduce V, (2) make use of V, and (3) don't refer to V, respectively; then:
Definitions that include parameters are useful for defining and specifying maps and relations. Sets of definitions that include parameters are called Logical Grammars [Abramson & Dahl 89]. A more general form of dictionary is the W-Grammar, Two-level Grammar, or van Wijngaarden Grammar, which contributed to the demise of Algol 68 yet is alive as a tool for documenting data structures [Gonnet & Tompa, Gonnet & Baeza-Yates 91]. The effect is that of a logic programming language [Ballance et al 90].
. . . . . . . . . ( end of section Grammars and Dictionaries)
Applications
Files
Abstract models are easier to re-use than concrete ones. Any system that fits
the theory of strings (a monoid) plus regular sets (a Boolean algebra) with
Kleene closure can use dictionaries. The grammar can be put to work to
describe files in terms of data instead of defining languages in terms of
characters. The equivalent of a lexical dictionary is the analyst's "data
dictionary". The counterpart of the syntactic grammar is a Jackson or Warnier
"data structure". All we need is an empty object and a concatenation
operator. For example, if T stood for a set of records and #T for files,
then the empty string models the empty file (UNIX /dev/null) and
concatenation models writing one file after another (UNIX: cat a b >c).
Consider
These define the magnetic tape's physical structure. If we were designing a program to process a file of student records we might describe a file sorted by sections as follows:
Strictly speaking a definition of form A::= B | A~B is a theorem `|- A = B | A~B`.
If the domain has the words "file", "section" and "student" we use a similar structure, even if the class header is absent and a student record identifies the student's class instead. The result is called a logical structure.
If the problem is to summarize the data on each section then we might show the output as:
A data dictionary is free of semantic and pragmatic structures because a file can be used for many purposes and hence is parsed into different structures. Typically:
Now if a,b,c,... are elements then
is the free semigroup, or unstructured structure. In categorical terminology this is a free, initial, or universally repelling object in the category of semigroups with units. Free dictionaries are the weakest useful dictionaries. In consequence, a design that has two or more processes that communicate by consuming and producing strings (files) lets each process interpret or structure the files as it wishes. This is the powerful form of information hiding that is implicit in Jackson's work.
Data dictionaries can show a non-programmer what a program is expected to do in their own terminology, not computer jargon. The canonical grammar/dictionary/glossary shows the user or client that we understand all their terminology. Canonical diagrams and dictionaries show how the user's data is handled in terms that the users and clients understand. The test of a canonical dictionary is that clients and users certify it as complete and accurate. Canonical dictionaries are the strongest useful dictionaries.
So if we wrote
The result is a notation like the command line arguments in a C program, a UNIX shell script, ICL George 3 Macros, ... The same simple approach, starting with simple operations like successor (succ), predecessor (pred), and a constant zero (0), lets us develop Peano arithmetic in EBNF-like definitions:
Here the semicolon acts like "if_then_": (;)=if (1st) then (2nd) but is actually the relational product described next.
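The definitions themselves are not reproduced above, but the flavour can be sketched in Python (an illustration of mine, with tally strings standing in for numerals):

zero = ""                        # the empty string models 0
def succ(x): return x + "I"      # add one tally mark
def pred(x): return x[:-1]       # partial: does nothing useful on zero

def add(x, y):                   # x + 0 = x;  x + succ(y) = succ(x + y)
    return x if y == zero else succ(add(x, pred(y)))

three = succ(succ(succ(zero)))
two = succ(succ(zero))
print(len(add(three, two)))      # 5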
I have shown that BNF definitions can define functions in a natural way.
Further if 1st and rest map a string into two parts so that x=1st(x)!rest(x), then
Binary Relations
A relation R in @(T1,T2) can be defined in terms of a well formed formula
W(x,y) by
/R is the converse relation to R. For example, (_^2) is the 'square' function and /(_^2) is the square root relation. Similarly, sin is the sine function and /sin includes the arcsin function. Further, we want /parent_of = child_of. Therefore define:
In context, a set can also be used as a relation:
Complex situations can be defined quickly using these notations in an EBNF-like way: suppose that we have sets male and female, and a relation parent:
The last four definitions above are like productions in context free grammar. The definitions of ancestor and lineal_descendant illustrate the transitive closure or ancestral relation. It is so common that a special notation is defined:
Example: |- ancestor = parent; do(parent) and
The defining property of do is that if something is invariant (or fixed, or closed) under R then it will not be changed by any number of Rs. This appeared originally in Whitehead and Russell's Principia; the same form is in Floyd's loop invariant technique and is reminiscent of Kleene closure in grammar theory. This definition relies on using higher order quantifiers [cf Gurevich 87].
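A small Python sketch (mine, not MATHS itself) may help fix the ideas: relations as sets of pairs, with composition (;), converse (/), and do(R) as the reflexive transitive closure over a finite universe.

def compose(R, S):                       # the relational product (;)
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def converse(R):                         # /R
    return {(b, a) for (a, b) in R}

def do(R, universe):                     # zero or more steps of R
    closure = {(x, x) for x in universe}
    while True:
        bigger = closure | compose(closure, R)
        if bigger == closure:
            return closure
        closure = bigger

parent = {("Tom", "Ann"), ("Ann", "Joe")}
people = {p for pair in parent for p in pair}
ancestor = compose(parent, do(parent, people))          # parent; do(parent)
print(("Tom", "Joe") in ancestor, ("Joe", "Tom") in ancestor)   # True False
print(converse(parent))                                 # child_of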
Note: "is a subset of"
Normally X in @Y is written X==>Y and means that X is a subset of Y.
EBNF is a usable context free grammar notation that uses operators equivalent to Kleene's Regular Expressions. Kleene developed regular expressions as a way of documenting events in neural networks. They were put to good use in both automata theory and grammar theory in the 1950s and 1960s. What is not well known is that they have also been in use (from about 1975 onward) to document the possible patterns of events that affect an entity in a system [Kleene, DAD, Robinson 79, JSD, SSADM]. Since 1980 the behavior of communicating sequential processes has been a hot topic for research [Hoare, Milner, ...]. The formulations always seem to involve an XBNF-like component to describe the behavior of objects and entities.
Here is a simple example: A button for an elevator can be pushed many times before an elevator finally services it:
Another example: a stack accepts items that are pushed onto it and requests for items to be popped off it. When it is not empty it makes sense to request a copy of the top item one or more times.
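For the button example, one plausible encoding (an illustration of mine, with 'p' standing for a push and 's' for a service) is an ordinary regular expression:

import re

button_life = re.compile(r"(p+s)*$")      # pushes ... then a service, repeated

print(bool(button_life.match("pppsps")))  # True:  pushed 3 times, serviced, pushed, serviced
print(bool(button_life.match("spp")))     # False: serviced before any push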
Jackson has shown how such formulae lead directly to the designs of software that behaves as specified [JSD, SSADM, ...].
Specifications
Relations behave like the operations and conditions in a programming language
because each operation in a program relates the current state to the set of
possible future states. If there is always a single next state then the
relation is deterministic, and by composing deterministic steps carefully we
get an algorithm. In turn, an algorithm can be coded for input to a suitable
translator, giving an executable program. Thus a relation describes a set of
acceptable behaviors and the implementor selects a particular algorithm
[compare with Kowalski, Hoare 87].
MATHS relations can be written in a form that makes them look like statements in a programming language. Some syntactic sugar is defined for those who want to write while/for/loops/... Here is an introduction. Formal treatment comes later when we have the formulae to handle it [Chapter 5].
The key definition is that of a dynamic predicate. A predicate is a proposition that has free variables. A dynamic predicate has free variables with a prime (') attached to them. Here is an example:
After the statement, the value of x is equal to the sum of the old values of x and y. y is unchanged. It could be coded as x:=x+y in Ada/Pascal/... The rule is that "a prime means the new and no prime means the old" [Ross Ashby 56, Part II of Lam & Shankar 90, compare Hehner 84a & 84b, Guttag & Horning 91]. Formally x'=x+y defines a relation on a structured set with variables x and y.
x'=1 guarantees that x is 1 after the operation, but no other variables can change during the operation.
x'=1-x means that the value of x will be complemented - its new value will be the result of subtracting its old value from one, while all other variables stay the same.
x=1 has no primed variable. It means that x has value one, and does not change - all variables are left untouched. It is a relation that holds between identical states when and only when x has value 1. It filters out all states where x is not 1. It allows computation under a condition. We therefore call it a "Condition". It can be shown that it is a condition in the sense of Arbib and Manes [87].
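A hedged Python sketch of this reading of dynamic predicates - states as dictionaries of variable values, a predicate as a test on the (old, new) pair:

def x_gets_x_plus_y(old, new):     # the dynamic predicate  x' = x + y
    return new["x"] == old["x"] + old["y"] and new["y"] == old["y"]

def x_is_one(old, new):            # the Condition  x = 1  (no primes)
    return old == new and old["x"] == 1

before = {"x": 2, "y": 3}
after = {"x": 5, "y": 3}
print(x_gets_x_plus_y(before, after))   # True
print(x_is_one(before, before))         # False: x is 2, not 1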
Here is a more general statement of the MATHS interpretation of a predicate that contains variables with primes. Consider a simple state space S which declares m variables x1, x2, ...x[m] of types X1, X2, ...X[m]:
Let P(y1', y2', y3', ...y[n]', x1, x2, x3, ...x[m]) be a predicate with n distinct primed variables (y1, y2, ...y[n]) which may also appear without primes among the xs. For example, in x'=x+y, x1=x, x2=y, and y1=x.
The relation defined by P(y1', y2', y3', ...y[n]', x1, x2, x3, ...x[m]) is found by:
A more rigorous definition will be worked out in my monograph; meanwhile here are some examples. To specify a process that solves two simultaneous linear equations we write
To specify a process that solves a quadratic equation we write
On the other hand
specifies that x is left alone and a, b, and c are adjusted to fit the equation.
Program Structures
Consider an expression that uses (do), (|), and (;) from the calculus of
relations to structure dynamic predicates: it specifies, at a high level, a
program. For example:
The definition of SUM looks like a program but is an expression in the calculus of relations and so has provable properties. First, since (+) is defined to be an operator on numbers, the state space is ${n, s::Numbers}. Next define names for the parts of SUM:
Now after E we know that n=101 (and s has its previous value), since
Similarly, after S we know that n=1 and s=0. Further, not E by definition
will not change the values of s and n. Next we look for something that
does not change in L. Both s and n change - in parallel:
Similarly |- F in inv(W) and inv(E),
so |- F in inv(W;L).
Thus |- (F; do(W; L)) ==> F.
Now |- S ==> F and F;E ==> F.
Hence, after SUM, s = 1+2+3+4+...+(n-1) where n=101, so s = 5050.
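Since the full text of SUM is not reproduced above, here is one plausible reconstruction in Python (an illustration only); running it confirms the reasoning (s = 5050 when n reaches 101):

def SUM():
    n, s = 1, 0                 # S:  n' = 1 and s' = 0
    while n != 101:             # do(W; L) with W the condition n <> 101
        s, n = s + n, n + 1     # L:  s' = s + n and n' = n + 1, in parallel
    return n, s                 # E:  the condition n = 101

print(SUM())                    # (101, 5050), i.e. s = 1 + 2 + ... + 100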
Notice that MATHS is not a programming language. So
permits y.i=2 even if i<>1, but if x.i=1 then y.i MUST be 2. Whatever happens, y.i is either 2 or 3. Such non-determinism seems to be essential for a simple model of processes [Dijkstra 76, Hoare 78, Hoare 85, Gabrielian & Franklin 91].
MATHS includes structured programs because, for R:@(S,S),
We can, for example, when n<=m define:
Using Chapter 4, Section 8,
MATHS even has an extension of Dijkstra's Guarded Command. Any group of MATHS users can document and then use its own set of abbreviations. For example:
MATHS can express mixtures of conditions and actions in one relation. A well known enigmatic program is often written:
In MATHS it can be transformed to read:
MATHS makes the derivation of programs more logical: (1) state the properties of the changed variables as a set of equations and then (2) solve for the primed variables. Similarly, one can give non-deterministic relationships and then reduce them to a form that is compatible with a declarative programming system such as Prolog [compare Lamport 87].
Finally, MATHS relational expressions follow the calculus of relations [Whitehead & Russell], which can be used to simplify sequences into disjunctive normal form, like the following:
Documentation.
There is no advantage in using mathematics only as a program design language.
The big pay-offs will occur by introducing formal methods where the errors are
made - early in the process. Most errors are made in the analysis of
requirements - misunderstanding, ignorance, ambiguity, and jumping to
conclusions. Mathematics is best used for recording ideas and defining terms
throughout a project. For example, it is not useful to a client to just quote
a specification and algorithm:
Shorter forms are worse [Knuth et al 89]:
GCD can then be referred to by "Uses GCD" in other pieces of documentation.
An abstract specification permits innovative but correct implementations to be invented. For example, if one has to evaluate many GCDs then Euclid's algorithm is clearly inefficient, and remains so when coded iteratively. When gathering statistics on the GCDs of all pairs of numbers in the range 1..2^p as p -> oo, I found that the iterative version of Euclid's algorithm recalculates the same values many times. Backtracking to the definition above (GCD, def gcd) we have: (GCD, def gcd) |- For x,y, gcd(x,y) ::= greatest(factors(x) & factors(y)).
Therefore: gcd in (Nat >< Nat)->Nat. Now ((Nat >< Nat)->Nat) is monomorphic to @(Nat, Nat, Nat). So gcd can be a table, a data base, a data structure, or even a hybrid of data and algorithm:
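For instance (an illustrative sketch of mine, not the author's implementation), Euclid's algorithm wrapped in a cache gives just such a hybrid of data and algorithm:

from functools import lru_cache

@lru_cache(maxsize=None)        # the cache is the "table" part of the hybrid
def gcd(x, y):                  # Euclid's algorithm is the "algorithm" part
    return x if y == 0 else gcd(y, x % y)

print(gcd(12, 18), gcd(1071, 462))   # 6 21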
Requirements and Domain Analysis
XBNF can be extended or embedded into an ASCII based language that I have called MATHS. MATHS lets terms of any type be defined - bits, numbers, functions, specifications of programs, ... To make this work we need an extended theory of the meaning of a set of definitions. MATHS assumes that each definition adds a variable and also adds an axiom:
In general I have been studying packages of declarations and well_formed_formula. Syntactically:
Nets are collections of assumptions, not programs. Associated with a set of declarations is an abstract space. A declaration adds a dimension to the space and an equality removes a dimension. In consequence, a definition that does not involve recursion leaves the system unchanged. Some recursive forms invalidate their universe, but others (for example LISP) have a normal form which is not recursive.
MATHS borrows an idea from the Ada package and the Z schema: networks can be named. The original BNF notation is appropriate:
Nets are not interpreted as logic programs. In many ways they can be thought of as defining algebraic systems rather than programs; indeed, in most of these systems the Unification Algorithm is not a valid deduction. MATHS provides the ability to show that uncertain statements must follow from less doubtful ones. A special sign (|-) shows that a formula can be derived from previous formulae. A complete inference engine or automatic theorem prover would be an uneconomic investment, since the engineer can work interactively with a simpler proof checker. A proof typically has a Natural Deduction form like this:
Thus EBNF has become a way to document mathematical and logical systems:
The effect is to formalize what mathematicians have always done and simultaneously introduce inheritance into computer documentation. I have converted most of my notes on algebra and topology into this form on a 7K floppy disk.
Here is an example of a practical application based on a recent publication: [Snyder 93, Alan Snyder<alan.snyder@eng.sun.com>, The Essence of Objects: Concepts and Terms, IEEE Software V10n1(Jan 93)pp31-42]
We proceed by abstracting noun phrases from these sentences and replacing them by sets and relations:
. . . . . . . . . ( end of section Applications)
By extending EBNF to XBNF and then to MATHS we get a powerful way to document the domain in which software is or will be operating. "An application domain system model of the equipment and its intended environment of use is defined. This is a dynamic system model, and defines the overall states of the system as functions of time. Requirements are constraints on this model" [Hansen Ravn & Rischel 91 ]. Simple mathematics - sets, mappings, relations, definitions, assertions, etc.-
MATHS is more abstract than any other computer language. It is designed to be too powerful for any conceivable computer to use as a programming language because the language used at the sharp end of a project must be able to define problems well enough so that we can recognize that they are (1) hard, or (2) not computable. Problems of efficiency, computability, data vs algorithms, etc. are handled as part of the implementation. The MATHS description of the problem and the specification of a requirement should be independent of all implementations and technology. Ignoring efficiency etc. is difficult at first. It is essential to describe the purpose and qualities of something before choosing or testing techniques. Hence:
MATHS & XBNF help solve several outstanding problems in the software process. First, we can show how the user's data is handled in terms that they understand. We can guarantee that this understanding is accurately modelled in the software by using rigorous design techniques. Second, Strachey pointed out the need to discover data structures before attempting functional design [Strachey 66]. The sets and maps found by formal analysis connect facts to conceptual data structures. Standard data base and data structures techniques can then implement the conceptual solution to the problem.
Projects that use MATHS can "zero-in" on the best software. First: Define things that appear in the description of the problem and (1) are outside the software, (2) interact with the software, and (3) are individually identifiable by the software. Second: Define events, entities, identifiers, relationships, attributes, facts, and patterns. These steps are an exercise in stating the obvious. Facts are recorded and verified by experts and by observation. Simple facts lead to validation of algorithms, data structures, and data bases.
If a system interacts with another system then it will have an internal model of the external system. An abstract model of the external system is a model for the internal system and so a structure for a design. Computer Science already teaches the techniques to implement the designs plus the calculus of sets, maps, and relations. So software design can be based on science. There is also a natural graphic representation (TEMPO). This is a basis for Software Engineering.
Formal Analysis proceeds by stating the obvious and so certain truths. Simple facts often delimit the set of logically correct structures. Grammars and dictionaries are used to document the data and dynamics of a domain. Relations can specify processes in a structured dictionary. Sets and maps define entity types, abstract data types, and classes of objects of the problem domain and are used to document requirements and specifications. When implemented, maps become paths through the data structures and data bases. So they define the structure shared by solution and problem.
Project documentation also describes the physical constraints caused by the technology used. Again there are two extremes: free and inconsistent. Somewhere between them is the best level. This records every assumption that can be made about the implementation technology. It is the canonical documentation of the technology being used, when the software engineers are its clients and users.
Levels of Documentation
Some documentation says too little, and some says too much. Intuitively we
have a category of logical systems. In this category of documentation we can
distinguish two extremes. There is a free, initial, or universal repelling
object in the category of documented objects. Free or initial documentation
is the weakest useful documentation. At the other extreme there is final,
inconsistent, or universally attractive documentation - this overdefines the
situation to the point where it cannot exist. In between there is the level
of documentation that reflects exactly what the client understands the
situation to be. I call this the canonical documentation. It forms the
canon or Bible for the project. It shows the user or client that we
understand all their terminology. Canonical documentation shows the user's
world in terms that the users and clients understand. The test of canonical
documentation is that clients and users certify it as complete and accurate.
Canonical documents are the strongest useful conceptual documentation.
This annex summarizes the notation I have developed to express discrete mathematics in ASCII.
. . . . . . . . . ( end of section Introduction)
Propositions and Predicates
A proposition can have two possible meanings - true and false. In MATHS the
symbol @ represents the Boolean type with two standard values {true, false}.
The following table defines the operators.
P            true   false                 Syntax           Type
not P        false  true                  prefix           @->@

P            true   true   false  false   Syntax           Type
Q            true   false  true   false
P and Q      true   false  false  false   infix(serial)    @><@->@
P or Q       true   true   true   false   infix(serial)    @><@->@
P iff Q      true   false  false  true    infix(parallel)  @><@->@
if P then Q  true   false  true   true    natural          @><@->@

Unlike most programming languages, mathematics has two kinds of infix operator:
A predicate is a proposition that can have variables and quantifiers. MATHS spells quantifiers out and includes numerical quantifiers like "one", "no", "0..1" as well as "all" and "some". The syntax attempts to make the writing of formal statements as natural as possible. In the following X is a set or a type, x is a variable, and W(x) a proposition containing x in a number of places.
  for all x:X(W(x)) ::= For all x of type X, W(x) is true,
  for x:X(W(x)) ::= for all x:X(W(x)),
  for no x:X(W(x)) ::= for all x:X(not W(x)),
  for some x:X(W(x)) ::= not for no x:X(W(x)),
  for 0 x:X(W(x)) ::= for no x:X(W(x)),
  for 1 x:X(W(x)) ::= for some x:X( W(x) and for 0 y:X(x<>y and W(y)) ),
  for 2 x:X(W(x)) ::= for some x:X( W(x) and for 1 y:X(x<>y and W(y)) ),
  for 3 x:X(W(x)) ::= for some x:X( W(x) and for 2 y:X(x<>y and W(y)) ).
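Over a finite set the numerical quantifiers are just counts; a small Python sketch (illustrative only, not MATHS):

def count(X, W):
    return sum(1 for x in X if W(x))

def for_all(X, W):   return count(X, W) == len(list(X))
def for_no(X, W):    return count(X, W) == 0
def for_some(X, W):  return count(X, W) > 0
def for_n(n, X, W):  return count(X, W) == n      # "for 1", "for 2", ...

even = lambda x: x % 2 == 0
print(for_some(range(5), even), for_n(3, range(5), even))   # True True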
Sets
For any type T there is another type, symbolized by @T, whose objects are
sets of objects of type T. In MATHS [unlike Pascal but as in Ada] a
subset of a type is not a type, but a set. Each set is made of elements of
one type, and their type is determined by the type to which the set belongs.
If a set A is of type @T then A's elements must be of type T. It
is therefore unambiguous to use sets to declare the type of a new variable [
see Brinch Hansen's critique of Pascal in the Edison Papers]. In MATHS the
symbol for the type is also a symbol for the universal set of that type. It
is also used in expressions to indicate the type of a value - if e can have
values of different types then T(e)=(e).T is the value in T.
Predicates: e1 in S1 (membership), of type T><@T -> @; for example, 2 in 1..
Shorthand: S1 are S2 ::= S1==>S2, x is in S2 ::= x in S2.
We can use a set anywhere a type could be used, for example (when A:@T): for all x:A(W) ::= for all x:T (if x in A then (W)). In English and mathematics the common idiom puts a type name before the variable - for example:
Another useful form is like "There are two elevators and five floors." The MATHS form is "2 Elevators and 5 Floors". The general form, when Q is any quantifier (all, some, none, 1, 2, ...) or number and A:@T, is:
Lists, strings and n-tupls
MATHS has n-tupls and lists. They are modeled as maps from subsets of the
integers.
Classes & Entities
All modern formal notations from Ada to Z provide means for creating a new
type of object by (1) listing components and (2) specifying constraints and
operations on these components. In MATHS a collection of declarations defines
a set of labeled tupls. If D is a collection of declarations, definitions,
axioms, theorems and comments then Net{D} is called a set of documentation
and Net{D}=${D} is the set or class of objects that satisfy D.
The complete definition of MATHS structures will be delayed until Chapter 5. Here are some examples of tupls, sets of tupls, and sets of sets of tupls. We can think of these as records, files, and data bases respectively. Or we can think of them as objects, classes, and architectures respectively. In C we might implement these as structs, arrays, and files of code.
Tupls look like this: (a=>xa, b=>xb, ...). Such a tupl has a type defined by one of the following forms:
Example
The dot fits with Ada, C, Pascal, ... and mathematics, since '.' projects a vector into a scalar. MATHS formalizes projections as functions mapping structured objects to their components. It uses both f(x) and x.f notations [Knuth 74]. So if x:X and X=${a::A, b::B, c::C, ...} then
  a(x) = x.a = the a of x, and a::X->A,
  b(x) = x.b = the b of x, and b::X->B,
  c(x) = x.c = the c of x, and c::X->C.
Because projections are maps, they are relations and have a converse, so that y./a = { x || x.a=y }. MATHS also defines projection operations by lists of element names:
One way to imagine a MATHS structure is to think of it as a record or object in a high level language or a labeled tupl in a RDBMS. The formal definition implies a collection of maps. Thus phones=${name,number::Strings} implies name in phones->String and number in phones->String. Formally, phones is the universal (free, initial, or universally repelling) object in the category of types with names and numbers. In other words: all types with two functions called name and number mapping objects into the Strings are modeled by phones. We can map phones into any other type without losing the information about names and numbers. Therefore phones encodes all sets of objects with names and numbers (details in Chapter 5).
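As an illustration (mine, with hypothetical names), the structure ${name, number::Strings} corresponds to a record type with two projection functions:

from typing import NamedTuple

class Phone(NamedTuple):        # hypothetical record for ${name, number::Strings}
    name: str
    number: str

def name(x: Phone) -> str:   return x.name      # name(x) = x.name
def number(x: Phone) -> str: return x.number    # number(x) = x.number

p = Phone(name="Ann", number="555-0192")
print(name(p), p.number)        # both notations, f(x) and x.f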
Given the idea of a tupl and a structured type it is easy to allow a notation for collections of tupls that satisfy a given constraint - for example, the set of phones in area code "714" might be written as ${name, number::String, number in "714" String}. In MATHS Net{a::A, b::B, ... z::Z, W(a,b,...z), ...} = {(a=>u, b=>v, ...):${a::A, b::B, ... z::Z} || for some u:A, v:B, ... (W(u, v, ...)) }
Given a set or class X:@T we can talk about its subsets (@X). Finite sets of similarly structured elements are familiar to computer people - files, tables, and lists. For example, suppose phone_data::=@PHONE where PHONE::=Net{name::Names, number::Numbers, place::{home,work} },
Codd has shown that any set of data can be expressed as simple structured tables like the above. Thus we can model all data as tuples. A collection of sets of tuples is a relational data base and forms a category. The objects in an object oriented design also form a similar category [Freedman 83]. Structures form a conceptual model of the area being studied that is readily implemented using modern techniques such as data bases, logic programming, and object oriented programming [Chapter 5].
Notice that P above also satisfies the pragmatic definition of a telephone book - we can look up numbers: given the items (name, place), they uniquely determine the remaining items (number):
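One direct (and purely illustrative) implementation of that functional dependency is a map keyed on the determining items:

phone_book = {                  # hypothetical data
    ("Ann", "home"): "555-0192",
    ("Ann", "work"): "555-0100",
    ("Joe", "home"): "555-0145",
}

def look_up(name, place):       # (name, place) -> number
    return phone_book[(name, place)]

print(look_up("Ann", "work"))   # 555-0100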
A special kind of relationship is one with two components - equivalent to a set of 2-tuples. They turn up in all formal and natural settings:
MATHS has ways of making relationships with more than two components readable. First, we can declare n-ary relationships to be binary relations between structures.
Standard technology and techniques can be used to implement objects, classes, sets and mappings - using a DBMS[DATA] or data structures plus OOPs[DATA & OOP]. An alternative is to treat each entity and relationship as a functor or predicate in a logic programming system[Keller 87, PROLOG].
A.6 Subclasses and Subtypes
Commonly an entity type (= class = Ada type) has different forms (= subclasses = Ada subtypes). For example there are several species of mammal: MAMMAL(cat), MAMMAL(dog), ... . We get this effect by defining
The syntax "Either...,..,..,or..." is short for a net of Boolean declarations and formulae. Sometimes there are normal and abnormal forms. In MATHS it is possible to develop two pieces of documentation for each case and keep what is common to both in the original [for example see the theory of ORDERS in Appendix 1]. Another technique is to declare a Boolean variable and use it as a condition. For example, normal dogs have four legs:
If we are interested in strings of objects all in a set A, then #A is the set of finite sequences of things in A. MATHS defines the following (see Chapter 4, Section 8, Chapter 5, and the sample at the end of the book):
Here is an example of a Data Directed solution to the problem of replacing double asterisks by a caret symbol [Hoare 78].
The variables 'i' and 'o' are used for the standard input and output of a program, so:
The above looks deceptively like pseudocode, but may demand unusual coding techniques.
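For comparison, here is one direct Python coding of the same filter (an illustration of mine; the data directed design above is the point, not this code):

import sys

def squash(text):
    # copy the text, replacing each pair of asterisks "**" by a caret "^"
    out, i = [], 0
    while i < len(text):
        if text[i] == "*" and i + 1 < len(text) and text[i + 1] == "*":
            out.append("^")
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(squash("a**b***c"))            # a^b^*c
if not sys.stdin.isatty():           # behave as a filter from i to o
    sys.stdout.write(squash(sys.stdin.read()))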
. . . . . . . . . ( end of section Appendices)
Footnote {2}
When one process generates the input to another parallel process it is
not clear whether an empty input is merely as yet ungenerated data, or if the
generator has terminated.
. . . . . . . . . ( end of section Paper)
. . . . . . . . . ( end of section How Far Can EBNF Stretch?)
End