[Skip Navigation] [CSUSB] / [CNS] / [CSE] / [R J Botting] / [Monograph] / 02
[Index] || [Contents] || [Source] || [ Search monograph] || [Notation] || [Copyright] Mon Aug 2 16:06:14 PDT 2010

Contents


    Chapter 2

      This is a snapshot of work done in the 1990's. Much of this material has been extracted, revised, and placed in [ ../maths/ ] and [ ../papers/ ] on this web site.

      2.1 Introduction

        2.1.1 Summary

        MATHS+TEMPO are experimental notations designed to see if there is an inexpensive way for computers to represent formal documentation in the software process.

        (1) MATHS is a Multi-purpose Abstract Terminology, Heuristic, and Symbology. (2) MATHS makes formal methods easier. (3) MATHS fits many methods. (4) MATHS is modular and reusable. (5) MATHS is a cheap way to write mathematics.

        This chapter introduces a graphic representation of MATHS that is called TEMPO. This name is short for "Temporal Map".

        2.1.2 Relation to other chapters

        Chapters 0 and 1 showed the need for formal and graphic notations throughout the software process. Chapter 1 ended by stating informal requirements for such notations. This chapter introduces MATHS and TEMPO by using examples. Reference sheets to standard MATHS notations are at the end of the book. Appendix 1 gives a sample of "raw" MATHS documentation. Chapters 3,4, 5 define MATHS using MATHS. Chapter 6 relates it to existing methodologies. Chapter 7 formally defines the relationships between formulae and graphics. Chapter 8 gives implementation plans. Appendix 0 defines the requirements, Appendix 1 is a sample, and Appendix 4 is a set of reference sheets.

      . . . . . . . . . ( end of section 2.1 Introduction) <<Contents | End>>

      2.2 EBNF & Dictionaries

        2.2.1 Introduction

        Bugs often occur because ideas are not clearly defined. So projects need to document definitions. Call a set of definitions a dictionary . Common forms are the EBNF grammar and the Data Dictionary. They are already in use through out software processes:


        Table
        Name|PurposeElementsDefined_termsSources
        Syntax, GrammarTokens|LexemesSyntactic unitsNaur et al 64,...
        Tagged CFGLexemesTagged sets of stringsMark & Cochrane 92
        Lexical DictionaryStrings,character namesTokens|lexemesBell Labs 83(lex )
        ParserLiterals & CharactersText form of ObjectCELLAR[pp21-22 of Rettig 93]
        Data DictionaryData types and valuesData filesDe Marco 80,...
        ?CommandsSystem BehaviorsBelli & Grosspietsch 91
        Data StructureElementary DataFiles, Streams...JSP
        Dynamic DictionaryEventsSignificant patternsKleene,DAD
        "EventsEntity Life HistoriesRobinson 79, SSADM
        "MessagesCommunication protocolsZave 85, Naik & Sarikaya 92
        "Messages|MethodsObjectsBotting 86a, Dedene & Snoeck 94
        Scenario GrammarEventsStates in ScenarioHsia et al 94
        Dialogue GrammarsStimulus/ResponseDialoguesBoute 81
        Task DocumentationManual/automatic tasksUser needsLustman 94,CELLAR[p22 Rettig 93]
        Logic GrammarSimple TermsPredicatesAbramson & Dahl 89
        Interpreter PatternTerminalExpressionNonterminalExpressionGamma et al 94(pp243-255)
        Program StructureFunction CallsFunction DefinitionsBaecker & Marcus 90
        CorrespondenceDataProgramsJSP, Appendix 3
        OtherSubprogram callsPathsCampbell & Haberman 74, S/SL
        ?Entries namesAda task structureHemendinger 90
        TheoryTerminals:a..zNon_terminals:A..ZEilenberg 74,...THEORY

        (Close Table)
        BNF is the most successful formal way to writing definitions. So we start with a form of BNF. MATHS uses white space, bars ("|") and the symbol "::=" as in BNF. MATHS adds "#" to mean any number of, including zero. The "<" and ">" of the original BNF are used in MATHS as relations and as parts of arrows and crosses. Parentheses "()" are used as they are in algebra. Here is a comparison of BNF, Ada EBNF and MATHS:

            BNF            <number>::=<digit>|<number><digit> [Naur et al 64]
            EBNF      number::= digit {digit}       [ANSI 83 ADA, ANSI 95 Ada]
            MATHS          number::= digit #digit.
        The Ada EBNF symbols '{}[]' are used for sets and subscripts in MATHS. Ada EBNF { x } is #( x ) in MATHS. [x ] in Ada EBNF is O(x ) in MATHS.

        2.2.2 A Mapping into English

        • "::=" becomes "is"
        • "&" becomes "and also"
        • "~" becomes "but not"
        • "|" becomes "or"
        • "#" becomes "any number of " or "many"
        • "O" becomes "an optional"
        • "(" becomes "a sequence of"
        • ")" becomes ", end of sequence"
        • whitespace between parts of a sequence becomes "and then"
        • "_" becomes a space between words.
        • "..." indicates that the definition will be completed later in the documentation.

        2.2.3 Syntax of a MATHS dictionary

        This section is a dictionary defining a dictionary:
      1. dictionary::= #( definition | comment ). A dictionary has any number of definitions and comments. This sentence is a comment. The line above was a definition.
      2. definition::=simple_definition | ..., Later a more complex kind of definition is added to the simple definitions used in dictionaries and grammars:
      3. simple_definition::= defined_term "::=" expression.
      4. A simple definition has a "::=" between the term being defined and the expression defining it.
      5. expression::=set_expression | .... Initially we are interested in sets of sequences (other expressions are defined in sections 4, 5 and 6 below).
      6. set_expression::= item | union | intersection | sequence | phase. A set_expression is either an item, or a union, or, ...
      7. item::= element | defined_term | "(" set_expression ")" | function "("set_expression")" .
      8. function::= "O" | ... . An item is either an element (defined below), or a defined term, or any set expression in parentheses, or the result of applying a simple function like "O", meaning optional.
      9. union::= alternative #( "|" alternative ). A union has at least one alternative. Alternatives in a union are separated by vertical bars.
      10. alternative::= complementary_form | sequence. Each alternative is either a complementary_form or a sequence.
      11. sequence::= #phase. A sequence is a number of phases. The "number sign" (ASCII code hex(23), hash, sharp) shows the occurrence of any number of the next item - including none. A sequence can therefore be empty, one phase, or two phases, etc..
      12. phase::= ("#" | ) item. A phase is part of a sequence where an item occurs once, or where the same item can occur any number of times including no times at all.
      13. intersection::= item #("&" item ), The ampersand character("&") shows that all the descriptions (items) must hold at once.
      14. complementary_form::= item "~" item, A complementary_form - A~B say - describes the set of As that are not Bs. Here is an example:
      15. comment::= (#character )~definition. Any number characters (in a dictionary ) that are not a definition form a comment. This paragraph is a comment in the MATHS dictionary called Section 2.3.

        2.2.4 A Mapping into Graphics(TEMPO).

        Sequences, unions and iterations map easily into pictures(chapter 7). Here are examples:

        Here is how

         A::=a #(b|c) d
        is shown in TEMPO on a non-graphic device:
                  A
               -------
              |   a   |
               -------
             (| b | c |)
               -------
              |   d   |
               -------
        The graphic representation for an intersection needs a third dimension. Instead we place the items diagonally and put an ampersand("&") at the top left hand corner:

        TBD

        2.2.5 Precedence

        Because union is defined as made of alternatives which, in turn, contain sequences we have:
      16. |-( a b | c d) = ( (a b) | (c d) ),

        The number sign ("#") always applies to the immediately following item and means "a number of (including none)". So if

      17. File::=Header #Record Sentinel, then each file has one Header, then (possibly) some Records, and then one Sentinel. It does not allow Header to recur. When
      18. File::=#(Header Record) Sentinel, there can be any number of Headers but each Header is followed by a Record. Therefore there will be the same number of Headers as Records in this File.

      19. File::=#(Header Record Sentinel), implies that Headers and Sentinels can occur many times - but always with one Record between them. In this File there will be the same number of Sentinels, Records and Headers because the "#" does not permit a part of the repeated sequence to appear without the rest of sequence also being present. Notice that this File can also be empty - with no Header, Record or Sentinel.

        To show that a Record might be missing we use

      20. File::=#( Header (Record|) Sentinel), or
      21. File::=#( Header O(Record) Sentinel).

        To show that several Records can occur for each header/footer pair we would write

      22. File::=#(Header #Record Sentinel) which implies that Headers and Sentinels can occur many times and with zero or more Records between them.

        Similarly, (a | #b) indicates either an a or any number of bs, as does (#b|a). This, plus the precedence of spaces over vertical bars means that a #b c | #b describes a string with a single a followed by many bs and one c or else many bs.

        Parentheses (()) are put around unions within sequences, and iterations.

      23. (a | b) (c | d) = reads "a sequence of a or b, end of sequence and then a sequence of c or d, end of sequence," Or informally
      24. "the first item is either an a or a b followed by either a c or a d."

        Notice that this describes four alternative forms because:

      25. (above)|-(a | b) (c | d) = (a c | a d | b c | b d). Similarly
      26. #(a | b) = = any number of as and bs in some order.
      27. #(a b) = =any number of pairs of one a followed by one b. In #(a b) each a must be followed by a b. Thus #(a b) is proper subset of #(a | b):
      28. (above)|-#(a b) =>> #(a | b).

        2.2.6 Applications

        Section 2 describes an important subset of MATHS and TEMPO. Element and defined_term must be defined to make the dictionary of section 2.3 useful. It can then be used for many purposes in the software process. A similar algebra can be used to specify programs and data bases. When other features are added to MATHS dictionaries we get a complete system of documentation. Section 3 (next) explores syntax description.

      . . . . . . . . . ( end of section 2.2 EBNF & Dictionaries) <<Contents | End>>

      2.3 Syntax Description

        2.3.1 Syntax

        In grammars(syntax) and lexicons(lexical descriptions) the following are used:
      1. element::=character_string | name_of_non_printing_character,
      2. character_string::=quote #( char ~ (quote | backslash) | backslash char)
      3. quote,
      4. quote::="\"",
      5. backslash::="\\",
      6. EBNF_define::=":" ":" "=".
      7. name_of_unprintable_character::=capital_letter #( capital_letter | digit ). The names of unprintable characters of ASCII are well known (NUL,..., SP, CR, LF, ...DEL). In MATHS the character names used in Ada are preferred [Ada LRM and Chapter 3, Section 6.4 ASCII_Code].
      8. identifier::= ( (letter | digit ) #id_character )
      9. & ( #id_character (letter | digit) ),
      10. id_character::=(letter | digit | underscore). An identifier does not start or end with an underscore - it is made of letters, digits and break characters, and must always start and end with either a letter or digit:
      11. defined_term::= natural_identifier & defined |... . When the defined_terms are identifiers and the expressions have no intersections or complements then the dictionary is context free. The three dots (...) indicate that there are other forms that a defined term can take(Chapters 3, 4, and 5). A natural identifier is made up of words and numbers in a natural language:
      12. natural_identifier::= word #(underline word)
      13. & (declared_identifier | defined_identifier | unknown_identifier). Words are either numbers or are made of letters. When made of letters they must be in a dictionary - either of a natural language or a special dictionary for the project,
      14. word::=number | ( (letter #letter) & in a dictionary ),
      15. number::=digit #digit,
      16. digit::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9". The MATHS rules for creating identifiers are discussed in section 6 of this chapter and defined with increasing rigor in chapter 3 and 5. Meanwhile note that it is common in MATHS to have several names with the same meaning - typically there is a short mathematical variable which is used in formulae plus a longer natural_identifier describing the meaning in non-mathematical terms.

        2.3.2 Traditional Semantics of EBNF

        (This section can be omitted on first reading)

        The meaning of EBNF is that each non_terminal and each set_expression represents a set of strings of characters. MATHS contains EBNF as a special case. A context free grammar(CFG) is a grammar without ("&") and ("~"). Here is an informal introduction to the problem of defining what a dictionary means (see chapter 3 for a more formal treatment). This section introduces ideas that reappear later in other contexts.

        First we need to define what we mean by "string" and "sets of strings." Several formal models of strings have been translated into MATHS [Appendix 1.Theories of strings]. In all theories, strings come from an empty string by concatenating symbols. Concatenation of strings is shown by an exclamation mark (!). It is an associative operation with a unit ( the empty string "" ). Each string, by definition, has a unique sequence of atomic symbols, which when concatenated generate the original string. Temporarily we symbolize strings by lower case variables and sets of strings by upper case variables. We next define operations of union(|), intersection (&), complementation (~), concatenation() on sets of strings by:

      17. A B::={c || c=a ! b and a in A and b in B}, -- the set of all c where c=a!b and a is in A and b is in B.
      18. A & B::={c || c in A and c in B},
      19. A | B::={c || c in A or c in B},
      20. A ~ B::={c || c in A and not ( c in B) }. Note. In MATHS we would code the above using:
      21. a in A::= a A, {x || W(x)} ::={c || W(x)}. [Section 5.3 of this chapter, and section 4 of chapter 4]

        A string is treated as a singleton set when it is in a context that needs a set of strings:

      22. For string c, c::sets_of_strings={c}. In this section I stands for the set of the empty string:
      23. I:sets_of_strings={""}. The definition of (#) is interesting and is the simplest example of the kind of reasoning used to define the meanings of more complicated systems. We could write,
      24. #A = I | A | A A | A A A | ... . but this formula is not formal enough to be used by a proof checker since ... is not defined. We could define the powers of a set (A^n) and then write an infinite union - but first we would have to define these ideas themselves. So we start from the property
      25. #A = I | A #A and define #A as the smallest solution to the equation X = I | A X:
      26. #A = the smallest { X || X = I | A X }.` The solution can be shown to be the smallest set of strings that contains the empty string and is unchanged when A is concatenated with it. Further Kleene showed that by repeatedly applying the operation that replaces X by `( I | A X) we get closer and closer to #A. So #A` is the smallest value of X solving:
      27. X::=I | A X . Kleene's model works for any grammar. For instance, the grammar `L::= a L b | "". has one definition, one defined_term {"L"} and two elements {"a", "b"}. We define the meaning of "L" , to be the smallest set of strings M` such that the following equation is true:
      28. M = a M b | I. So
      29. M= {c || for some x in M (c =a ! x ! b)} | I. The above gives a recursive process testing whether a string lies in M. It does not tell us how to find elements in M. By Kleene's work M is also the smallest fixed-point of the mapping g, where
      30. g(X) = a X b | I. So to find M we use an iterative process to generate a series of sets:
      31. S(0) = {},
      32. for all natural_numbers n, S(n+1) = g( S(n) ). The series gives successive approximations to M:
      33. S(0)={},
      34. S(1)= I | a{}b = I ,
      35. S(2)= I | a I b = {"","ab"},
      36. S(3)= I | a{"", "ab"}b = {"", "ab", "aabb"},
      37. S(4)= I | a{"", "ab", "aabb"}b = {"", "ab", "aabb", "aaabbb"},
      38. S(5)= I | a{"", "ab", "aabb", "aaabbb"}b = {"", "ab", "aabb", "aaabbb", "aaaabbbb"},
      39. ... It is intuitively obvious that these sets are tending toward a limit. The limit looks like

        The above value satisfies the fixed point definition of M = g(M). Any subset does not. The reason that the Ss seem to get closer to M is that we have to look at longer and longer strings to find a string in M that is not in S(i) as i tends to infinity. Using a limit to define the meaning of grammar has been around in the literature for a long time - but is not stated in many texts(bibliography in section 6.7 of chapter 3). The fixed point definition came later and similar to Scott's denotational semantics [ACM 86]. In chapter 3 we will have the notation to make the above more formal. Similar techniques are appearing across a wide range of research in software engineering and computer science [ Example: Tsai Wiegert & Jang 92]. We will return to fixed points and recursion in chapters 4 and 5.

        Note that the operations on strings and sets of strings do not normally need defining in a MATHS document. They are part of a standard package of operators that are predefined [See Appendix 4 and Chapter 4, section 8].

        2.3.3 Files and Data Directed Design

        Abstraction gives models that can be reused. Any system that fits the axioms fits the abstract model. To use the idea of string all we need is an empty object and a concatenation operator. For example if T stood for a set of records and #T for files, then the empty string models the empty file (UNIX /dev/null) and concatenation models writing one file after another (UNIX: `cat a b >c`). Consider
      40. magnetic_tape::=label #file end_of_tape_marker,
      41. file::=header #data_block sentinel.

        These define the magnetic tape's physical structure. If we were designing a program to process a file of enrollement records we might describe a file sorted by sections as follows:

      42. file::=#section,
      43. section::=class_header #student,
      44. student::=A_student | student~A_student.

        Strictly speaking a definition of form A::= B | A~B is a theorem `|- A = B | A~B`.

        If the domain has the words "file", "section" and "student" we use a similar structure, even if the class header is absent and a student record identifies the student's class instead. The result is called a logical structure.

      45. file::=#section,
      46. section::=#student,
      47. student::=A_student | student~A_student.

        If the problem is to summarize the data on each section then we might show the output as:

      48. output::= header_page #section_summary college_summary, and note that each section corresponds to exactly one section_summary and the input corresponds to the output

        It is easy to combine these into one structure and map it into a program structure:

        The structure above becomes a program when operations and conditions are added (See Section 6.5 below and DDD in Chapters 2, 6, and 9).

        Notice that the dictionary developed while designing a program must document all the important structures of the data - both the physical and the conceptual. Thus semantic and pragmatic terms are defined to clarify the meaning and use of data. This in contrast to the data dictionary developed by an analyst or system designer. These must be free of semantic and pragmatic structures because a file can be used for many purposes and hence is parsed into different structures. Typically:

      49. free_file::=# record,
      50. record::=type1 | type2 | type3 | ..., Now if a,b,c,... are elements then
      51. F::=#(a | b | c | ... ), is the free semigroup or unstructured structure. In categorical terminology this is a free, initial, or universal repelling object in the category of semigroups with units. Free dictionaries are the weakest useful dictionaries.

        Data dictionaries and TEMPO diagrams can solve a problem pointed out by Hoare in 1973: How to show a non-programmer what a program is expected to do. The canonical grammar or dictionary shows the user or client that we understand all their terminology. Canonical diagrams and dictionaries show how the user's data is handled in terms that the users and clients understand[Katzenberg & Piela 93]. The test of a canonical dictionary is that clients and users certify it as complete and accurate. Canonical dictionaries are the strongest useful dictionaries.

        2.3.4 Dynamics

        The earliest use of the regular operators (sequence, selection, and iteration) was to express the dynamics of a complex but finite systems[Kleene]. Several recent methods use similar models to help analyse and design software [ DAD, Robinson 79, JSD, SSADM, Zave 85, Botting 86a, Naik & Sarikaya 92, Shlaer and Mellor 92, MERODE by Dedene & Snoeke 94]. This section now presents a simplified example of how MATHS formula are used to document the changes in an entity (a real object) in order to deduce the code and/or state space of an object. The details are in chapter 6.

        Consider a button on a control panel. There are two types of event that can effect it - it may be pushed - requesting a service, and it can be serviced by some means or other. At first glance we have:

      52. Button_life_history::= #(Pushed Serviced), but switches bounce and humans beings push buttons several times while waiting. So the significant pattern of events in the life of one button is defined by the following grammar:
      53. Button_life_history::= #Request,
      54. Request::= Push #Push Service. Notice that part of the life history documents a particular type of scenario: The user requests that the something happens, and waits until satisfied [cf Lindland Sindre & Solvberg 94]. The corresponding program structure would be like this(Ada-like pseudo-code):
      55. Button: loop
      56. Request: loop
      57. accept push do skip; -- < initiate request >
      58. end accept;
      59. accept event do event_type:=<something>;
      60. end accept;
      61. loop while event_type \= service
      62. accept event do event_type:=<something>;
      63. end accept;
      64. end loop;
      65. --- <Service the button>
      66. end Request;
      67. end Button; Jackson showed how task types can be implemented in traditional high level languages like COBOL in 1975[Appendix 2]. Automata theory shows how to turn the regular grammar into the kind of state determined object popular in most object based methodologies [Lindland Sindre & Solvberg 94].

        2.3.5 Conclusion

        A dictionary or grammar based on EBNF can be used for a great many purposes. However it has well-known limitations. Much of this monograph is about the minimal changes to make EBNF into a general tool in software engineering. The next section (section 4) uses an example to introduce a new way to look at a EBNF definition - a statement like term::=formula is seen as doing two things - (1) declaring a variable (term::type) and (2) postulating that term's value is equal to the value of the formula(term=value).

      . . . . . . . . . ( end of section 2.3 Syntax Description) <<Contents | End>>

      2.4 An Example of Formal Analysis

        2.4.1 Introduction

        This section illustrates how MATHS documents "Reality" directly. Simple mathematics - sets, mappings, relations, definitions, assertions, etc.- can define important ideas: "An application domain system model of the equipment and its intended environment of use is defined. This is a dynamic system model, and defines the overall states of the system as functions of time. Requirements are constraints on this model" [Hansen Ravn & Rischel 91 ]. "the study of a system for the purpose of understanding and documenting its essential characteristics"[...]"All systems are embedded, either in the way we normally think about embedded computer systems or are in an organisation such as a business[so] the larger environment should always be considered, especially for analysis, which emphaiszes the comprehension and documentation of entire systems."[Embley et al 95] The illustration is an analysis of a set of elevators. Several conferences, papers, and methodologists have used the problem of designing a controller for a set of elevators and present a variety of plausible designs. See Jackson 83, Wing 88, Chang 89, Mili et al. 89, Papelis & Casavant 92, and many others. It is called the "Lift Problem" in the literature. Some designers remark on the problem of evaluating their design[Fickas & Helm 92 ]. Implementation steps have also been covered [ for example: Botting 87, and JSD+ADA in Yeung & Topping 90, Meyer 93 figures 14..17]. This section derives some of the knowledge needed to validate designs, specifications, and implementations.

        MATHS documents are a mixture of commentary and formal statements[cf McDermid 89 p136]. I also include some commentary about the notation that would normally be omitted.

        2.4.2 The Lift Problem

        [ 02_MATHS_AND_TEMPO.example.html ]

      . . . . . . . . . ( end of section 2.4 An Example of Formal Analysis) <<Contents | End>>

      2.5 Logic, Sets, and Structures

        2.5.1 Introduction

        The above section showed how MATHS encodes mathematical models. This section of this chapter describes the logic of MATHS. Section 5 makes ready for chapters 3, 4 and 5.

        2.5.2 Propositions and Predicates

        A proposition can have two possible meanings - true and false. In MATHS the symbol @ represents the Boolean type with two standard values {true, false}. The following table defines the operators - listed with the highest priority operator first.
        Table
        ExpressionValuesSyntaxSemantics
        Ptrue false-
        not Pfalse trueprefix@->@

        (Close Table)


        Table
        ExpressionValuesSyntaxSemantics
        Ptrue true false false
        Qtrue false true false
        P and Qtrue false false falseinfix(serial)@><@->@
        P or Qtrue true true falseinfix(serial)@><@->@
        P iff Qtrue false false trueinfix(parallel)@><@->@
        if P then Qtrue false true truespecial@><@->@

        (Close Table)

        Surprisingly, unlike most programming languages, mathematics has two kinds of infix operator. MATHS follows mathematics:

      1. Serial: P1 and P2 and P3 = (P1 and P2) and P3
      2. Parallel: P1 iff P2 iff P3 = (P1 iff P2) and (P2 iff P3).

        Notice that when predicates get complicated then it pays to be able to display them in the form of tables and/or charts - This has proved good for logic circuit design (Karnugh maps), Data Processing (Decision tables), and formal specifications[Leveson et al 94]:

        A predicate is a proposition that has variables, quantifiers and an equality relationship. MATHS spells quantifiers out and includes numerical quantifiers like "one", "no", "0..1". The syntax attempts to make the writing of formal statements as natural as possible. In the following X is a set or a type, x is a variable and W(x) a proposition containing x in a number of places. for all x:X(W(x))::=For all x of type X, W(x) is true, for x:X(W(x))::= for all x:X(W(x)), for no x:X(W(x))::=for all x:X(not W(x)), for some x:X(W(x))::=not for no x:X(W(x)).

        for 0 x:X(W(x))::= for no x:X(W(x) ), for 1 x:X(W(x))::= for some x:X(W(x) and for 0 y:X(x<>y and W(x)), for 2 x:X(W(x))::= for some x:X( W(x) and for 1 y:X(x<>y and W(y)) ). for 3 x:X(W(x))::= for some x:X( W(x) and for 2 y:X(x<>y and W(y)) ).

      3. . . . for 0..1 x:X(W(x))::=for all y,z:X( if W(y) and W(z) then y=z),
      4. . . .

        2.5.3 Sets

        For any type T there is an another type symbolized by @T with objects that are sets of objects of type T. In MATHS a subset of a type is not a type, but a subset (or set for short) does belong to a particular type and so determines the type of an element in it: If a set A is of type @T then As elements must be of type T. It is therefore unambiguous to use sets to declare the type of a new variable [compare sub-type in Pascal and Ada, also see Brinch Hansen's critique of Pascal]. In MATHS the symbol for the type is also symbol for the universal set of that type. It can also be used to indicate the type of a value -
      5. if e can have values of different types then T(e)=(e).T is the value in T.


        Table
        ExpressionMeaningType Example.
        Sets
        {}, ∅null or empty setAmbiguous
        Tuniversal set @TNumbers
        {e1} = $(e1)singleton @T (e is of type T){1}
        {e1,e2,e3,...} extension @T (e1,e2,...are type T) {1,2,3}
        {x:T || W(x) } intention @T{n:Nat||n*n=4}={2}
        ${D} Set satisfying documentation D[See section 5.5].
        S1 & S2, S1 ∩ S2 intersection@T><@T -> @T..n&m..=n..m
        S1 | S2, S1 ∪ S2union@T><@T -> @T..n-1 | m+1..
        S1 ~ S2complement@T><@T -> @T= Nat ~(n..m)
        Predicates
        e in S1, e ∈ S2,
        e is in S2 membershipT><@T -> @ 1 in 1..
        S1<> S2, S1 \noteq S2inequality@T><@T -> @1..<>0..
        S1 = S2 equality @T><@T -> @1..2={1,2}
        S1==>S2, S1 are S2,
        S1 ⊆ S2 subset or equal@T><@T -> @1..2==>1..3
        S1=>>S2, S1 ⊂ S2 proper subset @T><@T -> @1..2=>>1..3
        S1>==S2S1 partitions into S2@T><@T -> @{1,2,3}>=={{1,2},{3}}
        S1S1 is not empty@ (overloaded) ..n+1&n-1..
        no S1S1 is emptySee below( Q A)no ..n-1 & n+1..


        (Close Table)
        We can use a subset of a type anywhere a type could be used, because a subset determines the type of the elements in it. For example if A is a subset of T
        (A:@T): for all x:A(W)::=for all x:T (if x in A then (W)), for some x:A(W)::=for some x:T ( x in A and (W)). In English and mathematics the common idiom puts a type name before the variable - for example:

      6. For all real x>0 there are two real y's such that y^2=x, or in MATHS
      7. for all Real x where x>0, 2 Real y where y^2=x, meaning
      8. for all x:Real (if x>0 then for 2 y:Real (y^2=x) ). Another useful shorthand is like There are two elevators and five floors. The MATHS form is 2 Elevators and 5 Floors. In general if A is any subset of a type T, and Q is any quantifier(all,some,none,...or a number ):
      9. Q A::= for Q x:T (x in A). Thus|-no A iff A={},
      10. 1 A iff for some a:T(A={a}), . . .

        2.5.4 Lists, strings and n-tupls

        Since lists, string and n-tuples have been useful in many projects and theories, MATHS has them. They basic model uses maps from subranges of the integers.
      11. For T:Types, %T::Types= { x: (n..m)->T || for some n,m:Integer },
      12. n..m::@Integer={i:Integer || n<=i<=m}. The explicit notation for an element of %T based on 1..n is
      13. (e1,e2,e3,e4,...en) however a list is a map so that we can express a list like this
      14. fun[ i:1..n](ei). Because lists are maps we can invert them, which gives a neat way to attach numbers to a finite collection of things, for example to express the priorities of operators:
      15. priority= /("not", "and", "if", "or", "if__then__"). Lists are manipulated by set of maps from %T into T are defined with the obvious meanings: 1st, 2nd, 3rd, 4.th, ..., n.th, ... last plus a 'prefix/postfix' operation(?) that generates all lists/n-tupls:
      16. For t:T, l:%T, t?l and l?t in %T.
      17. For t:T, l:%T, t=1st(t?l) and l= rest(t?l) and t=last(l?t) and l=front(l?t). %%T (lists of lists), are NOT lists of elements:
      18. ((1), (1,2), (1,2,3)) is a different type of object to (1,2,3). Chapter 6 discusses how we can model LISP-like tree structures.

        Strings are also part of MATHS. #T is the type of object generated by concatenating objects of type T together. They can be modeled by lists and inherit the operations, properties, and predicates acting on lists. In addition

      19. for any x,y:#T, x!y in #T and (!) is an associative operation with a unit (). So strings=#char and files=#record, etc. If T is the type of elementary objects then #T is the "free" semigroup. Unlike lists, ##T=#T. Here is a sample of the kind of definitions are possible:
      20. filter::(@T>< #T)->#T, q::#T->#T.
      21. For t:T, x:#T, A:@T, a:A, b:T~A, filter(A, ())::=(), filter(A, (a!x))::=a!filter(A,x), filter(A, (b!x))::=filter(A,x), q(())::=(), q((t))::=(t), q(t!x)::=q(filter(/<(t), x) ) ! filter({t}, x) ! q(filter(/>(t), x) ). The function q is from Wainright's Miranda examples[page 148 in Wainwright 92].

        2.5.5 Structures

          Computer science needs a model of structured objects and data, etc.. The simplest structure is a set of numbered n-tuples - the Cartesian Product:
        1. @(A,B,C,...)=@(A><B><C><...)={(a,b,c,...)|| a:A, b:B, c:C, ...} Modern formal notations from Ada to Z provide means for creating a new `type of object` by (1) listing components and (2)specifying constraints and operations on these components. In MATHS any set of declarations can be interpreted as defining a set of labeled tupls. If D is a collection of declarations, definitions, axioms, theorems and comments then Net{D} is called a set of documentation. A dollar sign converts it into a set of tuples. Net{D}=${D} is the set of objects that satisfy D. If Net{D} is called N then N means the same thing as Net{D}. For example: (Times, <) in LOSET.

          2.5.5.1 Labeled Tuples

          The formal definition of MATHS structures requires techniques that are not defined until chapter 4. Here are some examples of tupls, sets of tupls and sets of sets of tuples. You can think of these as records, tables and data bases in a programming system respectively. You can think of each as a small spreadsheet that automatically keeps itself uptodate whenever any cell changes. You can also think of them as objects. Bear in mind that programs are finite implementations however. Formal definition is postponed until chapter 5.

          Tupls look like this: (a=>xa, b=>xb, ...). They are in a type defined by one of the following ways:

          Short forms

        2. S::=${ a:A, b:B, c:C, ... },
        3. or S::=Net{ a:A, b:B, c:C, ... },
        4. or N::=Net{ a:A, b:B, c::C, ... },
        5. S::=N.

          Long definition

        6. TBD Example
        7. phones::=${name::Strings, number::Strings}.
        8. |-(name=>"John Smith", number=>"714-123-5678") in phones.

          Each label in a MATHS structure can be used to extract components from tupls with the dot operator of Ada, C, Pascal,..., and Z:

        9. (name=>"John Smith", number=>"714-123-5678").name="John Smith"
        10. (name=>"John Smith", number=>"714-123-5678").number="714-123-5678" The dot fits with mathematics since '.' projects a vector into a scalar. MATHS formalizes projections as functions mapping structured objects to their components. It has both f(x) and x.f notations [Knuth 74]. So if x in X and X=${a::A,b::B, c::C,... } then
        11. a in X->A. a(x)=x.a= the a of x.
        12. b in X->B. b(x)=x.b= the b of x.
        13. c in X->C. c(x)=x.c= the c of x. Hence the TEMPO diagram:

          Because projections are maps, they are relations and so have converses. So for eaxmple

        14. y./a= { x || x.a=y }. MATHS also defines projection operations onto lists of elements:
        15. x.(a,b) = (x.a, x.b) in A><B,
        16. x.(a,b,c) = (x.a, x.b, x.c) in A><B><C.

          One way to imagine a MATHS structure is to think of it as a record or object in a high level language or a labeled tupl in a RDBMS. The formal definition implies a collection of maps. Thus phones=Net{name,number::Strings} implies name in phones->String and number in phones->String. Formally, phones is the universal (free, initial, or universally repelling) object in the category of types with names and numbers. In other words: all types with two functions called name and number mapping objects into the Strings, is modeled by phones, because for any such type there is a function that maps phones into it without loosing the information about names and numbers. Therefore phones encodes all sets of objects with names and numbers(details in Chapter 5).

          2.5.5.2 Subclasses of a structured type

          Given the idea of a tupl and a structured type it is easy to allow a notation for collections of tupls that satisfy a given constraint - for example the set of phones with prefix "011-" might be written as Net{name,number::String, number in "011-" String. } In MATHS Net{a::A, b::B, ... z::Z, W(a,b,...z),...}= {(a=>u,b=>v,...):${a::A, b::B, ... z::Z) || for some u:A, v:B,... (W(u, v,...)... ) }
        17. The following are not the same:
        18. {x:A || W(x) } Set of objects in A satisfying W [Chapter 4].
        19. {x in A, W(x) } Set with Boolean elements, either {true},{false}, or {true,false}. Net{x::A, W(x) } Documentation with variable x in set A and assertion W(x)[Chapter 5]. Net{x::A, W(x) } Set of objects (x=>a) where a is in A and W(a) is true.

          2.5.5.3 Sets of Objects

          Given a set or class X in @T then we can talk about its subsets (@X). Finite sets of similarly structured elements are familiar to computer people as files, tables, and lists and are written finite@X. For example:
        20. Suppose phone_data::= finite@PHONE
        21. where PHONE::=Net{name::Names, number::Numbers, place::{home,work} },
        22. Names::@Strings,
        23. Numbers::@Strings. So if P={ (name=>"John Doe", number=>"714-123-5678", place=>home),
        24. (name=>"John Doe", number=>"3678", place=>work),
        25. (name=>"Joan Roe", number=>"714-123-5878", place=>home),
        26. (name=>"Joan Roe", number=>"3878", place=>work)
          }
        then P in phone_data. Further name(P) = P.name = the names in P ={"John Doe","Joan Roe"}, and ("John Doe",work)./(name,place).P.number="3678".

        In general a property of the components defines a set of objects:

      22. P(name="John Doe")={(name=>"John Doe", number=>"714-123-5678", place=>home),
      23. (name=>"John Doe", number=>"3678", place=>work)}. But a single object can often be identified by specifying enough properties:
      24. |-the P(name="Joan Roe" and number="714-123-5878")=(name=>"Joan Roe", number=>"714-123-5878", place=>home), and
      25. |-the P(name="John Doe" and place=home) =(name=>"John Doe", number=>"714-123-5678", place=>home).

        Codd has shown that any set of data can be expressed as simple structured tables like the above. Thus we can model all data as tuples. A collection of sets of tuples is a relational data base and forms a category. The objects in an object oriented design also forms a similar category [Freedman 83]. Adding constraints to sets of tuples proves to make specifications much simpler[ Z, Gal & Etzion 95]. Structures form a conceptual model of the area being studied that is readily implemented using modern technologies such as data bases, logic programming, and objects[Chapter 5].

        2.5.5.4 Functional Dependencies And Keys.

        Notice that P above satisfies the requirement that we can look up a number. Given the items name and place they determine a unique number. We document this fact by writing:
      26. P in finite@PHONE (name, place)->(number).
      27. the P(name="John Doe" and place=work)=(name=>"John Doe", number=>"3678", place=>work) We can define the phone books as those finite sets in @PHONE that relate a name and place with one number:
      28. Phone_books::=finite@PHONE(name, place)->(number). Identifiers or keys like (name, place) appear in all computer applications. Further Codd's work suggests that all data can and should be thought about and manipulated as a collection of such sets[ DATA].

        2.5.5.5 Binary Relationships

        A special kind of relationship is one with two components - equivalent to a set of 2-tuples. They are turn up in all formal and natural settings:
      29. x above y
      30. x > y
      31. x = y
      32. x owns y
      33. x is_the_mother_of y The set of all relations between sets A and B is written @(A,B). A relation is typically defined by a formula like the following:
      34. For x:A, y:B, x R y::@= W(x,y), where W(x,y) is a well formed formula that returns {true, false}. There is also a formula for R it self:
      35. |-rel [x:A,y:B](x R y) = R. We can also express relations as sets of pairs or as boolean functions depending on the context:
      36. {(x, y):A><B || x R y },
      37. map [x:A,y:B](x R y),
      38. (1st)R(2nd).

        Binary relations lead to a language for program specification and design -- see sections 6.3 and 6.4 below.

        2.5.5.6 Readability

        MATHS has ways of making relationships with more than two components readable [compare Mark & Cochrane 92]. First, we can declare n-ary relationships as a binary relation between actors/subjects and structures.
      39. pays::@(person, ${it::money, to::person}).
      40. gives::@(person, ${it::object, to::person}).
      41. sells::@(person, ${it::object, to::person, at::money}).
      42. buys::@(person, ${it::object, from::person, at::money}).
      43. For a,b:person, p:object, m:money,
      44. a buys(p, from=>b, at=>m)::=b sells (p, to=>a, at=>m),
      45. b sells(p, to=>a, at=>m)::= a pays (m, to=>b) and b gives (p, to=>a). The expressions then simulate normal English: a buys (p, from=>b, at=>m). There is a mild abuse of notation in that the label "it=>" is omitted [Chapter 4 Logic, Section 7.3 Abuses of Notation]. The notation is similar to Smalltalk key-word messages: a buys: p from: b at: c or Grady Booch's Ada. Second, the syntax of natural predicates is more readable(Chapter 4). As an example:
      46. For a,b:person, m:money, p:object, (a buys p from b at m)::=a buys (it=>p, from=>b, at=>m). This is unambiguous if the brackets ar not omitted and the intervening words not duplicated. Further generalization on a "natural" and/or "distfix" syntax is speculative [Appendix 3, section 3].

      . . . . . . . . . ( end of section 2.5.5 Structures) <<Contents | End>>

      2.5.6 Example - A University

      Here are two models of a college or university that illustrate the above ideas. The first is an early attempt at a formal analysis and the second the result of many revisions.

      After some more analysis the model became: [ UNIVERSITY in rjb93b.discrete ]

    1. UNIVERSITY::=
      Net{
        Sets
      1. P::Finite_Sets=People.
      2. S::@P=Students, |- S in Finite_Sets.
      3. F::@P=Faculty, |- F in Finite_Sets.
      4. C::Finite_Sets=Courses:Finite_Sets,
      5. E:::@C=Sets_of_electives, |- E in Finite_Sets.
      6. N::Finite_Sets =Sections,
      7. D::Finite_Sets=Departments,
      8. Q::Finite_Sets=Qualification,

        Maps

      9. d::=dept::C->D. Each course is offered by one dept.
      10. d::=dept::Q->D. Each qualification belongs to a department.
      11. c::=course::N->C. Each section covers one course.
      12. |- (=): S->P. Students are people.
      13. |- (=): F ->P. Faculty are people.

        Relations

      14. T::@(F, N)=Teacher. For f,s, f T s::=f teaches s.
      15. L::@(S, N)=Enrollment. For t,n, t L n::=t is enrolled in n.
      16. M::@(F, D)=Member. For f,d, f M d::=f is member of d.
      17. V::@(E, Q)=ElectiveSet. For e,q, e V q::=e is set of electives in q.
      18. W::@(S, Q)=Declared. For t,q, t W q::=t has declared major q.

      19. R::@(C, Q)=Required. For c,q, c R q::=c is required for q,
      20. c R q::={c} V q. |- in:@(C, E). For c,e, c in e::=c is an elective in e.

        Properties For f:F, s:N, if f T s then f M d(c(s))

        Conceptual Model

        (Each vertical line is a mapping - a "has-a" relation. "n" represents a range of numbers yet to be determined. '+' shows that the lower class inherits all the properties, etc. of the upper one, an "is-a" relation)

          People---------------------------
            |1                             |1
            +                              +
            |0..1                          |
         Faculty   Department              |
          |1  |1   |1 |1   |1              |
          |   +    |  |    |               |
          |   |0,1 |* |    |*              |0..1
          |   Member  |   Qualification  Student
          |           |          |1   |1  |1  |1
          |           |          |    |*  |*  |
          |           |*         |   Declared |
          |        Course        |*           |
          +        |1    |1  ElectiveSet      |
          |        |     |       |1           |
          |        |*    |*      |*           |
          |     Section   --in---             |
          |     |1    |1                      |
          |0,1  |*    |10..200                |0..10
          Teacher      -----Enrollment--------

        Standard technology and techniques can be used to implement sets and mappings - using (1)files, (2)a DBMS[ DATA], or (3)data structures plus OOP[ OOP]. An alternative is to treat each entity and relationship as a functor or predicate in a logic programming system[Keller 87, PROLOG]. However classes can not be derived from the static analysis above. It should now be extended by a study of the dynamics of the entities and relationships so that object-oriented classes (with both attributes and methods) can be specificied. Two examples of this were given for a push button in sections 3 and 4 above. Further an inheritance structure often needs to be worked out in detail in complex models to make the logical desin and physical implemntation more reusable - this is covered in section 6.2 next.

      . . . . . . . . . ( end of section 2.5 Logic, Sets, and Structures) <<Contents | End>>

      2.6 Special Techniques

        2.6.1 Introduction

        MATHS is able to express:
        1. 2.6.2 SubClasses and Subtypes
        2. 2.6.3 Program Specifications
        3. 2.6.4 Binary Relations
        4. 3.6.5 Serial Data
        5. 2.6.6 Documentation
        6. 2.6.7 Names
        7. 2.6.8 Expressions
        8. 2.6.9 Abstraction

        These are covered in this section.

        2.6.2 SubClasses and Subtypes

        Commonly a set of entity types form a classification or taxonomy. In this an entity type (genus = class = Ada type = Frame) has different forms( species =subclasses=Ada Subtypes = Special cases). We can use the idea of a set theoretic partition to formalize it:
      1. (gender): animals>== {male, female}. -- animals are either male or female, but not both
      2. (liveness): animals>== {live, dead}. -- animals are either alive or dead, but not both animals>=={ mammals, reptiles, insects, ... },
      3. mammals >=={cats, dogs, horses, ...}, So, for example:
      4. shadow in female & dogs & live. -- Shadow is a living female dog.
      5. (gender)|-no male & female.
      6. (gender)|-animals=male | female. This is exactly equivalent to labelling the elements in the general set as feminine or masculine say. Mathematically we can give a map or function from the elements to a set of labels:
      7. gender::animals->{masculine, feminine}, Similarly from the other partitions:
      8. class::animals->{mammal, reptile, insect,...},
      9. species::mammals->{feline, canine, equine,...},
      10. alive::animals->@. So,
      11. shadow.(species,gender,alive)=(canine,feminine,true). Partitions and labels are equivalent because if f maps A to B then the set { b./f || b:B} is a partition of A and so,
      12. cats = feline./species, dogs=canine./species,...
      13. male=masculine./gender, female=feminine./gender
      14. alive(live)={true} and alive(dead)={false}. Several labels {class, gender, alive,...} often appear in the definition of the set:
      15. animals::=Net{ .... class::{mammal, reptile, insect,...}, gender:: {masculine, feminine}. alive::@,...}. We can abstract these, name them, and re-use them:
      16. ANIMATE::=Net{alive::@, birth::Times,...}
      17. animals::=Net{ .... class::{mammal, reptile, insect,...}, gender:: {masculine, feminine}. ANIMATE }.
      18. MAMMAL::=Net{ ANIMATE, WARM_BLOODED, VIVIPAROUS. class=mammal....}.
      19. WARMBLOODED::=Net{...}, VIVIPAROUS::=Net{...}. Now, there are several species of mammal: MAMMAL(cat), MAMMAL(dog), ... . We can get this effect by declaring a Boolean variable and use it as a condition
      20. MAMMAL::=
        Net{
          ANIMATE, WARM_BLOODED, VIVIPAROUS. class=mammal.
        1. species::{feline, canine, equine, ...},
        2. cat::=(species=feline), dog::=(species=canine),
        3. horse::=(species=equine),...

        }=::MAMMAL.

        So for x:MAMMAL, cat(x) iff species(x)=feline, see Chapters 5 and 6 for details.

        In frame theory there are normal and abnormal forms[Tsai Wiegert & Jang 92, Kaindle 93]. In MATHS it is possible to develop two pieces of documentation for each case and keep what is common to both separate, for example see the theory of ORDERS in Appendix 1.

        The somewhat confusing concept of inheritance found in object-oriented programming seems to be an attempt to implement these classifications. In chapter 5 I define inheritance as a relationship between different types (not between objects, nor between subsets of the same type). One Type inherits from another if the signature of the first is a subset of the signature of the second, and for all other types, if the second type appears in the signature then so does the first type. Using inheritamce can lead to a simpler diagram in complex systems. It is also claimed to lead to more re-usable code[Gamma et al 94].

        2.6.3 Program & System Specifications

        Relations behave like the operations and conditions in a programming language because each operation in a program(or an object!) defines a relationship between the current state and the possible future states. If there is always a single next state then the relation is deterministic. By composing deterministic steps carefully we get an algorithm. In turn an algorithm can be coded for input to a suitable translator giving an executable program. Alternately we can start from a nondeterminstic relation that describes a set of acceptable behaviors and let the implementor selects a particular subset as an implementation [compare with Kowalski, Hoare 87].

        MATHS relations can be written in a form that makes them look like statements in a programming language. Some syntactic sugar is defined for while/for/loops/. Here follows an introduction. Formal treatment comes later when we have the techniques [Chapters 5].

        The key definition is that of a dynamic predicate. A predicate is a proposition that has free variables. A dynamic predicate has free variables with primes (') attached to them. Here is an example:

      21. x'=x+y After the statement, the value of x is equal to the sum of the old values of x and y. Alsoy is unchanged. It could be coded as x:=x+y in Ada/Pascal/... The rule is that "a prime means the new and no prime means the old" [from Ross Ashby 56, compare Part II of Lam & Shankar 90, Hehner 84a & 84b, Guttag & Horning 91, Steward 93]. Formally x'=x+y defines a relation on a structured set with variables x and y. So
      22. x'=1 guarantees that x is 1 afterwards and also that no other variables change.
      23. x'=1-x means that the value of x will be complemented - its new value will be the result of subtracting its old value from one while all other variables stay the same.
      24. x=1 has no primed variable. It means that x has the value one, and does not change - all variables are left untouched. It is a relation that holds between identical states when and only when x has value 1. It filters out all states when x is not 1. It allows computation under a condition. We call it a condition therefore. It can be shown that it is a condition in the sense of Arbib and Manes [87].

        Here is a more general statement of the MATHS interpretation of a predicate that contains variables with primes. Consider a simple state space S which declares m variables x1, x2,...x[m] of types X1, X2, ...X[M]:

      25. S::=Net{x1::X1, x2::X2, ..., x[m]::X[m]}. Let P(y1',y2',y3',...y[n]',x1,x2,x3,...x[m]) be a predicate with n distinct primed variables(y1, y2,...y[n]) which may also appear without primes as xs. For example in x'=x+y, x1=x, x2=y, and y1=x.

        The relation defined by P(y1', y2', y3', ...y[n]', x1, x2, x3, ...x[m]) is found by:

      26. (1) Replacing primed variables (y1,...) by values in the new state(s2), and
      27. (2) Replacing non_primed variables(x1,...) by values in the old state(s1), and
      28. (3) Adding the requirement that the static (unprimed) variables do not change. So:
      29. P(y1',y2',y3',...y[n]',x1,x2,x3,...x[m])::=
      30. rel [s1:S, s2:S](
      31. P(s2.y1, s2.y2, ..., s2.y[n], s1.x1, s1.x2, ...,s1.x[m]
      32. and for all i:1..m where x[i] not in {y1, y2, ..., y[n]}, s2.x[i]=s1.x[i]
      33. ).

        A more rigorous definition will be worked out in Chapter 5 meanwhile here are some examples. To specify a process that solves two simultaneous linear equations we write

      34. a*x'+b*y'=c and d*x'+e*y'=f. To specify a process that solves a quadratic equation we write
      35. a*x'^2+b*x'+c=0. On the other hand
      36. a'*x^2+b'*x+c'=0, specifies that x is left alone and a, b, and c are adjusted to fit the equation.

        Dynamic predicates can be used with any type of variable. The Programming Research Group at Oxford University, England has shown how dynamics can be simply specified as changes to sets[ Z]. Sets specify the possible data bases. This can be done in MATHS. Here is an example from the Elevator system(section 4 above):

        2.6.4 Binary Relations

        A relation in R:@(T1,T2) can be defined by
      37. For x:T1, y:T2, x R y::@= W(x,y). or equivalently by
      38. For R::@(T1,T2)= W((1st), (2nd)). /R is the converse relation to R. For example if (_^2) is the 'square' function then /(_^2) is the square root relation. Similarly the arc sine function is written /sin. Further we want /parent_of=child_of. MATHS defines
      39. For R:@(T1,T2), /R::@(T2,T1)= (2nd) R (1st). Given two binary relations R and S then
      40. R; S = `The relation between (1st) and (2nd) such that for some z, (1st) R z S (2nd)`,
      41. R | S = `The relation between (1st) and (2nd) such that either (1st) R (2nd) or (1st) S (2nd)`,
      42. R & S = `The relation between (1st) and (2nd) such that both (1st) R (2nd) and (1st)S (2nd)`. For R: @(T1,T2), S:@(T2,T3),
      43. R;S::@(T1,T3)= rel[x:T1,y:T3](for some z:T2 (x R z and z S y)),
      44. S o R::@(T1,T3)= R;S.
      45. R::@(T1><T2)= {(x,y):T1><T2 || x R y }. In context, a set can also be used as a relation:
      46. For A:@T, A::@(T,T)=rel[x:T, y:T](x in A and x=y). Complex situations can be defined quickly using these notations [From Logic.Natural.KINSHIP]: male, female::@being.
      47. parent::@(being,being)=(1st)is a parent of(2nd).
      48. ...
      49. child_of::=/parent. father::=male;parent.
      50. mother::=female;parent.
      51. son::=male;child_of daughter::=female;child_of.
      52. grandfather::=father;parent.
      53. ...
      54. granddaughter::=daughter;child_of. grandson::=son;child_of.
      55. ... sibling:@(being,being)::= (/parent;parent) ~ (=). -- I am not my own sibling!
      56. sister::=female;sibling.
      57. ...
      58. uncle::=brother;parent. niece::=daughter;sibling.
      59. first_cousin::=/parent;sibling;parent.
      60. second_cousin::=/parent;/parent;sibling;parent;parent.

        Relationships change in real systems. Indeed most Z specifications contain scheme that describe the changes that can occur in a collection of sets and relations. For example, in a university in section 5.6 above a student can add and drop sections of classes:

      61. For t:Students, n:Section,
      62. add(t,n)::=add student t to section n::= not(t L n) and L'=L | {(t,n)}.
      63. drop(t,n)::=drop student t from section n::= (t L n) and L'= (pre(L)~{t});L.

        Relationships can also be defined in a EBNF like form - complete with recursion:

      64. lineal_descendant::= child_of | lineal_descendent; child_of,
      65. ancestor::=parent | parent; ancestor
      66. cousin::=/parent;sibling;parent | /parent;cousin;parent,
      67. first_cousin_once_removed::=/parent;first_cousin | first_cousin;parent.
      68. [From Logic.Natural.KINSHIP]

        The definitions of ancestor and lineal_descendant illustrate the transitive closure or ancestral relation of parent and child. The reflexive transitive closure is important and defined as follows:

      69. For all Types T, R:@(T,T),
      70. do(R)::@(T,T)=The reflexive transitive closure of R ::= `The relation between x and y such that x=y, or x R y or x R;R y, or...`. Example:|-ancestor=parent; do(parent), and lineal_descendant=child_of; do(child_of). The defining property of do is that if something is invariant (or fixed, or closed) under R then it will not be changed by any number of Rs:
      71. For R:@(X,X), F:@X, if F in inv(R) then F.do(R)==>F.
      72. Example: families.do(parent)==>families.
      73. Where inv(R)::= the invariants of R::=`the sets X such that if x in X and x R y then y in X` ::={F:@X || F.R==>F }. Example: families::=inv(parent).

        We can also define the power of a relation [cf Antoy & Gannon 94] via the following equations:

      74. R^0 = Id, R^(n+1)=R^n;R, R^(-n)=(/R)^n. The above definitions are equivalent to the well known forms:
      75. |-do(R) = Id | R | R^2 | R^3 | ... = |{ R^n || n:0.. },
      76. |-do(R) = &{ Q || Id==>Q and R ==>Q and Q;Q==>Q }. Similar to the idea of an invariant set of states is the idea of a stable function:
      77. stable(R)::=The stable functions under R::={f || R; f =f }. and, so if f in stable(R) then f in stable(do(R)). Also `for all R and f, f is stable under no(R)`. These ideas are a simple and intuitive way to handle mathematical induction, but have a solid logical base in Principia Mathematica [Whitehead & Russell]. For example, the following expression:
      78. SUM::=(n'=1 and s'=0; do(n<>101; s'=s+n and n'=n+1); n=101). SUM looks like a program but is an expression in the calculus of relations. This has a positive and a negative consequence. As an expression it has provable properties and can be manipulated rigorously into equal forms. However, some equal. are quite different programs! For example, we can prove that if m is greater than 0 then
      79. (p'=m*n and i'=m+1) is the same relation as
      80. ( p'=0; i'=0; do(i<m; p'=p+n; i'=i+1); i=m+1) however they performance quite differently when running on a machine.

        Returning to how we could manipulate SUM above: First since the (+) is defined to be an operator on numbers, so the smallest state space is ${n,s:Numbers}. Next define names for the parts of SUM:

      81. S::=(n'=1 and s'=0).
      82. L::=(s'=s+n and n'=n+1).
      83. E::=(n=101). W::=not E.
      84. |-SUM= (S; do(W; L); E). Now s is stable under E:
      85. E ==> (s'=s) and after E we know that n=101 because
      86. post(E)=img(E)={ x :X || x.n=101}. Similarly after S we know that n=1 and s=0. Further W by definition will not change s and n. Next we look for something that does not change in L. Both s and n change - in parallel:
      87. |-L = ( (s',n')=(s+n,n+1) ). With a little creativity or experience we discover the following invariant set of L:
      88. F::= ( s=1+2+3+4+...+(n-1)). (In a well written code, of course, this invariant is a comment in the source code)
      89. |-F; L ==> F,
      90. |-F in inv(L). Similarly|-F in inv(W) and inv(E), so|-F in inv(W;L). Thus|-(F; do(W; L)) ==> F.

        Now|-S ==> F and F;E==>F. so

      91. |-SUM = (S; do(W; L); E) ==> F and n'=101

        Hence, after SUM, s= 1+2+3+4+...+(n-1) where n=101 = 5050.

        Notice that MATHS is not a programming language. So

      92. x (do(W; L)) y is not the same as
      93. x (do(W; L); E) y. in the first case y.n can have many values. In the second y.n will be 101. Similarly, when
      94. x ( i=1; i'=2 | i'=3 ) y permits y.i=2 even if i<>1, but if x.i=1 then y.i MUST be 2. Whatever happens y.i is either 2 or 3. Such non-determinism seems to be essential for a simple model of processes[Dijkstra 76, Hoare 78, Hoare 85, Gabrielian & Franklin 91].

        MATHS includes structures like those of structured programming because: For R:@(S,S),

      95. no(R)::@(S,S)= ((1st)=(2nd) and for no y((1st) R y). For A:@S, R:@(S,S), while(A)(R)::=(do(A; R); no(A)), For I,J,K:@(S,S),R:@(S,S), for(I, J, K)(R)::=(I; while(J) ( (R); K ) ), For A:@S, R,T:@(S,S), (if A then R else T fi)::=(A; R | no A; T). We can, for example, when n<=m define:
      96. SUM(f, i, n, m, s)::=for(i'=n, i<=m, i'=i+1)(s'=s+f(i)) and show
      97. SUM(f, i, n, m+1, s) = SUM(f, i, n, m, t) and s=t+f(m+1).

        MATHS expresses something like an extension of Dijkstra's Guarded Commands - see chapter 6 for details. Any group of MATHS users can document and then use its own set of abbreviations. For example:

      98. ADA_PDL::=
        Net{
          ...
        1. For A,B,C:@(S,S), loop A; exit when B; C; end loop::= (do ( A; no(B); C); (A; B) ).
        2. ...

        }=::ADA_PDL.

        MATHS can express mixtures of conditions and actions in one relation. A well known enigmatic program is often written:

      99. Enigma::= ( while(n>1) (
      100. if n in even then n'=n/2
      101. else n'=3*n+1
      102. fi
      103. ) ).

        In MATHS it can be transformed:

      104. |-Enigma=while(n>1)( 2*n'=n | 2*n'=3*n+1 ).

        MATHS makes the derivation of programs more logical: (1) state the properties of the changed variables as a set of equations and then (2) solve for the primed variables. Similarly, a non-deterministic relationship can often be refined to a form that is compatible with a declarative programming system such as Prolog [Compare Lamport 87].

        Finally, MATHS relational expressions follow the calculus of relations which can be used to simplify sequences into disjunctive normal form, like this:

      105. R = A1 and A2 and A3 and... or B1 and B2 and... or... Typical simplifications are
      106. |-R;(S|T) = (R;S) | (R;T).
      107. |-(P(x, y); Q(x, y, y')) = (P(x, y) and Q(x, y, y')).
      108. |-(Q(x, y); x'=e(x, y) ) = (x'=e(x, y) and Q(x , y)).
      109. |-(x'=e(x, y); Q(x, y, y')) = (x'=e(x, y) and Q(e(x, y), y, y')).
      110. |-(P(x, y,y') | Q(x, y,y')) = (P(x, y,y') or Q(x, y,y')). However notice that these results say that the simplifications give expressions that have the same effect, not they are the same program.

        2.6.5 Serial Data

        Programs can access data from a source, one datum at a time. They need to put items into a stream of data one at a time. Similar commands send streams of messages to other programs or data into files. MATHS's string and list model serial streams of data. The string and list expressions in dynamic predictaes mimic reading and writing.

        Suppose we are interested in objects in a set A. #A is the set of sequences of objects in A. MATHS defines the following operators (See Chapter 4(section 8), chapter 5, and sample at the end of book):

      111. s ! t::#A=the concatenation of strings s and t. e ? s::#A=result of putting object e in front of string s. s ? e::#A=result of putting object e behind string s.
      112. s:!x = ( s'=s !x ) -- put a copy of x 's value at the end of s,
      113. x!:s = (x ! s = s') -- put a copy of x 's value in front of s [compare Lam & Shankar 90].
      114. e?:s = ( e ? s' = s ) -- e was the first item of S and is removed.
      115. x'?:s = ( x' ? s' = s )-- the first item is popped off s into x
      116. s:?x' = ( s' ? x' = s ) -- get the last x from s A small technicality arises - for inter-process communication there must be an explicit "end of sequence" signal after the end of the data. MATHS uses the word 'end' as a generic "There ain't gonna be no more" message. By convention 'end' is never used as data in a string.

        Here is an example Data Directed solution to the problem of replacing double asterisks by a caret symbol [Hoare 78, compare Chandy & Misra 88].

      117. star::="*", caret::="^". The sets of possible inputs and outputs are I and O respectively:
      118. I::= #( star star | char) end,
      119. O::=#( caret | char) end. Where star star corresponds to caret and I to O:

        The variables i and o are used for the standard input and output of a program so:

        The above looks deceptively like pseudocode, but may demand unusual coding techniques (Appendix 2). The theory is developed in chapter 3, 4, and 5, and its application to programming is in chapter 6.

        2.6.6 Documentation.

        There is no advantage in using MATHS only as a PDL. MATHS is for recording ideas and defining terms throughout a project. For example it is not useful to a client to just quote a specification and algorithm:
      120. euclid is a program that calculates the greatest common denominator(g):
      121. euclid(x,y,g)::=(x=y=g' | x<y; euclid(y,x,g') | x>y; euclid(x-y,y,g') ).
      122. [transcribed from a Prolog text - Schnupp & Bernhard 87, p36] An abbreviated form would be worse [Knuth et al 89]. It is better to define the concepts that explain why and derive what to do from these. MATHS lets you do this because you can gather a collection of declarations, definitions, comments, and assertions into a special kind of set - a set of documentation. The formulae are shown between matching "Net{" and "}" strings. They can be named: GCD::=Net{...} and referred to by Uses GCD. In Chapter 5 notations for customizing such nets to other uses are presented.
      123. GCD::=
        Net{
          Theory of the greatest common denominator of a set of integers.
        1. Uses Number_Theory.
        2. .SeeChapter 4 Logic. Section 8 STANDARD
        3. .See Chapter 4 Logic. Section 9 Other Systems.Number_theory
        4. |-Nat::=Natural. |-Nat0::=Positive. |-Nat0=Nat|{0}. -- Definitions in Number_Theory |-For x,y :Nat0, x divides y ::= for some f:Nat0(x*f=y),
        5. divisors(y)::= {x:Nat0 || x divides y},
        6. multiples(x)::={y:Nat0 || x divides y}. -- Definitions in Number_Theory
        7. ...
        8. |- (1): for x,y,z,a,b:Nat,
        9. if x and y divides z then a*x+b*y divides z.
        10. ...
        11. For x,y:Nat, gcd(x,y)::Nat=greatest(divisors(x) & divisors(y)),
        12. ...
        13. |-for x,y, gcd(x,y)=gcd(y,x) { ...(&) in commutative(@Nat0) ...}.
        14. |-for x,y, if x>y then gcd(x,y)=gcd(x-y,y).
        15. ...
        16. euclid::= (do( x<y; x'=y and y'=x | x>y; x'=x-y ); x=y=g').


        17. |-for x,y,g:Nat,
        18. (x=>x,y=>y,g=>g) euclid (x=>x', y=>y', g=>gcd(x,y)).
        19. ... }=::GCD

          2.6.7 Names

          Documentation names things. It is not easy to give the correct names to things, but it is important that (1) the names and meanings be clearly documented, and (2) ambiguities and confusions are uncovered [Katzenberg & Piela 93, Williams & Begg 93]. Algebra and natural languages have two different paradigms for creating names [MacLennan 79]. "X" is clearly an algebraic name or variable and "cats" an English name. MATHS allows both to be used in formulae. A natural name is an unabbreviated phrase in a natural language with spaces replaced by an underline symbol. The words must be in a standard dictionary to make a natural name. There exist some special mathematical identifiers that are not used in a natural language - "sinh" for example - thus a dictionary of acceptable functions and the like is needed for each environment. Short algebraic variables keep formulae and proofs manageable. Many things have both short and long names - the short used by the developer of a document and the long by the user of that document. An ideal system will permit the rapid replacement of one form by the other. The guideline is that shorter names are relatively local and natural names relatively global. Parameters in definitions and variables declared in predicates should always be algebraic identifiers(variables):
        20. variable::=general_letter (subscript|)(superscript|), subscript::=digit #digit | "[" expression "]" |...
        21. superscript::=prime|..., general_letter::=ASCII.letter | Greek.letter | ... .

          2.6.8 Expressions

          If you type in mathematics in ASCII then it is not be possible to recognize where expressions begin and end unless layout is a clue. MATHS is defined so that an expression can be recognized but with the minimum restrictions of traditional mathematical notations. Punctuation is used to help the recognition of expressions. In particular commas and periods delimit a formula. Thus a comma can occur inside an expression only in two contexts:
        22. (1) Inside quantified predicate, which starts with either "for" or "For", or
        23. (2) Inside strings, parentheses, braces, or brackets. A comma in other contexts is not in an expression. A MATHS period is defined to be an ASCII dot character followed by some non printing white space [from COBOL]. Inside an expression a dot has to be followed by something else (as in 3.2, n.th, or u.x) or else it is interpreted as a period. A period can appear inside expression only when the dot and whitespace are inside a string or a pair of brackets, braces, or parentheses. Now a string is a lexeme in MATHS (see chapter 3 for details). Thus only a sequence of lexemes without commas or periods can be an expression.

          A second requirement is that in the possible expression the types of the symbols fit together. Variables and terms have types declared for them. When ambiguous the name of the type can be used to cast the term or variable. Operators and functions act on ambiguous expressions and form expressions of a particular type: For T:types, e:expression, T(e)=e.T has type T. For f: T1->T2, e1:T1, e2:T2,

        24. f(e1)=(e1).f and are both expressions of type T2,
        25. /f(e2)=(e2)./f and are both expressions of type @T1 For e1:T1, e2:T2, ..., f: (T2><T3><...)->T1,
        26. f(e2,e3,...)= (e2,e3,...).f and are both expressions of type T1. For e2:T2, e3:T3, f:(T2><T3)->T1,
        27. (e2 f e3) = f(e2, e3) = (e2,e3).f and are all expressions of type T1. In many cases the parentheses around e2 f e3 can be omitted because types and precedence rules determine the correct parsing. An expression like x f y g z may be ambiguous but in MATHS the types of x, y, z, f and g usually fix the correct interpretation. For example, a+b>c is equal to (a+b)>c because b>c is of type @ and (+) has type Number><Number->Number. It might be ambiguous if (+) is also defined to be of type Number><Boolean->Number, or where (>) is defined to return numbers. This is the reason why MATHS has different symbols for the operators on numbers, sets, and truth values. When the types cannot fix the correct structure, precedence rules are used. For example consider x f y g z when x, y, z have a type T and both f and g have type (T><T)->T. If f takes precedence over g then:
        28. x f y g z = (x f y) g z = g(f(x,y),z) =((x,y).f,z).g, If g takes precedence over f then
        29. x f y g z = x f (y g z) = f(x, g(y,z))=(x,(y,z).g).f. Here are the precedences that always hold when the symbols "^/*+-..." are used as infix operators:
        30. (^) takes precedence over (/)
        31. (/) takes precedence over (*)
        32. (*) takes precedence over (+) and (-)
        33. (;) and (o) and ( ) take precedence over (|)
        34. (&) and (~) take precedence over (;), and (o) and ( )
        35. (><) and (^) take precedence over arrows like (->), (==>),...
        36. (and) and (not) take precedence over (or) The rules for ":", "::=", "=", iff, and if__then__else are defined separately (Chapter 4 Logic). Notice that care needs to taken with operators that have different meanings depending on the context in which they appear(overloaded or polymorphous) - see chapter 4, section 8 and chapter 5 for details.

          Serial & Parallel Operators In mathematics

        37. a+b+c+d = ((a+b)+c)+d and
        38. (a<b<c<d) = (a<b and b<c and c<d). MATHS generalizes these two conventions.

          If + is a binary operator(with type T1^(T1><T1) say) and is defined to be serial, then for e1,e2,e3,e4,...,e[n]:T1,

        39. +(e1,e2,..., e[n]) =e1 + e2 + e3...+ e[n]=(e1, e2, ..., e[n]).+, and +[i:1..n](e[i]) = e1 + e2 + e3...+ e[n] Examples: is written +[i:1..20](i^2) and factorial(n)=*[i:1..n]i. Examples of serial operators are (and), (or), (&), (|), (!), (+), (*), (min), (max), . . . There is no need for special symbols for summation and product. An ideal MATHS system would map prefixed operations into special symbols on output if the device can handle them. The idea of serial operators is an alternative to
        40. APL's special row and column reduction operations The UNITY constructor: UNITY(< op v : v in P :: E(V)>) = op[ v: P ](E(v))).
        41. Defining special symbols for each case as in ΤΕΧ, mathematics, and Z.

          Homogeneous relations are operators of type (T1><T1)->@. They are all parallel operators with two exceptions: (and) and (or). So for any relation R:

        42. e1 R e2 R e3...= R(e1,e2,...) =(e1,e2,...).R= (e1 R e2) and (e2 R e3)...
        43. <(1,2,3,4) = (1<2<3<4) = 1<2 and 2<3 and 3<4 = (1,2,3,4).< Examples of parallel operators are "iff", =, "in", <, <=, <>,>=, ==>, >==, ->,

          . . .

          Thus <=(file) = file is sorted.

          2.6.9 Abstraction

          MATHS is more abstract than any other computerized language. A language used at the front end of a project must be able to define problems well enough to recognize that they are (1) hard, or (2) non-computable. Problems of efficiency, computability, data vs algorithms, etc. are handled separately in terms of qualities and the techniques available. The MATHS description of the problem and the specification of a requirement should be independent of all implementations and technology. Ignoring efficiency etc may be a little perturbing at first. It would be bad if MATHS was a programming language. It is essential to describe the purpose and qualities of something before choosing techniques.

          Consider the GCD example in section 6.6 above. Euclid's Algorithm is the standard solution. However if one has to evaluate a many GCD's then Euclid's algorithm is clearly inefficient, and remains so when coded iteratively. For example, when gathering statistics on the GCD's of all pairs of numbers in the range 1..2^p as p->oo I found that Euclid's algorithm recalculates the same values many times. Backtracking to the definition in section 6.6 we have:

        44. For x,y, gcd(x,y)::=greatest(divisors(x) & divisors(y)). So: gcd in (Nat >< Nat)->Nat ==> @(Nat, Nat, Nat). gcd could be code, or a table, a data base, a method of a class of objects or a data structure. Even a hybrid of an algorithm and data - a cache [Bentley 93] is possible:
        45. Let Cache::@(Nat, Nat, Nat)=Cache of calculated results.
        46. For x,y,z:Nat, if (x,y,z) in Cache then z=gcd(x, y).
        47. |-Cache ==> gcd. Define
        48. euclid_plus::= ( x0'=x; y0'=y; --(so we can store (x0,y0,gcd(x0,y0)) when found)
        49. do( no (x, y, g') in Cache; --(fail to find (x,y,gcd) in Cache)
        50. ( x<y; x'=y and y'=x
        51. | x>y; x'=x-y
        52. ) --(calculate a x,y pair with same gcd)
        53. );
        54. ( (x,y,g') in Cache --(find gcd in Cache)
        55. | x=y; g'=x; Cache:|{(x0, y0, g)} --(put gcd in Cache)
        56. ) ). The form of euclid_plus is
        57. LOOP::=do ( no A ; ( x<y; b | x>y; c) ); ( A | x=y; d ) where A=((x,y,g') in Cache), and b, c, d are all total functions. Only one of x<y, x=y, and x>y can be true. So LOOP is a Zahn loop [Zahn 69]. So Euclid_plus has the invariant Cache ==>{ (x, y, gcd(x,y) ) || x,y:Nat}. If it is true before each call then it is also true after the call. If Cache={} then euclid_plus ends up with (x0,y0,gcd(x0,y0)) being placed in Cache by Cache:|{(x0, y0, g)}. By induction, the design works if initially Cache is empty and is not changed by any other piece of code. When Euclid_plus is coded a representation of Cache will be chosen and a limit will be put on the size of Cache. Code is often a finite approximation to a design. However Euclid_plus works if Cache:|{(x0, y0, g)} is changed to Cache'==>Cache|{(x0, y0, g)}. ADTs are in chapter 5 (section 6.6 of chapter 5). Classes & objects are described in Chapter 6.

        . . . . . . . . . ( end of section 2.6 Special Techniques) <<Contents | End>>

        2.7 Summary

        MATHS is an unusual computerized language. Its philosophy includes these ideas:

        • Formulae should be descriptive, clear, and correct. Efficiency comes later.
        • Documentation has both formal and informal parts. Both are important.
        • Formula may be rephrased and restated several times. Controlled redundancy can help the reader.
        • Formulae specify results and properties but do not force the use of one algorithm or data structure.
        • Some formula look like programs but take intelligence to be interpreted and/or coded.
        • Some formula can be implemented as either data and/or program, in many different ways.
        MATHS is designed to support Formal Analysis, prior to Design and Implementation. Formal Analysis proceeds by stating obvious and so certain truths. Simple facts delimit the set of logically correct structures. Grammars, lexicons, and other dictionaries are useful for documenting the data and the dynamics of a situation. Relations can specify processes and they can also be defined in a structured dictionary. Sets and maps describe the structure of the problem domain and are used to document requirements and specifications. They define entity types, abstract data types, and classes of objects. When implemented, maps become paths through the data structures and data bases. So they define the structure shared by solution and problem.

        When there is a need for unobvious statements then MATHS provides the ability to show that the uncertain statements must follow from less doubtful ones. A special sign (|-) shows that the formula can be derived from previous formulae. An complete inference engine or automatic theorem prover may be an uneconomic investment since the engineer can work interactively with a simpler proof checker [Chapter 4].

        Some documentation says too little, and some says too much. Intuitively we have a category of logical systems. In this category of documentation we can distinguish two extremes. There is a free, initial, or universal repelling object in the category of documented objects. Free or initial documentation is the weakest useful documentation. At the other extreme there are final, inconsistent, or universally attractive documentation - this overdefines the situation to the point where it can not exist. In between there is the level of documentation that reflects exactly what the client understands the situation to be. I call this the canonical documentation. It forms the cannon or Bible for the project. It shows the user or client that we understand all their terminology. Canonical documentation shows the user's world in terms that the users and clients accept. Ideally, the client reacts: "But, of course, that's what I mean,... ." The test of canonical documentation is that clients and users certify it as complete and accurate. Canonical documents are the strongest useful conceptual documentation.

        Project documentation also describes the physical constraints caused by the technology (software+hardware+methods) used. Again there are two extremes: free and inconsistent. Some where between them is the best level. This records every assumption that can be made about the implementation technology. It is the canonical documentation of the technology being used when the software engineers are its clients and users. Ideally it would be provided, as a matter of proffessional ettiquete, by the supplier of the implementation technology - like a supplier of chips documents the logical and physical properties of their circuits, but not the internal blueprints.

        MATHS & TEMPO help solve several outstanding problems in the software process. First, we can show the user the external design of software [Hoare 73]. The diagrams and formulae show how the user's data is handled in terms that they understand(sections 2 and 3 of this chapter). In chapters 3, 5 and 6 I will show how we can guarantee that this understanding is accurately modelled in the software. Second, Strachey pointed out the need to discover data structures before attempting functional design [Strachey 66]. The sets and maps found by formal analysis connect facts to conceptual data structures(section 4). Standard data base and data structures techniques can then implement the conceptual solution to the problem(section 5). Finally practice has added special features(section 6).

        Projects that use MATHS can "zero-in" on the best software. First: Define things that appear in the description of the problem and (1) are outside the software, (2) interact with the software, and (3) are individually identifiable by the software. Second: Define events, entities, identifiers, relationships, attributes, facts, and patterns. These steps are an exercise in stating the obvious. Facts are recorded and verified by experts and by observation. Simple facts lead to validation of algorithms, data structures, and data bases.

        If a system interacts with another system then it will implement an internal model of the external system - even if the producers have failed to document it. However an abstract model of the external system is also a model for the internal system and so a structure for a design. Computer Science already teaches the techniques to implement the designs plus the calculus of sets, maps, and relations. So software design can be based on science. There is also a natural graphic representation (TEMPO). This is, by chapters 0 and 1, a basis for Software Engineering.

      . . . . . . . . . ( end of section Chapter 2) <<Contents | End>>

    End