Wed Apr 18 06:44:50 PDT 2012

# Introduction to Types

## Personal Note

When I started this page I did not realize how much information it would have to carry to make the MATHS language useful.

## Introduction

In any project there will be a number of more or less defined types. Obviously in computer projects we have "data types", "Entity types", "ADTs", classes of objects, or other technologically bound models of reality. The MATHS type system has to handle these. But we will also (in real projects) have to think about the real world -- cats, dogs, candlesticks, and sealing wax, cabbages and kings. These are the objects that the software describes. And in theoretical work we will often be working in terms of a set of elements with given operations and properties -- algebras and abstract data types.

Some of these are standard well known types and are described below. Then there are the types that are peculiar to the domain being studied. This set of types is called the basis of the domain. Finally there are the types that can be constructed from the basic and standard types to give structured types (sets, tuples, relations, ...). This collection of basic, standard and constructed types is called the universe of discourse of the project.

The key discovery by Whitehead and Russell in their work on "The Principles of Mathematics" was that the idea of type could be used to avoid paradoxes as long as it was a syntactic concept. This means that each type defines those expressions that are valid or meaningful as well as what meaning can be attached to them. The Type system outlaws from the logic any discussion of expressions that involve contradictions in terms and paradoxes like the Russell Set. Thus any theory of types is a metalinguistic activity and takes place in yet another universe of discourse.

Simply put: Each type is linked to a set of expressions and a set of meanings and connects them together. And each type does this in its own way. Types are the main way that MATHS is extended.

All types must define a non-empty set of objects -- a valid universe of discourse -- so that we can use the predicate calculus to reason about objects of that type.

## Standard Types

The following are common descriptions of types

(Alphabets):

1. Generic finite set. Probably characters or symbols.
2. Operations=>{(!), (1st), (rest),...}, concatenation, disassembly of strings.

(@):

3. The propositions, Values=>{true,false},
4. Operations=>{and,or,not,...}

(Char): A finite set of 256 characters used in MATHS.

5. The ASCII characters, See [ Main%20Content in comp.text.ASCII ]
6. Values: "a", "b", char(26),...

(Unicode):

7. Used to describe languages like Java, JSON, Javascript, etc that use Unicode [ ../samples/glossary.html#UNICODE ]
8. Values: To Be Done [click here Unicode if you can fill this hole]

(Documents):

9. Pieces of Documentation,
10. Operations=>{\$,...and,or,...}, see below.

(Events):

11. A finite set of elements written in a different language.

(Numbers):

12. Type that includes real,integer, etc types,
13. Operations=>{+,-,*,...}

(Dimensions):

14. Length, Mass, Time, Information, Area, Angle,...
15. Dimensions, each is a separate type.
16. Operations=>{+,-},
17. For all D:Dimension, some units:@(Number->D). (eg 60.min=1.hr, 8.bit=1.byte)

(Sets):

18. Generic, For S:Sets is short for For T:Types, S:@T,
19. Operations=>{&, |, ~, ...}

(Times):

20. A special ordered set with elements written in a different language

(Types):

21. Generic

## Basic Types

In real systems we work with universes of discourse like: Employees, Products, Animals, Telephones,... Michael Jackson notes that we need to be able to designate individuals in these domains. Terry Halpin has noted that the individuals/elements/objects/"values" can be described by a definite description like this
` 		the Employee with SSN 123-45-6789`
` 		the[e:Employee](e.SSN="123-45-6789")`

It seems wise to introduce -- as a partly-baked idea -- a standard way to designate objects of a type:

` 		Employee("123-45-6789")`

Here is a possible formalization:

22. BASIC_TYPE::=following,
Net
1. T::Types.
2. identifier::Set=given.
3. designation::identifier >-> T.
4. |-designation in identifier --- T, a one to one mapping.
5. For i:identifier, T(i)::T= designation(i).

(End of Net)

So we might define, as an example,

23. example::=
Net{
1. SSN::Sets.
2. ...
3. Employee::Types.
4. |-BASIC_TYPE(Employee, identifier=>SSN).

}=::example.

By further defining functions, predicates, sets, relations, etc. we can easily model the types of objects in the real world. This conceptual model is an excellent guide to designing data bases, a natural ontology for the domain, and a source of ideas for classes in object-oriented programming.
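The BASIC_TYPE net above can be sketched in Python (a hypothetical encoding, not part of MATHS): a type paired with a set of identifiers and a one-to-one designation map, so that Employee("123-45-6789") names an object. The class name and sample data are invented for illustration.

```python
class BasicType:
    """A sketch of BASIC_TYPE: a type T with identifiers and a
    one-to-one designation map from identifiers to objects."""

    def __init__(self, name, designation):
        # designation: identifier -> object, assumed one-to-one
        assert len(set(map(str, designation.values()))) == len(designation), \
            "designation must be one to one"
        self.name = name
        self.designation = designation

    def __call__(self, identifier):
        # T(i) ::= designation(i)
        return self.designation[identifier]


# Hypothetical example: SSNs designating employee records.
Employee = BasicType("Employee", {
    "123-45-6789": {"name": "J. Doe"},
    "234-56-7890": {"name": "R. Roe"},
})
```

Then Employee("123-45-6789") plays the role of the definite description "the Employee with SSN 123-45-6789".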

## Type descriptions

A Type is defined to have two key attributes:

24. |-TYPE::=Net{
25. description::#lexeme=symbolic representation of the type.
26. objects::Sets=set of values or things of this type.
27. ...

}=::TYPE.

1. Type_descriptions::=
Net{

Notice that a type T is not a member of a type.

## Syntax

1. type_description::= type_power | type_product,

2. type_power::=type_product "^" type_product, Note -- this outlaws type descriptions like T1^T2^T3 to avoid an ambiguity, since we can show that (T1^T2)^T3 is different from T1^(T2^T3).

An example

3. {0,1}^{0,1} has the set of maps of bits into bits including elements like the identity and negation operation (not).

4. type_product::= type_factor | type_factor "><" type_product, Note -- this defines the associativity of products of types: T1><T2><T3 must be parsed as (T1><T2)><T3.

The elements of Char >< Number for example all have a single character and a single number in them.
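The power and product constructions can be sketched in Python (an illustration only; names like `maps` and `pair` are invented): the power type {0,1}^{0,1} is the set of all maps from bits to bits, and an element of a product type carries one component of each factor.

```python
from itertools import product

bits = (0, 1)

# Each map f:{0,1}->{0,1} is encoded as the tuple (f(0), f(1)),
# so {0,1}^{0,1} has 2^2 = 4 elements.
maps = list(product(bits, repeat=len(bits)))
identity = (0, 1)   # the identity map on bits
negation = (1, 0)   # the "not" operation on bits

# An element of Char >< Number: a single character and a single number.
pair = ("a", 42)
```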

5. type_factor::= "@" type_element |type_element,

The type @T has the set of all subsets of T as elements.

6. type_element::=name_of_a_type | "(" type_description ")" | type_discriminated_union | enumerated_type.

7. enumerated_type::= "{" list_of_values "}".

If D is a list of declarations then \/{D} is a discriminated union or co-product. There is also a longer form that uses the "Net" syntax.

8. type_discriminated_union::=co_product | long_form_of_discriminated_union.
9. co_product::="\/" "{" list_of_declarations "}".
10. list_of_declarations::= declaration #( "," declaration).
11. declaration::= identifier ":" type_description.

A discriminated union like

12. \/{ character: Byte, number:Nat } has elements like
13. character("a"), number(1), character("1"), etc.

Discriminated unions are useful for describing data bases and partial tuples. A possible semantics is found in the theory of co-products in Category Theory(CATEGORY).

14. CATEGORY::= See http://cse.csusb.edu/dick/maths/math_25_Categories.html.

Briefly a product defines a composite structure with maps from the composite to the components. A co-product has the reverse maps -- from each component into a unique element of the co-product.
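The two directions of maps can be sketched in Python (a hypothetical encoding, not part of MATHS): a product comes with projection maps out of the composite, while a co-product comes with tagged injection maps into it, as in \/{ character: Byte, number: Nat }.

```python
# Product: maps from the composite to the components.
def proj_1st(pair):
    return pair[0]

def proj_2nd(pair):
    return pair[1]

# Co-product: maps from each component into tagged elements.
def inject(tag):
    return lambda value: (tag, value)

character = inject("character")   # Byte -> the co-product
number = inject("number")         # Nat  -> the co-product

# character("1") and number(1) stay distinct even though the
# payloads could be confused without the tag.
```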

## Partly Baked Ideas

Two ideas
Net
The long form leads to better rendering. The "\/" symbol is followed by a net of declarations.
1. long_form_of_discriminated_union::= "\/" net TBA.

The meaning is to extract the declarations, ignore the constraints, and create a list D of declarations for the short form.

(End of Net)

Net
This is needed to define rational numbers (for example) as a type.

1. type_quotient::= type_element "/" (equivalence_relation | function).

A type_quotient T/E where E:equivalence(T) is defined as any chosen isomorphism of the set of blocks of equivalent elements of T. Similarly for T/f where f is a map from T into T2 (f:T2^T): the elements of T/f are the inverse images in T of elements in T2 -- for each t2:T2, { t:T. f(t)=t2}. See [ Quotient types ] below.

(End of Net)

## Semantics

As a rule the description is a string that describes the objects in the following way:
15. |- (desc-objects): For all T1,T2:Types, if T1.description= T2.description then T1.objects=T2.objects.
16. |- (hat_means_power): For all T1, T2, T3:Types, if T1.description=T2.description "^" T3.description then T1.objects=(T2.objects)^(T3.objects).
17. |- (at_means_subsets): For all T1, T2:Types, if T1.description="@" T2.description then T1.objects=@(T2.objects).
18. |- (cross_means_product): For all T1, T2, T3:Types, if T1.description= T2.description "><" T3.description and T1.description in type_factor then T1.objects=(T2.objects) >< (T3.objects).
19. |- (parentheses_mean_parentheses): For all T1,T2:Types, if T1.description="(" T2.description ")" then T1.objects=T2.objects.

Need to give formal semantics for coproducts, and quotients.... and tidy up. [click here types if you can fill this hole]

}=::Type_descriptions.

## Standard subsets/classes

### Finite_Sets

This description
1. Finite_sets=sets which are not isomorphic to a proper subset of themselves, is not type-safe. The only type-safe expression is to talk about the finite sets of elements of a given type, like the following:

2. For Type T, Finite_sets(T)::@T=subsets that are finite.
3. For T, S:@T, Finite_sets(T)::@T={F:@T. F in Finite_sets(T)}.
4. For T, S:@T, finite(S)::@=S in Finite_sets(T).

The idiom

5. Let A be a finite set, ... is common in computer science and should be interpreted as shorthand for
6. T: Type,
7. A: @T,
8. |-finite(A).

Where T is some unused variable.

These can be defined by enumeration: Syntax: {a,b,c,...}, or by construction, since the union of two finite sets is finite and the intersection of two finite sets is finite.

Some theories of finiteness can be found at [ logic_6_Numbers..Strings.html ] discussing numbers.

### Integers

9. Int, Z
10. Integers,Whole Numbers,
11. Values=>{...-3,-2,-1,0,1,2,3...},

### Numbers

12. Rational, Q Quotients/Ratios
13. Fractions \${numerator:Nat0, denominator:Nat,...}
14. Real, R The Real Numbers,
15. n..m::={i:Z || n<=i<=m}.
16. Bit::=0..1.
17. n..::={i:Z || n<=i}.
18. ..m::={i:Z || i<=m}.
19. Natural::={n:Z || n>=0}.
20. Nat0::=Natural.
21. Positive::={n:Nat0 || n>0}.
22. Nat::=Positive.
23. (above)|-Nat0 = Nat | {0} = 0.. .
24. Fixed(n)::= -2^(n-1)..2^(n-1)-1.
25. Float(p,s), a finite subset of the Rational numbers (denominators=2^-i where i:Fixed(s),...). Decimal(p,n)::@Rational=(-10^n+1..10^n-1)/(10^p),
26. Money::=Int/100.

Hence

27. |-n..m in @Z.
28. |- T^n::= T^(1..n).
29. |-for n:Nat0, T^n ==> #T. So if x is in T^n then x has type #T.
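The subrange notation above can be sketched in Python (names `subrange`, `Bit`, and `Fixed` are illustrative only): n..m is the set {i:Z || n<=i<=m}, Bit is 0..1, and Fixed(n) is the n-bit two's complement range.

```python
def subrange(n, m):
    # n..m ::= {i:Z || n <= i <= m}
    return set(range(n, m + 1))

Bit = subrange(0, 1)

def Fixed(n):
    # Fixed(n) ::= -2^(n-1)..2^(n-1)-1
    return subrange(-2 ** (n - 1), 2 ** (n - 1) - 1)
```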

### Standard subsets/classes

As a generality the type of an expression can always be determined (sometimes relative to certain assumed types) from the types of the parts and the operators that make it up. Even so a modicum of overloading is allowed - symbols can be defined to carry out different operations (and return different types) depending on the context. Such a definition is written: term::context=expression. For example, here are three different views of an assignment:

1. assignment::syntactic= variable ":=" expression,
2. assignment::parsed= Net{v:variable, e:expression,...},
3. assignment::semantic= map [v:variable, e:expression]( s'.v=e(s) and for all u:variable~{v}(s'.u=s.u)).

(casts):

4. For T:Types, a:T, a::@T={a},
5. a::#T=(a),
6. a::%T=(1+>a).
7. For T1,T2:Types, f:T1->T2, f::@T1->@T2={f(x) || x:T1 (x in (_))},
8. For T:Types, op:infix(T), op::infix(@T)={x op y||x in (1st), y in (2nd)}.
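The casts above can be sketched in Python (an illustration; `lift` and `lift_infix` are invented names): a function f:T1->T2 lifts pointwise to sets, and an infix operation lifts to sets elementwise.

```python
def lift(f):
    # f::@T1->@T2 = {f(x) || x in S}
    return lambda S: {f(x) for x in S}

def lift_infix(op):
    # op::infix(@T) = {x op y || x in X, y in Y}
    return lambda X, Y: {op(x, y) for x in X for y in Y}

double = lift(lambda n: 2 * n)
set_add = lift_infix(lambda x, y: x + y)
```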

As a rule: an overloading can not take over an existing valid meaning in the given context. So (|) is already given a meaning for all sets and so can not be overloaded with a different operation on sets as well.

Polymorphic and generic terms have definitions written like this:

9. For T:Types, term::t(T)=expression(t(T))

where t(T) is some formula that contains T as a free variable. If the symbol T appears in the expression then it is interpreted as the universal set of a generic type T. In general T can be any list of symbolic variables. Completely generic terms ( "macros") can be defined but are rare. If v1,v2,v3,... are previously undeclared variables then

`        For v1,v2,v3,...,  M(v1,v2,v3,...) ::= e,`

creates a function that applies to all types of operands and is short for

`        For T1, T2, T3,...:Types, v1:T1,v2:T2,v3:T3,..., M(v1,v2,v3,...) ::= e.`
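A completely generic ("macro") definition can be sketched in Python's typing notation (a rough analogue only; `swap` is an invented example): one definition that applies at all types of operands.

```python
from typing import TypeVar

T1 = TypeVar("T1")
T2 = TypeVar("T2")

def swap(v1: T1, v2: T2) -> tuple:
    # Generic over the (implicit) types of its operands, as in
    #   For v1, v2, swap(v1, v2) ::= (v2, v1).
    return (v2, v1)
```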

. . . . . . . . . ( end of section Introduction to Types) <<Contents | End>>

### Theory of Types

1. Theory_of_types::=
Net{
1. SEMANTICS::= See http://cse.csusb.edu/dick/maths/logic_7_Semantics.html

2. Types::@TYPE. We have a set of types, each one structured as a TYPE.

3. TYPE::=
Net{
1. description::#lexeme=symbolic representation of the type.
2. objects::Sets=set of values or things of this type.

Note. If T is this type then x in T.objects iff x has type T.

Symbols for types have two meanings. One of these is in a normal part of an expression, where the symbol represents the set of objects or an identity map from objects to objects, depending on the context.

3. (dual_meaning): 2 symbols o /representation(description).

4. (description_is_symbol): symbol(description, @objects, objects).

The other meaning is when the description identifies the type in a declaration.

5. elements::@meaning_in(objects),
6. elementary_expression::=elements.syntax.
7. |- (elem): elements in element.syntax->objects.

8. expressions::@#lexeme. The expressions of this type are defined in terms of objects in other types and so are not a property of this type alone. [See EXPRESSION below]
9. (elem_in_expr): elements.syntax==>expressions.

10. equality::equivalence_relation(objects). [STANDARD.relations]

11. infix_pairs::@meaning_in(infix(objects)). [STANDARD.infix]
12. infixes::=Lexemes referring to binary operations.
13. infixes::=infix_pairs.syntax

EXPRESSION uses infixes(T) to define expressions(T).

14. (precedence): STRICT_ORDER(infixes, has_precedence_over). [Chapter 4, section 8]

Used in EXPRESSION.

15. propositional_functions::@references(@^objects).

Used in Proposition Calculus.

16. parallels::@meaning_in(@(objects, objects)).
17. parallel::=parallels.syntax.
18. (equals_is_parallel): ("=",equality) in parallels.

19. serial::@infixes={s.syn || for some s:infix_pairs( associative(s.sem)) }~parallel.

20. other_operators::=@symbols~infix~parallel,operators giving results of this type.

21. (operators_mean_maps): For all o:other_operators, some domain:Types, o.type=domain->objects.

22. operators::={(representation=>i.syntax, type=>infix(objects), meaning=>i.semantics) || for some i:infix_pairs} | other_operators.

}=::TYPE.

(syntax_symbolizes_semantics): For all T:Types, e:T.elements, symbol(e.syntax, T, e.semantics).
(descriptions_symbolise_types): For all T:Types, symbol(T.description, Types, T).

4. |- (type_symbol_means_universal): For all T:Types, symbol(T.description, @T.objects, T.objects).

(different_types_do_not_overlap): For all T1, T2:Types, T1=T2 iff some (T1.objects&T2.objects).

(parallels): For T:Types, P:T.parallels, PARALLEL(expressions(@), T.expressions, P.syntax, P.semantics).

(serials): For T:Types, s:T.serial, SERIAL(T.expressions, T.expressions, s, T.infix_pairs(s)).

5. ??{take precedence into account}?

}=::Theory_of_types.

#### Examples of theoretical types

(Example_PC)::Types=following, (
1. description=>"@",
2. elements=>{("true", true), ("false", false)},
3. equality=>( (true, true) +>true | else+>false),
4. infix_pairs=>{("and", and), ("or", or), ...},
5. serial=>{"and", "or"},
6. parallel_pairs=>{("iff", iff), ("=", equality), ...},
7. other_operators=> (
1. { ("=", T><T->@, T.equality) || for some T:Types}
2. |{("in", T><@T->@, in) || for some T:Types}
3. |...,
)
8. ...
)

or even

2. For T:Types, Example_Sets(T)::Types=following (
1. description=>"@"! T.description,
2. elements=>{("{}",{}), (T.description, T.objects)}
3. |{ "{"!e!"}" || e:T.elementary_expressions},
4. infix_pairs=>{("&", &), ("|", |),...},
5. serial=>{"&", "|"},
6. parallel_pairs=>{("=", =), ("==>", ==>), ...}, ...
).

. . . . . . . . . ( end of section Theory of Types) <<Contents | End>>

### Meta-theory of types

#### Babel-Aleph-Null

The definition of the type '@' in terms of type theory is an interesting test case of the system, and is rather like attempting to build the foundation of a building on its roof. The inability to do this suggests limits to the system. The successful construction of a model of logic, on a theory of types, expressed in terms of that logic, establishes the power but not the consistency of the system.

1. @::Types=following,
1. symbol=>"@",
2. elementary_expression=>"true"+>true|"false"+>false,
3. infix_pairs=>"and"+>((true,true)+>true |+> false) |"or"+>((false,false)+>false |+> true),
4. parallels=>"="+>({(true,true),(false,false)}+>true |+> false) |"iff"+>({(true,true),(false,false)}+>true |+> false),
5. serial=>{"and","or"},
6. other_operators=> | following,
• { ("not",@->@,true+>false|false+>true)}
• {("if_then_",distfix(@,@,@),(true,false)+>false |+> true)}
• {("in", @T, in) ||T:Types}
• {("==>", @T><@T->@,==>)||for T:Types}
• {("=", @T><@T->@,=)||for T:Types~"@"}
• {("all", @^T->@, map F:@^T(img(F)=true)) ||for T:Types }

2. @.operators::=following,
Table
representation   Type                 meaning
not              @->@                 true+>false|false+>true,
and              @><@->@              (true,true)+>true |+> false,
or               @><@->@              (false,false)+>false |+> true,
iff              @><@->@              {(true,true),(false,false)}+>true |+> false,
if_then_         @><@->@              (true,false)+>false |+> true,
in               T><@T->@             map x,X(x.X),
=                T><T->@              map x,y((x,y).(equality(T))),
all              @^T->@               map F:@^T(img(F)=true),
==>              (@^T)><(@^T)->@      map [A,B] (all(map x(if x.A then x.B))).
...

(Close Table)
3. quantified_wff::={( ("for" Q B "(" P ")"), (Q "(" "map" "[" B "]" "(" P ")" ")" )) || Q:{all,no,...}, B: (variable (colon | equals) set_expression), P:@.representation}

#### Consistency and Non-empty types

Types must always define a non-empty set of objects. A simple way to prove this is to give a finite example.

Consistency is demonstrated by exhibiting a finite model.

Another technique is to use the simplest (non-recursive) constructions.

Notice that sets are defined by a formula
4. {x:T||B}, which explicitly assigns a type to the variable. But the collection of all types is not a type, so we can not define a set of all sets, and so should avoid the paradoxes discovered at the beginning of the 20th century. Notations like X:Type or Type X can only be used to set up a generic formula.

We can prove that the set of data types leads to a consistent theory by constructing a model in terms of naive set theory and n-tuples. The data types are those generated from @ and Nat by operations of finite products and coproducts. The proof of relative consistency is done by using Codd normalizations. Model each type as a set of n-tuples of elementary objects. Replace T2^T1 by (T1><T2). The result is the Third Normal Form Model.

When defining new types recursion must not be combined with power and subset operators because these lead to paradoxes. Cantor Russell... [click here if you can fill this hole]

Care also needs to be taken when using generic operators (defined with an implicit type for convenience) in a way that makes the actual types involved explicit. This is the problem found in C++ templates. In MATHS one example was the original definition of the function that returns the cardinality or size of a set:

5. Card:@T->Nat0= map[X](|X|).

So, this is true

6. Card{} = 0,
7. Card{"x"} = 1,
8. Card{"x","y"} = 2

The following expression

9. /Card(1) has an ambiguous type -- the set of singletons of unknown type. Instead make the type explicit
10. Strings@1 = sets of strings with one element.

. . . . . . . . . ( end of section Meta-theory of types) <<Contents | End>>

### Deriving New types

Wechler presents an extensive theory of universal algebras [Wechler 93, Section 3.1].

#### Products

[ Notation \$Net{a:T1, b:T2 ...} ]

#### Quotient types

If there is a map from one type to another then there is an easy way to generate a new type - by grouping the elements of the second type according to their values under the mapping - this is called the quotient type with respect to the mapping. Given f:T1^T2 we can always derive a new type (symbolized T2/f) isomorphic to a sub-set of @T2 where
1. X:T2/f iff for some y:T1, all x:X(f(x)=y). This is called the quotient type of T2 with respect to f.

Given a R:@(T1><T1) which is reflexive, transitive and symmetric (an equivalence relation) then there is a subset of @T1, symbolised T1/R, defined by T1/R::={X:@T1||for all x,y:X( x R y)}. Then there is a map (T1/R)^T1 from x:T1 to x/R:T1/R

2. x/R = {y:T1|| x R y}
3. ./R = map x:T1(x/R) Notice that
4. T1/(./R)=T1/R

We can define a relation, equivalence modulo f: (x = y wrt f) ::= f(x)=f(y), which leads to the same "quotient type".

#### Example Quotient Type

Constructing Q (the rational numbers) from Z (the integers).
5. Q::= ( (1st=>Z, 2nd=>Z)/equiv where ( (a,b) equiv (c,d) iff d a = b c)).
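This construction can be sketched in Python (a partial illustration, not a full implementation of Q): pairs of integers with (a,b) equiv (c,d) iff a*d = b*c, and each equivalence class represented by a canonical member in lowest terms.

```python
from math import gcd

def equiv(p, q):
    # (a,b) equiv (c,d) iff a*d = b*c
    (a, b), (c, d) = p, q
    return a * d == b * c

def canonical(p):
    # Choose one representative per equivalence class: lowest
    # terms, with a positive denominator.
    a, b = p
    g = gcd(a, b) or 1
    if b < 0:
        g = -g
    return (a // g, b // g)
```

Since canonical(p) == canonical(q) exactly when equiv(p, q), the set of canonical pairs is isomorphic to the set of blocks, as the quotient definition requires.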

[ Notation] .

. . . . . . . . . ( end of section Deriving New types) <<Contents | End>>

### Typed Expressions

#### Expressions

To complete this section here are the definitions of the syntax of expressions. A string of lexemes that is not terminated by a comma or period occurring outside parentheses of some kind or other can be an expression. But an expression can not contain a comma or period outside of parentheses. You can therefore use punctuation to recognize balanced expressions. A second cut needs a set of operators - those symbols that only make sense when accompanying other symbols - and defines whether an expression is functional. The final check makes sure that the types of actual parameters match the operators on them. These will be presented in reverse order.

1. EXPRESSIONS::=
Net{
Uses SEMANTICS. Uses Type_Theory. Uses ASCII_code.
1. expression::= balanced & functional & ( |[T:Types] typed_expression(T) ).

We start by defining the set of fully parenthesized expressions (p_expressions) of a given type and then state the circumstances under which parentheses can be removed without change of meaning.

2. p_expression::abbreviation=fully_parenthesized_expression.
3. fully_parenthesized_expression::Types->@#lexeme.

4. For T1: Types, p_expression(T1)::= following, (
1. elementary_expression(T1)
2. | variable(T1)
3. | UNARY(representation(T1), other_language_string)
4. | |{UNARY(p_expression(T2), p_expression(T1^T2)) || T2:Types}
5. | |{BINARY(p_expression(T2), p_expression(T3), p_expression(T1^(T2><T3)) || T2,T3:Types}
6. | |{INFIX(p_expression(T1), f) || f:T1.infix)}
7. | |{LAMBDA(representation(T2), p_expression(T3)) || T2,T3:Types(T1=T3^T2)}
8. | |{INTENTION(representation(T2)) || T2:Types(T1=@T2)}
9. | |{EXTENSION(p_expression(T2)) || T2:Types(T1=@T2)}
10. | |{N_TUPLE(p_expression o t) || t:#Types(T1= ><(t))}
11. | |{UNARY(p_expression(@T2), p_expression(@(T2><T3))) || T2,T3:Types(T1=@T3)}
12. | |{UNARY(p_expression(T2), p_expression(@(T2><T3))) || T2,T3:Types(T1=@T3)}
)

The above defines fully parenthesized expressions.

5. parenthesis_removal::={(X ! a ! B ! a ! Y , X ! a ! "(" ! B! ")" ! a ! Y) || T:type, X, Y:#char,
6. a:infix(T),b:infix(T),B:INFIX(p_expression(T),b)( b has_precedence_over a ) }

Example: "1+(2*3*4)+5" parenthesis_removal "1+2*3*4+5" and not "(1+2)*3" parenthesis_removal "1+2*3".

7. typed_expression(T)::=p_expression(T).do( parenthesis_removal).

8. |-p_expression(T)==>typed_expression(T).
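The precedence rule behind parenthesis_removal can be sketched in Python (the table and function names are invented for illustration): parentheses around a subexpression B can be dropped when the operator inside B has precedence over the surrounding operator.

```python
# Hypothetical precedence table: higher binds tighter.
precedence = {"*": 2, "+": 1}

def removable(outer_op, inner_op):
    # "(B)" with top operator inner_op may lose its parentheses
    # inside an outer_op expression only when inner_op
    # has_precedence_over outer_op.
    return precedence[inner_op] > precedence[outer_op]
```

So "1+(2*3*4)+5" rewrites to "1+2*3*4+5", but "(1+2)*3" keeps its parentheses.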

The semantics of MATHS expressions are trivial when formalized in MATHS. The correct meaning of a MATHS expression looks (in MATHS) like itself. Spivey has remarked on a similar phenomenon in the attempt to define Z in terms of set theory and set theory using Z. The denotation (map from syntax into semantics) is the identity mapping. Such a circular definition can be shown to be useless for ignorant readers because ignorance is a trivial fixed point [Spivey 88]. Luckily, complete definition is not essential for a human being to learn a language.

Future research can establish the semantics in terms of other languages - such as Z, Larch, .... The syntax and a possibly feasible way to parse expressions is a notational problem. [ Notation.12 Expressions ]

}=::EXPRESSIONS

. . . . . . . . . ( end of section Typed Expressions) <<Contents | End>>

### Types as a Data Model

#### Formal Model

##### Entities and Maps
1. DATABASE::=
Net{
Collection of named types, objects, and maps.
1. Uses lexemes, type_theory, EXPRESSION.
2. T::Finite(Types)=Entity types.
3. |-For all t:Types, t.μ=t.denotation in
4. expressions(t)->objects(t).
5. D::=Entity descriptions. Object classes::=Finite(|[t:T]expressions(@t)).
6. For A:D, E(A)::=Named objects in A:Finite(expressions(A)).

7. For A,B:D, F(A,B)::=`Named maps from A to
8. B`:Finite(expressions(B^A)).
9. For no A, Id(A) in F(A,A).

##### Paths Through the DataBase.
10. For A,B:D, S(A,B)::@(A,B)=μ(F(A,B)) | {R:@(A,B)|| /R in μ(F(B,A))}.
11. For A,B, P(A,B)::@(A,B)=paths from A to B,
12. For A,B, P(A,B)::@(A,B)=S(A,B) | { s;p ||for some C:D, s:S(A,T3), p:P(C,B)},
13. L(A,B)::@(expressions(B^A))=/μ( { s1;s2;p || for some C1,C2:D, s1:S(A,C1), s2:S(C1,
14. C2), p:P(C2,B)}~Id(A)
15. ). ...
16. connected::@=for all A,B, if A<>B then some(P(A,B)).
17. For A,B, redundant(A,B)::@=some(F(A,B) & L(A,B)).
18. redundant::@=for some A,B(redundant(A,B)). ).
19. For A,B, M(A,B)::basic maps=F(A,B)~L(A,B) . ...
20. conceptual_model::=Bachman Diagram,
21. conceptual_model::=digraph(
22. vertices=>D ,
23. edges=> {(v1,v2) || for some A,B:D, m:M(A,B)(v1=names(A) and
24. v2=names(B) )
25. ). ...
26. conceptual_graph::=digraph(
27. vertices=>D | |[A:D]E(A) | |[A,B:D]F(A,B),
28. edges=> { (e, A) || for some A:D, e:E(A) }
29. | {(m,B) || for some A,B:D, m:M(A,B) }
30. | {(A, m) || for some A,B:D, m:M(A,B) }
31. }
32. ). ...

paths, prime_links, links::(names(D))^( names(D)><names(D)). For T1,T2, prime_links(names(T1), names(T2)=names(F(T1,T2))| { "("! l1 ! ![i:2..pre(l)](","!l[i])! ")"||for some C:#D, f:F(T1,T2), l:#char(l=names o f and T2= ><C)}.

For n1,n2:names(D), paths(n1,n2)=links(n1,n2) |{ l!".."!p || for some n3:names(D), l:links(n1,n3), p:paths(n3,n2)}.

For T2,

33. retrieval(T2)::=|[T1:D](entries(T1)!".."!paths(T1,T2)).

It is easy to set up a map from paths into relations.

##### Categorical Model
34. For T1,T2, p:P(T1,T2), arrow(p)::=(T1, p, T2). CATEGORY(Objects=>names(D),
35. Arrows=>{arrow(p)||for some T1 ,T2:D (T1=T2 and p=Id(T1)
36. or p in paths(T1,T2
37. )}.

##### Conceptual Model
38. conceptual_model::=digraph(vertices=> names(D)| |[T:D]names(E(T))|
39. |[T1,T2:D]names(T1,T2),
40. edges=>{(v1,v2) || objects(v1)in classes(v2)
41. or maps(v1) in B(dom(v1),cod(v1) and
42. cod(maps(v1))=classes(v2)
43. or maps(v2) in B(dom(v2), cod(v2)) and
44. dom(maps(v2))=classes(v1)
45. }
46. ).
47. rationalized::@=no cycles( conceptual_model).

if rationalized then conceptual_model in DAG.

}=::DATABASE.

##### Relationships with a key
Suppose we have a network
2. N:=Net{v1:V1, v2:V2,...}, and k and d are two lists of variables in v1,v2,v3,...
3. k,d:#|(v), where k and d do not overlap:
4. no |k & |d. Define T:=\${k1:k1(N),...d1:d1(N),...}, then (@N k -> d) ::=sets in @T with k determining d.

The above is sometimes called a functional dependency. These are worth noticing since they indicate the logical structure that underlies the original design.

The process starts with a set S:@N and finding a k and d such that

5. S in (@N k->d). If k is as large as possible and d is all the other variables then k is a suitable key or identifier for a database. If neither S.k nor S.d themselves contain any functional dependencies then the set S is in third normal form (TNF, 3NF,...).
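A functional dependency S in (@N k->d) can be tested mechanically; here is a sketch in Python (the function and sample data are invented for illustration): k determines d in S iff no two records agree on k but differ on d.

```python
def determines(S, k, d):
    """Return True iff the variables in k functionally
    determine the variables in d across the records in S."""
    seen = {}
    for row in S:
        key = tuple(row[v] for v in k)
        data = tuple(row[v] for v in d)
        # setdefault keeps the first data seen for this key;
        # any later disagreement breaks the dependency.
        if seen.setdefault(key, data) != data:
            return False
    return True


# Hypothetical sample: course determines units, but not sid.
S = [
    {"sid": "1", "course": "CS121", "units": 2},
    {"sid": "2", "course": "CS121", "units": 2},
]
```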

##### Example -- university
6. U::=
Net{
sid, name, address, sched_number, course, time, course_number, course_name, dept, units,...::string.

}=::U.
The following are some of the dependencies that we can use to organise a model of this university:
7. courses::@U(course)->(course_number, course_name, dept, units).
9. enrolment::@U(sched_number,sid)->().
10. class::@U(sched_number)->(course, time).

1. ...

#### Rationalisation

It is possible to express any finite collection of structured sets as a network database by applying rules like those in the LDST step in SSADM. Here are some examples,
2. For all A:D, C:#D, if A= ><(C) then for all i:pre(C), some
3. I:/μ(i.th)( I in F(A, C[i])).
4. For all A:D, s:Documentation, (v,B):declarations(s), if A=s then v
5. in F(T,B).
6. For all A,B:D, if A==>B then "Id" in F(A,B).
7. For all A,B:D, I:/μ(in.@(A,B)), if B==>@A then I in D and "1st"
8. in F(I, A) and "2nd" in F(I, B).
9. For all A,B:D,T:@@(A,B) then "1st" in F(T,A) and "2nd" in F(T,B).
10. For all A,B:D,X:Sets, if A=X->B then A in @@(X,B) and
11. "2nd" in F(T,B) and for some I:/μ(X)(I in D and "1st" in
12. F(I, A).
13. This last rule also applies when A=B^n, A=#B, ...

#### Data

For D1,D2:documentation, a:Arrow,
14. Net{D1} a Net{D2}={R:@\${D1, D2} | (R) \${D1} a \${D2}}.

For two lists of declarations D1, D2, Data{D1. D2}::=Net{D1}->Net{D2}.

Since D1 and D2 have no definitions or wffs then

15. Data{D1. D2}=@{ Net{D1. D2} | for all D1, one D2}

If we are discussing a fixed set of variables and types with unique identifiers then this is abbreviated further - suppose we have a DICTIONARY; DICTIONARY=Net{ v1:T1, ...} and let type(v[i])=t[i], then Data{ k1,k2,...,k[n]. d1,d2,..., d[m]} ::=@DICTIONARY(k1,k2,...,k[n])->(d1,d2,...,d[m]).

#### Data Base Notation

If D is a list of declarations then \/{D} is a discriminated union and so a natural model for a data base. The components of D are the different types of data/entities.

Open Example Data Base -- University

16. U::=Net{from previous example ...}.
17. UNIVERSITY::=\/{
19. section::@U(sched_number)->(course, time).
20. class::@U(courses)->(dept, course_number, course_name, units, ...).
21. enrollment::@U(sid,sched_number)->(...).

}=::UNIVERSITY.

{ student(123-45-6789, "John Doe", ...),

2. student(234-56-6789, "Jane Roe", ...),
3. section(12345, CS121, (MWF, 0800..0910)),
4. section(13244, CS121, (TT, 0800..1000)),
5. course(CS121, CS, 121, "Computers and Society", 2),
6. enrollment(234-56-6789, 12345, ...),
7. ... } in @UNIVERSITY.

. . . . . . . . . ( end of section Example Data Base -- University) <<Contents | End>>

### Normalisation of data.

#### Purpose

The aim of normalization is to document all the structure in some given data in terms of simple sets and mappings. Or, in McCarthy's terminology, the purpose is to derive the abstract syntax from the concrete syntax. A third view is that we wish to discover and name all the implicit relationships. Finally we can say that normalization derives a standard semantic domain to model what is known about some data. The effect is a repeated partitioning of an initially homogeneous set of n-tuples into different classes and types.

Normalization is an analytical tool more than a design tool. The goal is to model part of the problem, not to work out detailed solutions. Design has to be concerned with performance, physical limitations, and ways to encode the necessary data. Relational data is a starting place for design and a useful user interface. Relations are simple to understand and use but tend not to use machinery efficiently. Relational retrieval systems are also limited in power[]. However they are a high powered analytical tool, separating an area into many simple relations. These relations are a canonical basis for design work.

#### 0NF, 1NF, 2NF, and 3NF

First, second, and third normalization steps operate on a set of sets of simple types by repeatedly replacing one type by two or more types with the subsets of the variables. The names of the variables are unique and their domains don't change. Often the domains are strings. Hence the domains are omitted.

One starts with samples, displays, or existing data structures. Abstraction starts by naming component data and their domains. Then a plausible key is postulated and the result normalized until all functional dependencies are represented by key->data relationships. The 3-step process goes thru an easy to recall pattern: the key, the whole key, and nothing but the key.

1. NORMALIZATION::=following,
1. (0NF): A set of dictionaries, each listing a set of variables describing some sample forms, records, data structures, ... No duplicated variable names.
2. (1NF): For each dictionary choose a subset of the variables that can act as a primary key for that set of data. Extract repeating groups of data into new dictionaries. Make sure that each key (set of variable values) identifies a unique item in the data set.
3. (2NF): The whole key. Make sure that no proper part of the key identifies part of the data.
4. (3NF): Nothing but the key. Make sure that no non-key part of the data identifies another non-key part of the data.

Notice that there are further normalization steps (BCNF, 4NF, ...) that can be applied to expose more hidden data structure.
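The 1NF step above can be sketched in executable form. This is an illustrative Python sketch, not MATHS; the helper names and sample records are invented for the example.

```python
# Sketch of the 1NF step: given 0NF records (dicts with no duplicate
# variable names), build a table mapping key tuples to data sub-records.
# All names here (records, key fields) are invented for illustration.

def project(record, fields):
    """Return the sub-record containing only the named fields."""
    return {f: record[f] for f in fields}

def to_table(records, key, data):
    """Build a table mapping key tuples to data sub-records (1NF).
    Raises ValueError if the key does not identify a unique data item."""
    table = {}
    for r in records:
        k = tuple(r[f] for f in key)
        d = project(r, data)
        if k in table and table[k] != d:
            raise ValueError(f"key {k} does not determine the data")
        table[k] = d
    return table

# Example: each row describes one purchase.
rows = [
    {"product": "tea", "shop": "corner", "price": 3},
    {"product": "jam", "shop": "corner", "price": 2},
]
orders = to_table(rows, key=["product", "shop"], data=["price"])
```

The ValueError check is the "each key identifies a unique item" condition of the 1NF step; the 2NF and 3NF steps would then split the field lists further.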

#### Formal model

1. DICTIONARY::=Net{v[1]:D[1], v[2]:D[2], v[3]:D[3], ..., v[n]:D[n]}.
2. samples::@DICTIONARY.
3. @DICTIONARY(k1, k2, ...)?-?(d1, d2, ...).
4. samples >--\/{
5. a::@DICTIONARY(...)->(...).
6. b::@DICTIONARY(...)->(...).
7. c::@DICTIONARY(...)->(...).
8. ...
}

. . . . . . . . . ( end of section Formal model) <<Contents | End>>

#### Notation

30. Data{ key1, key2, key3, ... . non_key1, non_key2, ...}

#### Example Data Base: Shopping

Consider the data collected when grocery shopping. Here is the analysis of a shopping list as part of the work leading up to a "shopper's assistant" for an electronic organizer or personal digital assistant.

The data on the list are

1. shopping_list_data::=
Net{
product_name, shop::string. price, tax, total::Money.
1. quantity::Nat0.
2. date::Dates.
3. time::Times.
4. bought::@.

}=::shopping_list_data.

Now select a key (product_name): samples >-- \/{

2. s0::Data{product_name. shop, price, tax, total, quantity, date, time, bought}
}.
Then remove repeating data and many-to-many relations: samples >-- \/{
31. s1::Data{product_name.}.
32. s2::Data{product_name, date, shop. quantity, price, bought}.
33. s3::Data{date, shop. tax, total, time}.
}
There are no part-key dependencies. There are no non-key dependencies.
2. database::=\/{
3. Products::Data{product_name:string.}.
4. Orders::Data{product_name:string, date:Dates, shop:string. quantity:Nat0, price:Money, bought:@}.
5. Shopping_trips::Data{date:Dates, shop:string. total, tax:Money, time:Times}.

6. |-product_name in Orders->Products.
7. |-(date,shop) in Orders->Shopping_trips.

}
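The two constraints at the end of the database (product_name in Orders->Products, (date,shop) in Orders->Shopping_trips) are referential-integrity conditions that can be checked mechanically. A hedged Python sketch follows; the tables and sample values are invented for illustration.

```python
# Check the two referential constraints in the shopping database:
# every Order's product_name must appear in Products, and every
# Order's (date, shop) pair must appear in Shopping_trips.
# Sample data below is invented for illustration.

products = {("tea",), ("jam",)}                  # key: product_name
shopping_trips = {("2012-04-18", "corner")}      # key: (date, shop)
orders = {                                       # key: (product_name, date, shop)
    ("tea", "2012-04-18", "corner"): {"quantity": 1, "price": 3, "bought": True},
    ("jam", "2012-04-18", "corner"): {"quantity": 2, "price": 2, "bought": False},
}

def check_reference(source, target, pick):
    """Every picked sub-key of a source key must exist in the target table."""
    return all(pick(k) in target for k in source)

ok1 = check_reference(orders, products, lambda k: (k[0],))             # product_name
ok2 = check_reference(orders, shopping_trips, lambda k: (k[1], k[2]))  # (date, shop)
```

Both checks return True for this sample; a missing product or trip would make the corresponding check False.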

Here is the analysis of a receipt from a cashier at a local supermarket:

• RECEIPT::=
Net{
name_of_store, register_n, operator_n, payment_method::string.
1. line_item::Data{product_desc:string.
2. unit_price:Money, qty:Nat, total_price:Money, tax_flag:@}.
3. date::Dates.
4. time::Times. subtotal, tax_paid, amount_tendered, change::Money.

}=::RECEIPT.

• 0NF_RECEIPT::=\/{r1:@RECEIPT(name_of_store, date, time)->(...).}

Extract the repeating group of line items at the store.

• 1NF_RECEIPT::=\/{
• r1::@RECEIPT(name_of_store, date, time)-> (reg_n, op_n, subtotal, payment_method, amount, change, paid).
• r2::@RECEIPT(name_of_store, date, time, product_description)-> (qty, unit_price, total_price, tax_flag).
} Now remove the part-key dependencies:
• 2NF_RECEIPT::=\/{
• r1::@RECEIPT(name_of_store, date, time)->(reg_n, op_n, subtotal, payment_method, amount, change, paid).
• r2a::@RECEIPT(name_of_store, date, product_description)->(unit_price, tax_flag).
• r2b::@RECEIPT(name_of_store, time, product_description)->(qty, total_price).
} There are no non-key dependencies, so
• 3NF_RECEIPT::=2NF_RECEIPT.
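The part-key test used above (does some subset of the key already determine part of the data?) is just a functional-dependency check. A minimal Python sketch, with field names loosely taken from the receipt example but the implementation and sample rows invented here:

```python
# A functional dependency X -> Y holds in a set of rows iff no two rows
# agree on X but disagree on Y. This is the test applied at each
# normalization step; the rows below are invented receipt line items.

def fd_holds(rows, xs, ys):
    """True iff the dependency xs -> ys holds in rows (a list of dicts)."""
    seen = {}
    for r in rows:
        x = tuple(r[f] for f in xs)
        y = tuple(r[f] for f in ys)
        if seen.setdefault(x, y) != y:
            return False
    return True

line_items = [
    {"store": "mart", "date": "d1", "time": "t1", "product": "tea",
     "unit_price": 3, "qty": 1},
    {"store": "mart", "date": "d1", "time": "t2", "product": "tea",
     "unit_price": 3, "qty": 2},
]
# unit_price depends on (store, date, product) alone -- a part-key dependency:
assert fd_holds(line_items, ["store", "date", "product"], ["unit_price"])
# but qty needs the time as well:
assert not fd_holds(line_items, ["store", "date", "product"], ["qty"])
```

A dependency on a proper subset of the key, as with unit_price here, is exactly what the 2NF step extracts into its own relation.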

data_base{

• Visit_to_store::Data{name_of_store, date, time. reg_n, op_n, subtotal, payment_method, amount, change, paid}.
• Pricing_at_store::Data{name_of_store, date, product_description. unit_price, tax_flag}.

.Note Stores have different product descriptions, so even though the state determines taxes, the relationship between product_description and tax will differ between stores. This would not be true if something like the UPC was used as a key.

##### Example -- Universal Product Coding
• UPC_based_data::=
Net{
1. Pricing_at_store::Data{name_of_store, date, upc. unit_price}.
2. UPC::Data{upc. product_description, bar_code, ...}.
3. Taxing::Data{upc, date. tax_flag}.

}=::UPC_based_data.

• Bought::Data{name_of_store, date, time, product_description. qty, total_price}.

• |-(name_of_store, date, time) in Bought->Visit_to_store.
• |-(name_of_store, date, product_description) in Bought->Pricing_at_store.
}

#### Multi-valued Dependencies

.Reference Wu 92, Margaret S. Wu, The Practical Need for a Fourth Normal Form, 23rd SIGCSE Technical Symposium on Computer Science Education 92, SIGCSE Bulletin V24n1 (Mar 1992), pp19-23

##### Example -- Hospital
Physicians work at locations and are members of groups. The following is in 3NF (and BCNF):
1. Physician_work::@\${Physician_id, Group_id, Location_id:Identifiers}.

However it is not in 4NF, because (in this hospital) if a physician p is a member of groups g1 and g2 and works at locations l1 and l2 then 4 tuples have to be used to record all the relationships:
2. (p,g1,l1)
3. (p,g2,l1)
4. (p,g1,l2)
5. (p,g2,l2)

If the physician leaves g1 then two tuples must be removed: (p,g1,l1) and (p,g1,l2). Hence to get 4NF the relationship Physician_work must be split, as follows:
7. membership::Data{Physician_id. group_id}. and
8. works_at::Data{Physician_id. location_id}. So
9. membership={(p,g1), (p,g2), ...}.
10. works_at={(p,l1), (p,l2), ...}.
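The 4NF split is lossless: joining membership and works_at on the physician reconstructs exactly the four original tuples, and the update anomaly disappears. A Python sketch of this check, using the hospital data above:

```python
# Verify the 4NF decomposition is lossless: the natural join of
# membership and works_at on the physician reconstructs the original
# four Physician_work tuples.

physician_work = {("p", "g1", "l1"), ("p", "g2", "l1"),
                  ("p", "g1", "l2"), ("p", "g2", "l2")}
membership = {("p", "g1"), ("p", "g2")}
works_at = {("p", "l1"), ("p", "l2")}

joined = {(p, g, l) for (p, g) in membership
                    for (q, l) in works_at if p == q}
assert joined == physician_work

# If the physician leaves g1, only one membership tuple changes,
# instead of two tuples in the unnormalized relation:
membership.discard(("p", "g1"))
joined = {(p, g, l) for (p, g) in membership
                    for (q, l) in works_at if p == q}
```

After the update, the join yields only the (p, g2, *) tuples, matching what removing two tuples from the unnormalized relation would have produced.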

Let R be a subset of type T with maps x:T->X, y:T->Y, and z:T->Z. A multivalued dependency MVD(x,y,z) holds iff, whenever two tuples of R agree on x but have different y and z values, R also contains the two tuples formed by exchanging their y values.

For Types T, R:@T,x,y,z:#variables(T),

11. R in MVD(x,y,z)::=for all r1, r2:R, k:=r1.x, if r2.x=k and r1.y<>r2.y and
12. r1.z<>r2.z then for some r3,r4:R (r3.(x,y,z)=(k,r1.y,r2.z) and
13. r4.(x,y,z)=(k,r2.y,r1.z)).

An MVD(x,y,z) is trivial if y is a subset of x (|y==>|x) or x and y together form the complete set of variables of T (|(x!y)=variables(T)).

A set of relations in BCNF is in 4NF iff all MVDs are trivial.
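The formal definition above can be transcribed almost directly into a checker. An illustrative Python sketch (the relation is a list of dicts; the field groups x, y, z follow the definition, and the hospital example supplies the test data):

```python
# MVD(x, y, z) per the definition: whenever two tuples agree on x and
# differ on both y and z, the tuples obtained by exchanging their
# y values (r3 and r4 in the formal version) must also be present.

def proj(r, fields):
    """Project a tuple (dict) onto the named field group."""
    return tuple(r[f] for f in fields)

def has_mvd(rows, x, y, z):
    keyed = {(proj(r, x), proj(r, y), proj(r, z)) for r in rows}
    for (kx, y1, z1) in keyed:
        for (kx2, y2, z2) in keyed:
            if kx == kx2 and y1 != y2 and z1 != z2:
                # the exchanged tuples r3 and r4 must exist
                if (kx, y1, z2) not in keyed or (kx, y2, z1) not in keyed:
                    return False
    return True

# The hospital relation: p in groups g1, g2, working at l1, l2.
hospital = [{"phys": "p", "group": g, "loc": l}
            for g in ("g1", "g2") for l in ("l1", "l2")]
assert has_mvd(hospital, ["phys"], ["group"], ["loc"])

# Dropping one of the four tuples breaks the dependency:
assert not has_mvd(hospital[:-1], ["phys"], ["group"], ["loc"])
```

A nontrivial MVD found by such a check is the signal to split the relation, as was done with membership and works_at above.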

. . . . . . . . . ( end of section Normalisation of data) <<Contents | End>>

### Quantification and Verification ...

• DATABASE::=
Net{
...
1. For A,B, p:P(A,B), arrow(p)::=(A, p, B). CATEGORY(Objects=>names(D),
2. Arrows=>{arrow(p)||for some A,B:D (A=B and p=Id(A) or p in
3. paths(A,B))
4. }
5. )

...

6. For T1,T2:D, f:F(T1,T2), ratios(f)::={n:Nat0 || for some y:T2( n=|/f(y)| )}.
7. For T1:D, size(T1)::= |T1|. ...
8. For T1,T2:D, f:F(T1,T2), average_ratio(f)::= (+[y:T2](|/f(y)|))/|T2|.
9. For T1,T2:D, f:F(T1,T2), ok(f):@::=(average_ratio(f) in (|T1|/|T2|)*(0.90..1.10)).

}=::DATABASE.
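The average_ratio formula above sums the sizes of the inverse images |/f(y)| over T2. Since the inverse images of a total map partition T1, the sum is |T1|, which is what the ok(f) test compares against. A small Python sketch; the sets and the map f are invented for illustration.

```python
# average_ratio(f) = (sum over y in T2 of |inverse image of y|) / |T2|.
# For a total map f the inverse images partition T1, so the sum is |T1|.
# Sets and mapping below are invented for illustration.

T1 = {"a", "b", "c", "d", "e", "f"}
T2 = {1, 2, 3}
f = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3, "f": 3}

def average_ratio(f, codomain):
    inverse_sizes = [sum(1 for x in f if f[x] == y) for y in codomain]
    return sum(inverse_sizes) / len(codomain)

assert average_ratio(f, T2) == len(T1) / len(T2)
```

In practice the interest is in the distribution of the individual ratios(f) values, since the average alone cannot distinguish a uniform map from a highly skewed one.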

. . . . . . . . . ( end of section Types as a Data Model) <<Contents | End>>

## Types as Object-oriented objects

To Be Done!

. . . . . . . . . ( end of section Types as Object-oriented objects) <<Contents | End>>

## Notes on MATHS Notation

Special characters are defined in [ intro_characters.html ] that also outlines the syntax of expressions and a document.

Proofs follow a natural deduction style that start with assumptions ("Let") and continue to a consequence ("Close Let") and then discard the assumptions and deduce a conclusion. Look here [ Block Structure in logic_25_Proofs ] for more on the structure and rules.

The notation also allows you to create a new network of variables and constraints. A "Net" has a number of variables (including none) and a number of properties (including none) that connect variables. You can give them a name and then reuse them. The schema, formal system, or an elementary piece of documentation starts with "Net" and finishes "End of Net". For more, see [ notn_13_Docn_Syntax.html ] for these ways of defining and reusing pieces of logic and algebra in your documents. A quick example: a circle might be described by Net{radius:Positive Real, center:Point, area:=π*radius^2, ...}.

For a complete listing of pages in this part of my site by topic see [ home.html ]

## Notes on the Underlying Logic of MATHS

The notation used here is a formal language with syntax and a semantics described using traditional formal logic [ logic_0_Intro.html ] plus sets, functions, relations, and other mathematical extensions.

For a more rigorous description of the standard notations see

• STANDARD::= See http://www.csci.csusb.edu/dick/maths/math_11_STANDARD.html

## Glossary

• above::reason="I'm too lazy to work out which of the above statements I need here", often the last 3 or 4 statements. The previous and previous-but-one statements are shown as (-1) and (-2).
• given::reason="I've been told that...", used to describe a problem.
• given::variable="I'll be given a value or object like this...", used to describe a problem.
• goal::theorem="The result I'm trying to prove right now".
• goal::variable="The value or object I'm trying to find or construct".
• let::reason="For the sake of argument let...", introduces a temporary hypothesis that survives until the end of the surrounding "Let...Close.Let" block or Case.
• hyp::reason="I assumed this in my last Let/Case/Po/...".
• QED::conclusion="Quite Easily Done" or "Quod Erat Demonstrandum", indicates that you have proved what you wanted to prove.
• QEF::conclusion="Quite Easily Faked", indicates that you have proved that the object you constructed fits the goal you were given.
• RAA::conclusion="Reductio Ad Absurdum". This allows you to discard the last assumption (let) that you introduced.