.RoadWorksAhead [click here Languages and Grammars if you can fill this hole]
The mathematical model of a language is a set of strings. Each string is a finite sequence of zero or more symbols drawn from a finite set, often called an alphabet.
See also
Indeed, Union and Concatenation give languages a ring-like structure (strictly a semiring, since union has no inverses), but intersection and concatenation do not, because concatenation does not distribute over intersection.
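The algebraic point above can be checked directly by modeling languages as Python sets of strings. This is a minimal sketch with made-up example languages: concatenation distributes over union, but only one direction of the distributive law holds for intersection.

```python
# Model a language as a Python set of strings over a finite alphabet.
def cat(A, B):
    """Concatenation of two languages: each string of A followed by each string of B."""
    return {a + b for a in A for b in B}

A = {"a"}
B = {"aa"}
C = {"a", "aa"}

# Concatenation distributes over union ...
assert cat(A | B, C) == cat(A, C) | cat(B, C)

# ... but not over intersection: here the left side is empty
assert cat(A & B, C) == set()            # A & B is empty, so the product is empty
# ... yet the intersection of the two products is not empty.
assert cat(A, C) & cat(B, C) == {"aaa"}
```

The counterexample works because "aaa" factors two different ways: as "a"+"aa" (from A·C) and as "aa"+"a" (from B·C), even though A and B share no strings.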
[click here codes, equivalence, etc. if you can fill this hole]
. . . . . . . . . ( end of section Properties of Languages) <<Contents | End>>
Grammars -- especially written in various extensions of Backus-Naur Form (BNF) -- are commonly used to define computer languages. Several examples can be found in my [ ../samples/ ] directory.
Different restrictions on the form of productions define different classes of languages. Many of these have been studied in detail.
Here is a table of some classic grammatical types.
The form of productions is shown with capital letters (A, B, C) used for Nonterminals, lower case letters for terminal symbols, and Greek letters ( α, β, ...) for strings of terminals and nonterminals.
Table
Name | Form of Productions | Notes |
---|---|---|
Regular | A->a, A->a B | Chomsky |
Context Free | A->α | Chomsky |
Conjunctive Grammars | A->α & β & ... | See [Okhotin03] and my own [ ../monograph/03-intersections.html ] |
Boolean Grammars | A->α & β & ... & \not γ & \not δ | See [Okhotin04b] |
Context_sensitive | α A β->α γ β | Chomsky |
Context_dependent | α->β | Chomsky |
Few useful languages are wholly context free, so more general forms of grammar are needed. However, CFGs are the commonest form of grammar used in practice. They provide useful information in Language Reference Manuals and the easiest way of designing compilers and interpreters. They are also the basis of Jackson Structured Programming.
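The use of CFGs to design interpreters can be sketched with a recursive-descent recognizer, where each nonterminal of the grammar becomes one function. The grammar here is hypothetical, chosen only for illustration:

```python
# A minimal recursive-descent recognizer for the context-free grammar
#   expression ::= term { "+" term }
#   term       ::= digit { digit }
# Each nonterminal becomes a function that consumes input from position i
# and returns the position where its phrase ends.
def parse_expression(s, i=0):
    i = parse_term(s, i)
    while i < len(s) and s[i] == "+":
        i = parse_term(s, i + 1)
    return i

def parse_term(s, i):
    if i >= len(s) or not s[i].isdigit():
        raise SyntaxError(f"digit expected at position {i}")
    while i < len(s) and s[i].isdigit():
        i += 1
    return i

def recognizes(s):
    try:
        return parse_expression(s) == len(s)
    except SyntaxError:
        return False

assert recognizes("1+23+4")
assert not recognizes("1++2")
```

The one-function-per-nonterminal correspondence is what makes CFGs "the easiest way" to hand-write a compiler front end: the code's call structure mirrors the grammar's productions.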
Note this is more general than Chomsky's definition of a CFG but describes the same family of languages.
word :: lexeme, purpose or definitions,
name :: lexeme = string, purpose.
name :: lexeme = regular_expression, purpose.
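Definitions of the form `name :: lexeme = regular_expression, purpose.` can be mirrored in code with a table mapping lexeme names to regular expressions. This is a sketch only; the lexeme names and patterns below are invented, not taken from MATHS:

```python
import re

# Hypothetical lexeme definitions in the style of
#   name :: lexeme = regular_expression, purpose.
lexemes = {
    "identifier": r"[A-Za-z_][A-Za-z0-9_]*",   # purpose: names of things
    "number":     r"[0-9]+",                   # purpose: unsigned integers
}

def is_lexeme(kind, s):
    """True when the whole string s matches the named lexeme."""
    return re.fullmatch(lexemes[kind], s) is not None

assert is_lexeme("identifier", "alpha_1")
assert not is_lexeme("number", "12a")
```

Using `re.fullmatch` rather than `re.match` matters here: a lexeme definition describes complete strings, not prefixes.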
Parameters are used to indicate correspondences across sequences, etc.
Example
Let
By default you can assume that a sequence is parsed as an array or list (of type #X for some X). Parameterized items become mappings from parameters to content (of type X<>->Y). For more, see [ notn_2_Structure.html ] [ Definitions in notn_14_Docn_Semantics ] [ notn_13_Docn_Syntax.html ] , for an example of how the formal grammar of MATHS maps into a structured object.
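The two default encodings above can be illustrated with ordinary Python values. The grammar fragments and names here are hypothetical, chosen only to show the shape of the data:

```python
# A parsed sequence (type #X) is naturally a list of X values ...
parsed_sequence = ["begin", "x", ":=", "1", "end"]   # e.g. #lexeme

# ... and a parsed parameterized item (type X<>->Y) is naturally a
# mapping from parameter to content.
parsed_parameterized = {
    "radius": "2.0",
    "center": "(0,0)",
}

assert parsed_sequence[0] == "begin"
assert parsed_parameterized["radius"] == "2.0"
```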
A translation can be defined by giving two corresponding grammars. One defines the input, and the other the output. Tags indicate corresponding strings in the two grammars.
A more sophisticated approach is to define the input and the output simultaneously. Each definition then defines not a set of strings but a set of pairs of strings (equivalent, therefore, to a relation between two languages). A simple version of this idea can be found in Aho and Ullman, Volume 1.
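A classic instance of such a paired definition is infix-to-postfix translation, in the style of Aho and Ullman's syntax-directed translation schemes. This sketch reuses a hypothetical input grammar `expression ::= term { "+" term }`; each function returns a pair (output string, position), so the definition relates input strings to output strings:

```python
# Syntax-directed translation: infix expressions to postfix.
# Each parse function returns (translated output, next input position),
# so a nonterminal defines a set of (input, output) pairs.
def translate_expression(s, i=0):
    out, i = translate_term(s, i)
    while i < len(s) and s[i] == "+":
        right, i = translate_term(s, i + 1)
        out = out + right + "+"      # emit the operator after both operands
    return out, i

def translate_term(s, i):
    if i >= len(s) or not s[i].isdigit():
        raise SyntaxError(f"digit expected at position {i}")
    return s[i], i + 1

assert translate_expression("1+2+3") == ("12+3+", 5)
```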
A parser constructs a data structure that encodes the input sequence. Each non_terminal in the grammar will have its own associated type of object in this data structure. A particular input is encoded as a set of objects which in general will not be a tree but a directed acyclic graph (DAG). In MATHS a simple formal model might simultaneously define the abstract and the concrete syntax like this:
(expression, expressions) ::@#Alphabet><Sets=(f:function e:delimitted | e:delimitted dot f:function,$ {f:functions, e:expressions}).
Clearly this involves much redundancy and makes the definition harder to parse. A more readable convention is to place the abstract syntax immediately following the concrete syntax:
Thus a sequence with tags leads to an object with tags identifying its components, plus a special set of pointer tags identifying the object of which this object is a part.
In (A|B|C|...) only one object will be constructed, but it will be one of the alternative types indicated. The alternatives may or may not have overlapping component types. When the alternatives overlap, the grammar is ambiguous. You should then assume that the earlier alternatives take priority, and a parser should look for the first alternative that fits.
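The first-alternative-wins rule can be made concrete with an ordered-choice combinator, a small sketch in the style of parsing expression grammars (the combinator names are my own):

```python
# Ordered choice for (A|B|C|...): try each alternative in turn and
# commit to the first that fits, as the priority rule above prescribes.
def choice(*alternatives):
    def parse(s, i):
        for alt in alternatives:
            result = alt(s, i)
            if result is not None:
                return result        # earlier alternatives take priority
        return None
    return parse

def literal(word):
    def parse(s, i):
        if s.startswith(word, i):
            return (word, i + len(word))
        return None
    return parse

# "if" and "iffy" overlap; ordered choice picks "if" even inside "iffy".
keyword = choice(literal("if"), literal("iffy"))
assert keyword("iffy...", 0) == ("if", 2)
```

Note the consequence of the priority rule: listing a shorter alternative first can shadow a longer one, so the order of alternatives is part of the grammar's meaning.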
In a concurrent expression A & B & C ... the tags should all be distinct. The object is constructed by all the parsings, so each concurrent part must have its own distinct tags. Thus each concurrent parsing operates on its own part of the tpl, which can be implemented as a separately locked record, and so the exact timing of the concurrent parsings is irrelevant.
In (A~B) it is assumed that B will be attempted beforehand, and if it fails then its n-tpl is removed and/or replaced by the n-tpl (if any) described in A.
By extending the concept of simultaneous definitions to an n-ary relation between sets of inputs, outputs, and objects, a formal version of JSP is developed.
A grammar can be used to describe the possible and/or significant patterns of events in a software system. This is the basis of JSD and part of SSADM.
Proofs follow a natural deduction style: they start with assumptions ("Let"), continue to a consequence ("Close Let"), and then discard the assumptions and deduce a conclusion. Look here [ Block Structure in logic_25_Proofs ] for more on the structure and rules.
The notation also allows you to create a new network of variables and constraints. A "Net" has a number of variables (possibly none) and a number of properties (possibly none) that connect the variables. You can give a Net a name and then reuse it. A schema, formal system, or elementary piece of documentation starts with "Net" and finishes with "End of Net". For more, see [ notn_13_Docn_Syntax.html ] for these ways of defining and reusing pieces of logic and algebra in your documents. A quick example: a circle might be described by Net{radius:Positive Real, center:Point, area:=π*radius^2, ...}.
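The circle Net above maps naturally onto a record type in a programming language: variables become fields and a defined property becomes a derived value. This is a sketch only; the representation of Point as a pair is my own assumption:

```python
from dataclasses import dataclass
import math

# A sketch of the circle Net as a Python class.
# radius : Positive Real and center : Point become fields;
# the defined property  area := pi * radius^2  becomes a derived value.
@dataclass
class Circle:
    radius: float                 # assumed positive
    center: tuple                 # Point, represented here as an (x, y) pair

    @property
    def area(self):
        return math.pi * self.radius ** 2

c = Circle(radius=2.0, center=(0.0, 0.0))
assert abs(c.area - 4 * math.pi) < 1e-9
```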
For a complete listing of pages in this part of my site by topic see [ home.html ]
For a more rigorous description of the standard notations see