[Skip Navigation] [CSUSB] / [CNS] / [Comp Sci Dept] / [R J Botting] / [ MATHS ] / math_63_Languages
[Contents] [Source Text] || [Notation] || [Copyright] || [Contact] [Search ]
Tue Sep 18 15:18:35 PDT 2007

Contents


    Languages

      .RoadWorksAhead

      The mathematical model of a language is a set of strings. The strings must be sequences of zero or more symbols taken froma finite set often called an alphabet.

    1. Languages::=following
      Net

      1. |-STRINGS
      2. Alphabet::Finite_set.
      3. Strings::=#Alphabet.
      4. Languages::=@Strings.

      (End of Net)

      See also

    2. STRINGS::= See http://csci.csusb.edu/dick/maths/math_62_Strings.html.

      Grammar

      A Grammar defines a language and imposes a stucture on the strings in that language. Chomski invented a model for how grammars work. A grammar describes (explicitly) rules for generating a correct strings in a given language. If well chosen each string in the language can be generated in essentially only one way. This way of generating the string defines the structure of the string - how it is parsed.

    3. GRAMMAR::=
      Net{
      1. Alphabet::Finite=given.
      2. productions::Sets.
      3. P::=@productions,
      4. terminals::@#Alphabet=given,
      5. T::= terminals,
      6. non_terminals::Finite@Name=given,
      7. N::= non_terminals=given,
      8. vocabulary::=T|N.
      9. V::=vocabulary.
      10. source::non_terminals=given,
      11. S::=source,

      12. meta_language::=#(production | comment),
      13. M::=meta_language,
      14. comment::=#character~production.
      15. |-P = img(M)&production.

      }=::GRAMMAR.

      Hierarchy of Meta_languages


    4. |-regular(T)=>>context_free_language(T) =>> para_grammar_language=>> extended_grammar_language.

      No useful languages are context free - so more general form of grammars are needed. However CFGs are the commonest form of grammar use din practice.

    5. Context_Free_Grammars::=CFG, below:
    6. CFG::=
      Net{
        Extended Backus-Naur Form is equivalent to a context free grammar(CFG). There exist standard tools for handling context free grammars.


      1. |-LEXICAL.


      2. |-GRAMMAR(production=>context_free_production).
      3. context_free_production::=non_terminal ":"":""=" regular_expression, Regular_Expressions(terminal|non_terminal).

      }=::CFG.

    7. Para_Grammar::=following,
      Net


      1. |-GRAMMAR(production=>para_production).
      2. para_production::= non_terminal ":"":""=" para_regular_expression,
      3. para_regular_expression::=(("&" | "|") parameters|) item #(("&"|"~") item),
      4. item::= (|hash)(terminal |non_terminal | "(" selection ")" | parameter),
      5. selection::= sequence #("|"sequence),
      6. sequence::= #(item | parameterized),

      7. parameterized::= parameters item,
      8. parameters::=l_bracket binding #(comma binding) r_bracket,
      9. binding::=parameter #(comma parameter) ":"item,
      10. parameter::=variable.

        Parameters are used to indicate correspondences across sequences, etc.

        Example
        Let

        1. dup::=|[w:#Bits ](w w).
        2. defines a context sensitive language in a very simple form.

        (Close Let )

      (End of Net Para_Grammar)

      Semantics

      The meaning of a grammar depends on how it is used - lexical or syntactical, .... Specialized languages can be constructed by using a grammar with different sets of terminal symbols and by interpretting the structures and tags differently.

      By default you can assume that a sequence is parsed as an array or list (of type #X for some X). Parameterized items become mappings from parameters to content(of type X<>->Y). For more, see [ notn_2_Structure.html ] [ Definitions in notn_14_Docn_Semantics ] [ notn_13_Docn_Syntax.html] , for an example of a how the formal grammar of MATHS maps into a structured objet.

      Lexical

      Tags should be put on sequences to indicate fields in the tokens generated by an associated lexer. A lexer is built by associating character input(lazy), token output and conditions with the grammar. All that is needed is to define the special non_terminal 'lexeme'.

      .Example GRAMMAR and

      Net{
      1. source::=#(lexeme|#character~lexeme),
      2. tokens::=#token
      3. TOKEN::=Net{t:{"N","W"}, s:#character, l:=len(s)}, (lexeme,token) ::=(n:number, (t=>"N", s=>n))|(w:word,(t=>"W",s=>w),
      4. number::=digit #digit,
      5. word::=letter #(letter|digit)
        }.

      Formatters

      The tags indicate fields in tokens input by the formatter and placed in sequence in the context of the untagged strings. A formatter is built by associating token input, character output and conditions with the grammar. The special non terminal 'output' needs to be defined.

      For example:

    8. Example::=following,
      Let
      1. TOKEN::=Net{t:{"N","W"}, s:#character, l:=len(s)},
      2. data::= #datum,
      3. output::=#line, (datum,line) ::=($ TOKEN(t=>"N"),"Number:" s:number EOLN)|($ TOKEN(t=>"W"),"Word:" s:word EOLN),
      4. number::=digit #digit,
      5. word::=letter #(letter|digit).

      (Close Let )

      Translation

      A translation can be defined by giving two corresponding grammars. One defines the input, and the other the output. Tags indicate corresponding strings in the two grammars.

      More sophisticated is to simultaneously define both the input and the output at the same time. Here each definition doesn't define a set of strings, but does define a set of pairs of strings (equivalent therefore to a relationship between two languages). A simple version of this idea will be found in Aho and Ullman Volume 1.

      Parsers

      A parser constructs a data structure that encodes the input sequence. Each non_terminal in the grammar will have its own associated type of object in this data structure. A particular input is encoded as a set of objects which in generalwill not be a tree but a directed acyclic graph(dag). In MATHS a simple formal model is two simultanously define the abstract and the concrete syntax - (expression, expressions) ::@#Alphabet><Sets=(f:function e:delimitted | e:delimitted f:function,$ {f:functions, e:expressions}).

      Clearly this involves much redundancy and so a convenient abreviation is to place the abstract syntax imediately folling the concrete syntax:

    9. expression::=f:function e:delimitted | e:delimitted f:function,
    10. expressions::=$ Net{f:functions, e:expressions}

      Thus sequence with tags leads to an object with tags identifying the components in it plus a special set of pointer tags identifying the object of which this object is part.

      In (A|B|C|...) only one object will constructed, but it will be one of the alternative types indicated. The alternatives may or may not have overlapping component types.

      In a concurrent expression the tags should all be distinct. From A & B & C ... The object is constructed by all parsings and so each concurrent part must have its own distinct tags. Thus each concurrent parsing operates on its part of the tpl, which can be implemented as a separately locked record. Thus the exact timing of the concurrent parsing is irrelevant.

      In (A~B) it is assumed that B will be attempted before hand, and if it fails then its n-tpl is removed and/or replace by the n-tple (if any) described in A.

      Data Directed Program Design

      By extending the concept of simultaneous definitions two a n-ary relation between sets of input, outputs and objects a formal version of JSP is developed.

      Dynamics

      GRAMMAR(Alphabet=>Events, terminal=>Events, non_terminal::= (Pattern|Entity(|"."Role))

      Grammars with embedded operations are programs.

      A grammar can have operations embedded in it but they then become non-deterministic data driven programs. In this case each definition (formally) defines two sets - a language and a relation. Strings in the language also satisfy the relation.

      Thus if U=$ Net{level:Nat0, S:@variables, bound_symbols:@Net{name:variable, at:Nat0, type:Types}} then we can describe the parsing of a set of bindings be a grammar on (@#Char, @(J,U)), with the properties that

      (A,R) (B,S) = (A B, R;S),

      (A,R)|(B,S)=(A|B,R|S),

      (A,R)~(B,S)=(A~B,R~S),

      and #(A,R) =(#A, do(R)).

      So (binding, note) ::=|[v:variable](v, not v in S;v|:S),

      (bindings, bind) ::=("",level'=level+1;S'={}) (binding, note) #((comma,Id) (binding, note)) ":" |[s:set](s,save(S,s)),

    11. save(S,s)::=while(S<>0)( S':~v'; (name=>v, at=>level, type=>Type(s))|:bound_symbols)

      [ Data directed Design] .

      UNIX Regular Expressions

    12. UNIX_RE::=
      Net{
      1. regular_expression::= sequence #(vertical_bar sequence),
      2. sequence::=(circumflex|)regular_item #regular_item (dollar_sign|),
      3. regular_item::=(element |"("regular_expression")")
      4. (star |plus | "{" number(comma number|) "}" ),
      5. element::=char~special_character | back_slash char | "[" set "]",
      6. set::=(circumflex|) #(char | char minus char ),
      7. special_char::=left|back_slash|vertical_bar|star|plus.

      8. There exist standard tools for encoding and interpretting regular expressions - see other.

      }=::UNIX_RE.

    Notes on MATHS Notation

    Special characters are defined in [ intro_characters.html ] that also outlines the syntax of expressions and a document.

    Proofs follow a natural deduction style that start with assumptions ("Let") and continue to a consequence ("Close Let") and then discard the assumptions and deduce a conclusion. Look here [ Block Structure in logic_25_Proofs ] for more on the structure and rules.

    The notation also allows you to create a new network of variables and constraints. A "Net" has a number of variables (including none) and a number of properties (including none) that connect variables. You can give them a name and then reuse them. The schema, formal system, or an elementary piece of documentation starts with "Net" and finishes "End of Net". For more, see [ notn_13_Docn_Syntax.html ] for these ways of defining and reusing pieces of logic and algebra in your documents. A quick example: a circle = Net{radius:Positive Real, center:Point}.

    For a complete listing of pages in this part of my site by topic see [ home.html ]

    Notes on the Underlying Logic of MATHS

    The notation used here is a formal language with syntax and a semantics described using traditional formal logic [ logic_0_Intro.html ] plus sets, functions, relations, and other mathematical extensions.

    For a more rigorous description of the standard notations see

  1. STANDARD::= See http://www.csci.csusb.edu/dick/maths/math_11_STANDARD.html

    Glossary

  2. above::reason="I'm too lazy to work out which of the above statements I need here", often the last 3 or 4 statements. The previous and previous but one statments are shown as (-1) and (-2).
  3. given::reason="I've been told that...", used to describe a problem.
  4. given::variable="I'll be given a value or object like this...", used to describe a problem.
  5. goal::theorem="The result I'm trying to prove right now".
  6. goal::variable="The value or object I'm trying to find or construct".
  7. let::reason="For the sake of argument let...", introduces a temporary hypothesis that survives until the end of the surrounding "Let...Close.Let" block or Case.
  8. hyp::reason="I assumed this in my last Let/Case/Po/...".
  9. QED::conclusion="Quite Easily Done" or "Quod Erat Demonstrandum", indicates that you have proved what you wanted to prove.
  10. QEF::conclusion="Quite Easily Faked", -- indicate that you have proved that the object you constructed fitted the goal you were given.
  11. RAA::conclusion="Reducto Ad Absurdum". This allows you to discard the last assumption (let) that you introduced.

End