The key point is to show that intersection do allow the definition of some (but not all) context sensitive languages that are not context free, and that these languages have interesting parallel acceptors.
This is extracted from Chapter 3 of a monograph on software development methods. It uses some notation developed in chapter 2. The notation is described in [ math.lexemes.html ] and [ math.lexicon.html ] plus [ math.syntax.html ] and [ index in math.syntax ]
Two special symbols: "==>" stands for "subset or equal to" and "=>>" stands for "proper subset of".
<...formal definitions of CFGs ...>
Therefore
The get some input relation, x? is defined by For x:I, x? ::@(#I,#I)= rel [i, i':#i] ( i=x ? i' ), -- compare [LamShankar90].
It follows that
Notice that if R is a finite relational expression containing only (;), (|), (do) and x? then R represents the behavior of a non-deterministic finite state acceptor--(FSA). Hence there is an equivalent deterministic FSA. Similarly in reverse, for each FSA there is a finite relational expression made from x?, (|), (do) and (|).
It is obvious that the usual top-down parser will take a context free MATHS grammar and construct a program that accepts language. Each definition is a regular expression. There is a simple mapping of the dictionary into a set of definitions of relations that model the parsing of the input by the grammar.
Table of Mapping from Dictionary to Program
Dictionary Relation Notes
definition a::=b C_a::=B B is the translationof b.
terminal t t? t is removed from front of i
non_terminal a C_a C_a is a relation that consumes all of a
sequence a b c... A; B; C; ... Where a, b, c...translate to A, B, C, ...
selection (a|b|c) (A|B|C|...) Where a, b, c...translate to A, B, C, ...
iteration #a do(A) Where a translates to AFor example P::=( a P b | #c) becomes C_P::=(a?; C_P; b? | do(c?)).
To show
Proving that the translation of a grammar into a top-down parser always produce a relation that accepts exactly the language described by the grammar is a straight forward induction on the structure of the expressions. The translation is a homomorphism between the two semi_rings.
In practice a programmer implementing an accepting relation can reduce the amount of backtracking by adding conditions like the following:
A CFL is accepted by a non-deterministic PDA, where a PDA(Push-Down-Automata) has the following structure: <...> It is easy to show that if R is defined by a simple recursion (like a context free grammar) then it can be implemented as a non-deterministic PDA.
2.5 Acceptors & Dictionaries Next extend the construction of acceptors for grammars to acceptors for dictionaries including intersections. Consider the intersection (&R) of two or more simple acceptors R1, R2, R3, ... . This relation succeeds on those strings that are accepted by all of them.
Dictionary Relation Notes
intersection a&b&c... A&B&C&... Where a,b,c...translate to A,B,C,...
complement a~b A ~ B Where a and b translate to A and B.For instance consider:
Converting dictionary example above into an acceptor with two processes gives:
So, dictionaries define languages that are recognized by networks of CFG acceptors{footnote1}. Hemendinger shows that similar rules can be used for deriving grammars that model the communications that can occur between two Ada tasks
[Hemendinger90]. Notice that it is easy to transform acceptors into automata that report at each step whether or not the input is currently acceptableand then connect these to an "and-gate". <...>
This kind of structure - a Directed Acyclic Graph(DAG) of simple automata is found in naturally occuring recognisers - eyes, ears, and brains. Gelernter would calls it a trellis. It is known that any finite state automata can be decomposed into such a DAG of simple finite state automata and delay machines - where a simple automata is one that has a "simple group" [HartmenisandStearns]. However, there are mathematically simple automata of arbitrary size and complexity and so this result is not very helpful. Further it does not help us to understand what can be done by networks of recognizers that have stacks in them. This is the next topic.
<...more on my own metalinguistic MATHS language...>
A simple context free dictionary has a comparatively simple top-down acceptor. The correspondence has proved its value in practical projects [ DDD in newsubjects ]
Intersections in the dictionary lead to a network of parallel recognizers.
Two questions arise:
Net
The family of languages defined by these grammars is a superset of the Boolean closure of the family of context free languages and a proper subset of the context sensitive languages.
Defining a language as the intersection of several grammatical constraints is
not new - Constrained Expressions are expressive and based on simple regular
expressions
[Dillonetal86].
[HehnerSilverberg83] use the term
Communicating Grammars for a set of grammars with some terminals and
non-terminals occurring in both. The model here is simpler - I propose the
terms concurrent context free grammars -- -- --(CCFG)
for the dictionary model
proposed here, and concurrent context free languages-- -- --(CCFLs)
for the family
of languages defined by CCFGs. The Turing project used explicit intersections
[Holt & Cordy 88]. Hemendinger notes that Ada tasks have communication
patterns that are CCFLs
[Hemendinger90].
<formal model...>
Let CFL^0 be the regular languages and CFL^1 the context free languages.
A language is in
CFL^n if it is the intersection of n CFL^1 languages. A similar set of
definitions can be created for any family of languages which is not closed
under intersection. The context free dimension of a language L -- -- --(CFD)
is the minimum n for which L is a CFL^n.
Liu and Weiner defined CFL^k as the k-intersection languages - but failed to specify what the intersection of zero CFLs might mean. In this case by defining the regular languages as CFL^0 implies that if a language is the intersection of a CFL^p and CFL^q language then it is a CFL^(p+q) language.
Liu and Weiner remark that trivially
Clearly,
Therefore,
Liu & Wiener presented for all p>0 languages which belong in CFL^(p+1) and not CFL^p. This proves that for all p ( CFL^p=>>CFL^(p+1)) [Liu & Wiener 73, Hemendinger 90]..
For example the language
The first language exhibits two examples of what Jackson calls an "interleaving clash", and the next a single clash, the last has no clashes [Jackson75]. I therefore conjecture that the number of interleaved groups is equal to one less than the context free dimension of the data.
Liu and Weiner [LiuWeiner73] use a correspondence between the context free dimension of a language and the dimensionality of an associated linear space in their proof. They note that determining the context free dimension of an arbitrary language is often difficult. Finding G:CCFG with degree n merely establishes an upper bound on the CF dimension of the language.
There are well known polynomial time algorithms which recognize or parse a CFL^1 language. Hence CFL^n languages will be recognized by n parallel polynomial time algorithms - and so by a polynomial time algorithm that simulates the n parallel processes. The simple parallel structure proposed to handle intersections has no interaction between the processes. Each process is isolated from all others and conceptually has its own copy of the input. There is an implicit census taker waiting for all processes to accept their inputs before it reports success of the whole. This is an easy structure to implement in most high-level languages and well suited to modern programming styles with tasking, threads, coroutine, and/or objects.
There exist CSLs that are not CCFL. An example is {a^(n^2) || n:0..} which is a CSL. If it was a CFL^p for some p then it would be the intersection of p CFLs with a single terminal symbols. A CFL with a single non-terminal is regular and so the language must be the intersection of p regular languages. Therefore it is regular. However it is obviously irregular. Therefore
A similar argument shows that if L in CFL^p for some p, and t some terminal then
CFL^n is accepted by n parallel non-deterministic PDAs. Hence it follows
that CFL^n is closed under:
Net
CFL^p is not closed under union but CCFL is because
so that the union of a CFL^p and a CFL^q is a CFL(p*q) when p,q>0.
If we permit networks with a more general acceptance criteria - a boolean operation on the individuals then the number of PDA's required for the union of two such nets is the sum of the sizes of the two nets.
Other results depend on an exemplary type of language - languages defined by linear constraints [LiuWiener73]. Given an alphabet of N elements {a1,a2,a3,...,a[N]} and n::Nat0^N=(n1,n2,...n[N]), define a**n=![i:1..n](a[i]^n[i]).
Given alphabet of N elements {a1,a2,a3,...,a[N]} and an integer linear map M:Int^N -> Int^p and vector j[1..p] then
A countably infinite union of one set of CFL^m (m=1..) is {x! x || x:#@}. Now for finite p, {x! x || x:#@} is CSL~CFL^p. However
Therefore {x! x || x:@^n} is in CFL^m.
Clearly then {x! x || x:#@(|x|<=n)} is also CFL^m , and so as m tends to infinity we get a closer and closer approximation to {x! x || x:#@ } which is a Context Sensitive Language(CSL).
Philippe Schnoebelen{footnote2} circulated the Theory-Net on the BITNet(comp.theory on Usenet) with an interesting result equivalent to the observation that
It also appears there is no additional power added to language definitions by allowing intersections to be used arbitrarily - it is enough that they be permitted at the top level.
Lower level intersections in a grammar (which may be useful and convenient) are equivalent to allowing the network of PDAs to spawn new PDAs. We can get the same effect by keeping PDAs in a do-nothing loop until needed[cf pp81-88 of [Baetan90] by J A Bergstra]. It is clear that there is no limit to the number new PDAs. So allowing lower level intersections implies using either a growing network or a fixed network of finite but unbounded size.
Sakakibara's Extended Simple Formal Systems(ESFS) define a family of languages that lie between the CCFL and CSL [Sakakibara90], p91. Sakakibara also presents a Simple Formal System(SFS) which defines the language `{ a^(2^n) || n:Nat0}` [Sakakibara91]. This is clearly not definable in terms of a finite number of intersecting grammars. Using more advanced MATHS one of Sakakibara's SFSs would be written as Net{ P::#{a}->@, P(a), for all x:#{a}, if P(x) then P(x!x). }. Moreover, let
The Simple Formal Systems [ defined Sakakibara 91 p84] can define a language that is not CCFL, and yet can not define {a^n ! b^n ! c^n || n:Nat} which is CCFL [evidence in proof of proposition 7 on p90 of Sakakibara 91].
REG=CFL^0=>>CFL=CFL^1=>>CFL^N=>>CCFL=>>ESFS=>>CSL
CFL =>> SFS =>>ESFS
JSD -- -- --(Jackson System Development)
designs systems as a large network of objects
that belong in a small number of "classes"
[Jackson78]. Each class has many
similar but out of phase objects . In these designs, each dynamic object
keeps an occurrence of an entity up to date [ See bibliography of JSP/JSD,
[ JSP in newsubjects ]
[ JSD in newsubjects ]
Chapter 9]. It appears likely that these correspond to a subset of the CSLs.
. . . . . . . . . ( end of section 3 Grammar Theory) <<Contents | Index>>
(footnote1): Psychologists use a network of "daemons" like this for modelling perceptual processes [ Human Information Processing, Lindsay & Norman, Pub Academic Press, 1970].
(footnote2): [ EMail <mcsun!inria!imag!lifia!phs@UUNET.UU.NET>, 21 Oct 89 18:01:47 GMT]
. . . . . . . . . ( end of section Part of Chapter 3) <<Contents | Index>>