This set of notes is based some email from July 2010 sent to Keith Schubert, Jane Curnutt, Ernesto Gomez, Matt Strader, and Blake Schulte. It incorporates some of their comments and corrections From Jane Curnutt and Ernesto Gomez.
First draft, April 26th 2011. Corrected errors in definition of P and q below, December 4th, 2011.
This set of notes describe some well known tools for probabilistic analysis
as they apply to Cellular Automata. I will show how a random independent
distribution of live and dead cells evolves in one step. These are all well
known applications of probability theory -- the theories can be found on
Wikipedia. I'l first work out the case when each cell has a different
probability of being alive and then assume equal probabilities -- the so called
"Mean Field Model". These notes move on to the use of De Moivre's Theorem
to simplify the mean field calculations for large neighborhoods. They
finish by describing why these formulas can not predict the long term
evolution of cellular systems.
) is an array of cells. Each cell has a state. Each
cell has the same rule (program) that calculates the new state of the cell
from it's own state and the states of the cells in a given neighborhood.
In this set of notes the state is either alive or dead and the rule is
based on the number of live cells in the neighborhood -- an ignores which
neighbors are alive or dead. This is a widely studied case that forms a
primitive lifelike system.
A cell can be alive or dead. Let p:[0..1] be the probability of being alive and q = 1-p the chance of it being dead, then p' the
chance of it being alive in the next cycle, is
- p' = p * (1-Death) + q * Birth.
Where Birth and Death depend on the cells in the immediate neighborhood.
In a case studied by Jane Curnutt,
birth is when there is one live cell 1 and no death can happen:
- p' = p + q * P(number of live cells in neighborhood is 1).
For a moment I'll assume that each cell c has its own probability p(c) of being alive.... independently of other cells.
- q(c) = 1-p(c).
So if a cell c is alive, p(c) =1 and q(c)=0. If c is dead p(c)=0 and q(c)=1. The function
- p : Cells -> [0..1],
describes a field we can call the density.
Typically the probability of Birth or Death depends on the states of the
cells in the neighborhood N(c) around the cell c. Let n = |N(c)|, the
number of cells in the neighborhood N(c). A Totalistic Cellular Automate
we are interested in how many cells in N(c) are alive. So we need to
calculate the probability of r cells being alive in the neighborhood N(c).
Symbolize this as P(r) where r varies from 0 to n and indicates the number
of live cells in the neighborhood N(c) of cell c:
- P(r)::[0..1]= the probabillity that r cells are alive.
There is an elegant way to calculate all the P(r)'s by using a probability generating function (pgf) g:
- g(t)::= P(0)*t^0 + P(1)*t^1+... P(n)*t^n = Σ [r in 0..n](P(r)*T^r).
The pgf for a single cell c in the example CA is
A second example pgf: if c is alive the pgf is t, if dead it is 1, if the cell has a
50-50 chance of being alive, (1+t)/2.
The pgf of the sum of independent random variables is the product of the pgf's of the variables. So
- g(t) = Π [c in 1..n] (q(c) + p(c) t).
So with 8 cells in a neighborhood, suppose we have 1 alive, 5 dead and 2 cells with a 50-50 chance of being alive then
- g(t) = t * 1^5 * ((1+t)/2)^2
- = t/4 + t^2/2 + t^3/4.
So the distribution is ( 0, 1/4, 1/2, 1/4, 0, 0,...). In particular, the chance of birth in Jane's case is 1/4.
Now let us assume that we create the cells with an equal and independent
chance p of being alive (and let q=1-p). Then the pgf is
- g(t) = Π [c in 1..n] (q + p t)
- = (q + p t)^n, the binomial distribution.
Now consider the normal neighborhood of 8 cells and we get this (using PAscal's triangle):
- g(t) = q^8 * t^0 + 8 * p*q^7 * t^1 + 28 *p^2*q^6 *t^2 + 56*p^3*q^5 *t^3 + 70*p^4*q^4 *t^4+ 56*p^5*q^3 *t^5 + ... +p^8*t^8.
So for birth at one we have 8 * p*q^7. Substituting in the second equation above
- p ' = p + q *(8 * p*q^7) = p(1+8*q^8).
The curve rises rapidly to a peak of roughly 0.48 at density 0.15 (roughly) then enters a trough, then tends to p'=p. At all times p'>=p.
You can work out the curve for Conway's life using the same g(t) and and
the first equation. If I've done the algebra right
- p' = p^3q^5( 84q + 56p ) -- (Conway)
So when the Birth and Death depends on the number of live cells in the neighborhood, and
each cell in the neighborhood has the same probability p of being alive (a mean field model), and n= number of cells in a neighborhood, and q=1-p then
the Probability of r live cells in the neighborhood is
- P(r) = C(n,r) p^r q^(n-r),
- C(n,r)= n! / (r! (n-r)!).
the famous Binomial Distribution.
This lets you graph the probability, p', that a cell is alive in the next
step... Which in turn tells you the best density for a randomized start.
This formula is easy to use on small neighborhoods like Conway's 8 cell
neighborhood. Similarly the 2 cells next to a cell on a line, or
the 4 cells that share a boundary with a cell. It is not good for large
n.... like Ernesto Gomez's n=48 cell square.
However, De Moivre, many years ago discovered that there is a very good approximation to the Binomial distribution using the Gaussian Normal curve.
You can read about it on Wikipedia (nice animation).
I usually use two functions commonly called φ and Φ.
The φ is the normalized probability density function (pdf)
- φ(x) = exp(-x^2/2) / √(2*π).
Φ(x) is the integral of φ from -oo..x, the distribution function (df) of the normalized Gaussian curve. This is commonly expressed, in most languages in terms of the function erf (the error function) in a math library.
- Φ(x) = (1 + erf(x/√ 2) )/2.
Φ(x) gives a close approximation to P(r) when
- x = (r- n p)/√(n p q).
Note: the mean of the binomial distribution is n p (m) and the standard deviation is √(n p q) (s).
The Φ function is used for ranges of of rs. You can improve the approximation by adding 0.5 as a continuity correction to the argument for Φ.
For example, take Ernesto's rule (n=48), birth at b, death above d then ( x is the current density and x' the next density)...
- s:= √(48 x (1-x)),
- m:= 48 x.
- x' = (1-x) φ(b-m)/s + x Φ ( (d-m)/s)+0.5 ).
Here is a more complex example
with a range of birth birth values from b-15 to b and death after that....
- x' = (1-x) ( Φ ( (b-m)/s)+0.5 ) - Φ ( (b-15-m)/s)+0.5 ) ) + x Φ ( (b-m)/s)+0.5 ).
These models do a good job of predicting what happens if the live cells are distributed at random, with a constant density, and there is no dependence between them However Dr. Gomez's experiments demonstrate that after one step the adjacent sells are usually highly correlated. The plan looks lumpy with areas of high density and areas of low density. In many cases there is no return to randomness once this chaos has emerged.
(JC): Thesis CSUSB CSE 2010
(EG): Provate Communication
(Gosper): Paper or Thesis