.Open Probability
Wikipedia has some excellent pages on traditional theories of probability.
This page has some non-traditional takes on probability theory, plus some
useful formulas.
It has notes on a particularly simple approach
that adds a kind of division operator to symbolic logic. The alternative
is the modern mathematical theory of
.See Measure Theory
(below)
which includes probability as a special case.
.Open Theories
. Standard Axiomatic Theory of Probability
This is a MATHS approximation to the normal theory -- just some syntax and
some axioms, without getting into the semantics -- is probability a measure of belief or a limit of a frequency?
This is under construction. Contact me with corrections.... Aug 28th 2012
Good_Probability::=following
.Net
This comes from page 34 of $Good50...
For p:$wff, Pr(p)::Positive & Real=`the probability of p being true`.
Pr(`coin came up heads when tossed`) = 0.5.
Pr(`coin came up tails when tossed`) = 0.5.
|-(B2):For p,q: $wff, if Pr(p and q) = 0 then Pr(p or q) = Pr(p)+Pr(q).
|-(B3): If (if p then q) then Pr(q) >= Pr(p).
|-(B4): Pr(true) <> 0.
|-(B5): for some p, Pr(p) = 0.
(Conditional Probability):
For p,h:$wff, if Pr(h)<>0, Pr(p/h)::Real=Pr(p and h)/Pr(h), `the probability of p, given h`.
Pr(`coin came up heads`/`coin tossed`) = 0.5.
Pr(`coin came up tails`/`coin tossed`) = 0.5.
.See http://en.wikipedia.org/wiki/Conditional_probability
.Close.Net Good_Probability
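Good's definition Pr(p/h) = Pr(p and h)/Pr(h) can be sketched in Python over a finite, equally likely sample space. This is a toy model of my own for illustration; the function names are assumptions, not Good's notation:

```python
from fractions import Fraction

def pr(event, space):
    """Probability of an event (a predicate) over a finite,
    equally likely sample space."""
    hits = sum(1 for s in space if event(s))
    return Fraction(hits, len(space))

def pr_given(p, h, space):
    """Conditional probability Pr(p/h) = Pr(p and h)/Pr(h)."""
    return pr(lambda s: p(s) and h(s), space) / pr(h, space)

# Two fair coin tosses as the sample space.
space = [(a, b) for a in "HT" for b in "HT"]
first_heads = lambda s: s[0] == "H"
at_least_one_head = lambda s: "H" in s

print(pr(first_heads, space))                           # 1/2
print(pr_given(first_heads, at_least_one_head, space))  # 2/3
```

Using exact `Fraction`s avoids floating-point noise in small examples like this.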
. Probability as an extension of Logic
Professor George published the following as part of his textbook on logic
and cybernetics.
.Source George 77, Frank George, Precision, Language, and Logic, Pergamon Press, NY NY, p. 91
I've chased the approach back to John Maynard Keynes in the 1920s via
.See [RamseyFP60]!
It has the advantage of growing algebraically out of the propositional
calculus -- almost as if it added a division operator to the set of logical
operators. The main disadvantage is that some theorems and axioms
(examples: $P4, $P5, and $P6) are
more complex because they have to be expressed using a fraction.
It is
also a formal theory and so does not worry about what we mean by
"Probability". It just has the rules and assumptions a rational person
would be forced to adopt when giving values to propositions in a self-consistent way.
Georgian_Probability::=Net{
For p,h:$wff, p/h::Real=`the probability of p, given h`.
`coin came up heads`/`coin tossed` = 0.5.
`coin came up tails`/`coin tossed` = 0.5.
Note: p/h <> h/p!
.See http://en.wikipedia.org/wiki/Conditional_probability
|-(P1):For p,h:$wff,0<=p/h<=1,
|-(P2): For p,h, if (if h then p) then p/h=1,
|-(P3): For p,h, if (if h then not p) then p/h=0.
|-(P4a): For p,q,h:$wff, (p and q)/h = (p/h)*(q/(p and h)) = (q/h)*p/(q and h),
|-(P4b): For p,q,h:$wff, (p or q)/h=p/h+q/h-(p and q)/h,
()|- (P5): p/(q and h)=(p/h)*(q/(p and h))/(q/h),
()|- (P6): if (if p then q) then p/(q and h)=(p/h)/(q/h).
Notation
.See ./math_11_STANDARD.html#Serial operations
(STANDARD)|-For n:Nat, p:$wff^n, or(p) = p(1) or p(2) or ... or p(n).
(STANDARD)|-For n:Nat, x:[0..1]^n, +x = x(1) + x(2) + ... + x(n).
Local notational convenience/abuse of notation -- the `and` can be omitted.
For p,h, p h ::= p and h.
For n:Nat, partition(n):: @($wff^n), sets of n-tuples of well formed formulas:
|- For n:Nat, partition(n)={ p:$wff^n || or(p) and for all i,j:1..n(if p(i) and p(j) then i=j)}.
Compare the above with
.See ./logic_31_Families_of_Sets.html#partitions
, set theoretic model of partitions.
()|-(Bayes): for p:partition(n), q,h:$wff, P:=map[i:1..n]((q/(p(i) and h))*(p(i)/h)) (for all i:1..n, p(i)/(q and h)=P(i)/+P).
Yudkowsky_explains_Bayes_theorem::=http://yudkowsky.net/rational/bayes.
A useful function for calculating probabilities is one I name `norm`; it normalizes a tuple:
For X:Finite_set, P:X>->Real & Positive, norm(P) ::= map[x:X](P(x)/(+P)).
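A minimal Python sketch of `norm` and the Bayes calculation above. The fair-versus-two-headed coin example is my own illustration, not from the source:

```python
def norm(P):
    """Normalize a list of positive weights so they sum to 1."""
    total = sum(P)
    return [w / total for w in P]

def bayes(priors, likelihoods):
    """Posterior over a partition: normalize prior * likelihood."""
    return norm([p * l for p, l in zip(priors, likelihoods)])

# Hypothetical example: is a coin fair or two-headed, given we saw one head?
posterior = bayes([0.5, 0.5], [0.5, 1.0])
print(posterior)  # [0.333..., 0.666...]
```

The same `bayes` step, applied once per observation, drives both worked examples below.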
For example, see $Columbus_and_the_Birds and $Software_testing.
Columbus_and_the_Birds::=following
.Net
This example is quoted in Polya's excellent book "How to Solve It"
(see
.See ./logic_20_Proofs100.html#Heuristic Syllogism
).
.Box
If we are approaching land, we often see birds.
Now we see birds.
Therefore, probably, we are approaching land.
.Close.Box
So, we have something like this:
.Table q\p Near Land Far from Land
.Row See Birds 0.7 0.1
.Row No Birds 0.3 0.9
.Close.Table
Suppose that we think that there is a 20% chance of being near land...
.Table Near Land Far from Land Total
.Row 0.2 0.8 1.0
.Close.Table
Then we see birds, then $Bayes suggests that we calculate
.Table - Near Land Far from Land Total
.Row Prior 0.2 0.8 1.0
.Row P[i] 0.7*0.2=0.14 0.1*0.8=0.08 0.22
.Row Normalize 0.14/0.22=0.636... 0.08/0.22=0.364... 1.0
.Row Post 0.636... 0.364... 1.0
.Close.Table
So, having seen birds, our belief that we are near land should roughly treble.
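The calculation in the tables can be checked with a short Python sketch (variable names are my own):

```python
# Columbus and the birds: one Bayes update using the numbers in the tables.
priors    = [0.2, 0.8]   # [near land, far from land]
see_birds = [0.7, 0.1]   # Pr(see birds) in each case

P = [pr * lk for pr, lk in zip(priors, see_birds)]  # [0.14, 0.08]
total = sum(P)                                      # 0.22
posterior = [w / total for w in P]
print(posterior)  # approximately [0.636, 0.364]
```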
.Close.Net
Software_testing::=following
.Net
Suppose we have a piece of software (h) that may be correct (p) or may
have bugs (not p). We test the software and it may pass the test (q) or it
may fail. Now the probability of the test passing depends on whether
the software has bugs:
.Table - p not p
.Row q 1 0.9
.Row not q 0 0.1
.Close.Table
We are pretty good at writing software so we put p/h = 0.9 and not p/h = 0.1.
This means that we have a 0.9*1 + 0.1*0.9 = 0.99 chance of the test succeeding.
Now if the test succeeds it should change the weight p/h by $Bayes
p/q h = (p/h * q/p h)/S,
not p/q h = (not p/h * q/not p h)/S,
S= (p/h * q/p h) + (not p/h * q/not p h).
So
p/q h = .9/S,
not p/q h = 0.09/S,
S=.99.
So
p/q h = .90909...
not p/q h = .090909...
This means a successful test should improve our confidence that the software is
correct by a small amount -- from 90% to about 91%. Of course, we cannot repeat the
same test and get a similar improvement, because the duplicated test is not
independent. We might make the case that a series of random tests were
independent, and so our confidence in the software slowly tends towards 1.
If you do the math, repeated independent tests tend towards convincing
us that the software is perfect, but there is always a small doubt left behind.
Worse, how do we know that the tests are independent?
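A sketch in Python of the repeated updates, assuming every test passes and the tests really are independent:

```python
# Pr(pass | correct) = 1.0, Pr(pass | buggy) = 0.9, prior Pr(correct) = 0.9,
# matching the numbers used in the text.
p_correct = 0.9
for test_number in range(30):
    s = p_correct * 1.0 + (1 - p_correct) * 0.9  # chance this test passes
    p_correct = p_correct * 1.0 / s              # Bayes update on a pass
print(p_correct)  # creeps towards 1 but never reaches it
```

After the first pass this gives 0.90909..., as computed above; after thirty passes it is above 0.99 but still short of certainty.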
.Close.Net
For more complex cases see BBN in my bibliography.
For h, Independent(h) ::@($wff, $wff)= rel[p,q]((p and q)/h = p/h * q/h ).
For h, disjoint(h) ::@($wff, $wff)= rel[p,q]((p and q)/h = 0 ).
.Hole
}=::Georgian_Probability.
. Measure Theory
MEASURE::=Net{
Space:Sets=given,
.See ./logic_31_Families_of_Sets.html
Set::@@Space=given, the measurable subsets of Space.
|- $Space and {} in $Set.
|- For A,B:$Set, A|B and A&B in $Set.
measure::$Set->[0..1]=given.
Notice that not all subsets of the space are given a measure. Doing that
leads to some paradoxes. Instead we have a `$Set` of measurable subspaces.
|- For A,B:$Set, measure( A | B )= measure(A) + measure(B) - measure(A & B).
|- measure(Space)=1.0.
|- measure({})=0.0.
discrete::@=(Set=@Space).
continuous::@=(for all a:Space(measure({a}) = 0.0) ).
For A,B:$Set, A independent B::@= ( measure(A & B) = measure(A) * measure(B) ).
.Hole
}=::MEASURE.
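A minimal sketch of the MEASURE net in Python, using a normalized counting measure on a finite space. In this toy model (an assumption for illustration) every subset is measurable, which is safe because the space is finite:

```python
from fractions import Fraction

space = frozenset(range(6))  # the six faces of a die

def measure(A):
    """Normalized counting measure: fraction of the space in A."""
    return Fraction(len(A & space), len(space))

A = frozenset({0, 1, 2})
B = frozenset({2, 3})

print(measure(space))  # 1
print(measure(A | B))  # 2/3
print(measure(A) + measure(B) - measure(A & B))  # 2/3, matching the axiom
```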
. Random Variables
Notation -- using the power of MATHS to express functions without special variables...
Teaching tends to present all random variables as real variables. All we need
in MATHS,
however, is a set and a $MEASURE on it -- mostly, until I get to PDFs...
random_variable::=$ Net{
Values::Sets=given.
Range::=Values.
Set::=Values.
measure::$ $MEASURE(Set)=given.
discrete::@.
continuous::@= not $discrete.
|- if continuous then metric_space.
.Hole
}=::random_variable.
For X:random_variable, DF(X) ::measure(X)=`Probability Distribution Function`.
discrete_random_variable::=random_variable(discrete=true).
|- For X:discrete_random_variable, p:@(X.Set), Pr( p(X) ) = measure({x:Set(X)|| p(x)}).
|- For X:discrete_random_variable, op:{and, or, ...}, Pr( p(X) op q(X) ) = measure({x:Set(X)|| p(x) op q(x)} ).
|- For X:discrete_random_variable, Pr( p(X) || h(X) ) = measure({x:Set(X)|| p(x) and h(x)})/measure({x:Set(X)|| h(x)}).
continuous_random_variable::=random_variable(discrete=false).
For X:continuous_random_variable,
PDF(X) ::measure(X)=`Probability Density Function`.
??{ Not easy to invent a generalization of the elementary case... and
the library is closed...
PDF(X) is a limit, at X=x, of the measure of a small ball surrounding x divided by the size of that ball.
.Hole
}
real_random_variable::=random_variable with Set=Real.
For X:real_random_variable, PDF(X) = D DF(X).
.Hole
. Expected value
Expected values are very useful. A typical example: if you have a 50%
chance of winning $100 in a bet vs a 50% chance of losing $90 then your
expected value is
100*0.5 - 90*0.5 = 5.
So the bet is worth making...
I will use the notation `expect(v)` rather than the more common `E[v]`.
If you have a discrete random variable with distribution `p` and a function `v`
that can be applied to the random variable and returns a real value then
expect(v) ::= +(v*p).
If you have a continuous random variable with density `p` and a function `v`
that can be applied to the random variable and returns a real value then
expect(v) ::= integrate(v*p).
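The discrete formula `expect(v) ::= +(v*p)` has a direct Python sketch; here it reproduces the bet from the start of this section (function name `expect` chosen to match the text):

```python
def expect(v, p):
    """Discrete expectation: sum of value times probability."""
    return sum(vi * pi for vi, pi in zip(v, p))

# The bet in the text: 50% chance of +$100, 50% chance of -$90.
print(expect([100, -90], [0.5, 0.5]))  # 5.0
```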
.Key Expectations
have properties and can (probably) be used as an alternate basis for
a theory of probability.
EXPECTATION::=following
.Net
X:Sets.
...
Values::@(X->Real).
|- for a:Real, X+>a in Values.
For v:Values, expect(v)::Real.
For u,v,w: Values, a:Real.
|-expect(u+v) = expect(u) + expect(v).
|-expect(a * v) = a*expect(v).
|-expect(a) = a.
The probability of a set A:@X can be expressed as an expectation
as long as the map `if A then 1 else 0 fi` is in Values:
probability(A)::= expect(A+>1|(X~A)+>0).
.Hole
.Close.Net
. Mean value
The mean value of a random variable is its expectation
\mu ::= expect ( (_) ).
. Moments
For r:1.., the r'th moment is the expected value of the r'th power of a random variable
(where it exists):
For r:Nat, \mu[r] ::= expect( (_)^r ).
. Population Standard deviation and Variance
These measure the spread of the distribution.
variance ::= \mu[2] - \mu[1]^2.
sd::=\surd(variance).
standard_deviation::=sd.
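The moment formulas above can be sketched for a fair die (my own example, not from the source):

```python
import math

# Distribution of a fair die.
xs = [1, 2, 3, 4, 5, 6]
p  = [1 / 6] * 6

def moment(r):
    """The r'th moment: expect((_)^r) over the distribution."""
    return sum((x ** r) * pi for x, pi in zip(xs, p))

mean = moment(1)                   # 3.5
variance = moment(2) - mean ** 2   # mu[2] - mu[1]^2
sd = math.sqrt(variance)
print(mean, variance, sd)
```

For the die this gives a variance of 35/12, roughly 2.917, and a standard deviation of about 1.71.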
. Entropy
A measure of the information conveyed by a typical event.
H = expect (- lg(p) ), where p is the distribution or
probability density function.
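For a discrete distribution, `expect(-lg(p))` can be sketched directly (the examples are my own):

```python
import math

def entropy(p):
    """Shannon entropy in bits: expect(-lg(p)) over the distribution."""
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))  # 1.0 -- a fair coin toss conveys one bit
print(entropy([0.25] * 4))  # 2.0
```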
. Bayesian
P(p) ::=`Degree of belief associated with proposition p`.
.Hole
. Frequentist
F(p) ::=`Frequency with which an event turns up`.
.Hole
.Close Theories
.Open Classic Distributions
.Hole
.Open Discrete classics
.Hole
. Uniform
For n:Nat, uniform::1..n->probability= 1/n.
.Hole
. Binomial
For n:Nat, p:probability, q:=1-p, B::0..n->probability= fun[r](C(r,n)*p^r * q^(n-r)).
Where
C::=`Number of combinations of (2nd) things taken (1st) at a time`.
C(r,n) ::= n!/(r! * (n-r)!).
Where
n!::=`factorial n`.
n!= n*(n-1)*(n-2)* ... * 2*1.
0!=1.
For n>0, n!= (n-1)!*n.
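The binomial formula and `C` translate directly into Python (a sketch; four fair-coin trials is my own example):

```python
import math

def C(r, n):
    """Combinations: n!/(r! * (n-r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

def binomial(n, p):
    """The whole distribution B(0..n) for n trials with success chance p."""
    q = 1 - p
    return [C(r, n) * p ** r * q ** (n - r) for r in range(n + 1)]

B = binomial(4, 0.5)
print(B)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(B))  # 1.0
```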
.Hole
. Negative Binomial
.Hole
. Poisson
For m:Real, Poisson::Nat0->probability = map[r]( exp(-m)*m^r/r! ).
P::=Poisson.
Poisson(r) ::=`the probability of r events occurring when they are very unlikely to occur at a particular time or place, but there are a lot of times or places when they could occur`.
The classic and delightful example being the number of Prussian officers
kicked to death by horses in regiments per year. To some extent the number
of goals in British professional soccer games is Poisson as well. Also the
number of mistakes I make when typing (as measured in 1967) was Poisson.
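A sketch of the Poisson formula in Python. The mean of 1.0 below is an arbitrary illustration, not a figure from the horse-kick data:

```python
import math

def poisson(m, r):
    """Probability of r events when the mean number of events is m."""
    return math.exp(-m) * m ** r / math.factorial(r)

# With a mean of 1 event per interval, the chance of none is exp(-1).
print(poisson(1.0, 0))  # 0.3678...
```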
.Hole
. Geometric
For p:probability, q:=1-p, G::Nat0->probability= p^(_)*q.
.Hole
. Hyper-Geometric
.Hole
.Close Discrete classics
.Open Continuous classics
. Pareto
.Hole
The 80-20 law
For x:[\gamma..], Pareto(x)::= 1-(\gamma/x)**\beta.
. Weibull
.Hole
Weibull(x)::=1 -exp(-(x/\gamma)**\beta).
For Time t, defect_rate(t)::= N* a* _ * t**(a-1) * exp(_ * t**a).
. Exponential
p(x) is proportional to exp(-x).
.Hole
. Gaussian or Normal
When many small independent deviations are added up.
Standard Gaussian has a mean of zero and standard deviation of 1 and the PDF (symbolized by \phi) is
\phi(x) = exp(-x^2/2) / \surd(2*\pi).
In general, if the mean is `m` and the standard deviation is \sigma then the PDF is
\phi( (x-m)/\sigma )/\sigma.
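Both cases translate directly into Python (a sketch; the names `phi` and `gaussian_pdf` are my own):

```python
import math

def phi(x):
    """Standard Gaussian PDF: mean 0, standard deviation 1."""
    return math.exp(-x ** 2 / 2) / math.sqrt(2 * math.pi)

def gaussian_pdf(x, m, sigma):
    """General Gaussian PDF: phi((x - m)/sigma)/sigma."""
    return phi((x - m) / sigma) / sigma

print(phi(0))  # 0.3989...
```

Note the division by sigma in the general case, which keeps the total area under the curve equal to 1.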
.Hole
. Log-Normal
p(x) = if(x<=0, 0, (1/(x*s*\surd(2*\pi)))*exp(-(ln(x)-m)**2/(2*s**2))), where m and s are the mean and standard deviation of ln(x).
.Hole
. \Chi\_square
Distribution of the squared distance from 0 of a tuple of independent standard normal variates.
.Hole
. Fisher's F
Ratio of two \Chi^2 variates.
.Hole
. Student's t
.Hole
.Close Continuous classics
.Close Classic Distributions
.Open Applications
.Hole
.Close Applications
. Glossary
wff::=expression(@), `well formed formula`.
.Close Probability