The branch of mathematics known as probability theory
is a central component of statistics. In fact, it is not far from the truth to
say that statistics is simply an extension, or branch, of probability theory. Rogerson (2001: 31-9) gives several examples of how
probability theory has useful planning applications above and beyond statistics. Given its centrality to statistics and
its usefulness to planners beyond statistics, the lack of emphasis on
probability in planning education is odd. After all, probability theory is the
mathematics of uncertainty, planning is primarily about the future, and what is
more uncertain than the future?
The purpose of the discussion here is to give you a deeper
glimpse into the mathematics of probability. Although we will discuss these mathematics
here, the goal is not so much for you to learn them from this discussion (CyberStats and Rogerson
are more than adequate for that) as to have you understand the nature of the reasoning that underlies probability
theory and mathematics in general. In other words, the goal is to get you to think in a very fundamental way about
exactly what it is we do when we apply the mathematics of probability.
The Axiomatic Method
Typically, when we learn mathematics, we learn a series of rules. We start with addition and multiplication tables and work
our way up to solving simultaneous equations, calculating the areas of
geometric figures, finding the slopes of lines, etc. Students who do not go on
to study advanced mathematics, and even some who do, generally are not asked to reflect on the question, where do the rules come from?
For example, CyberStats Unit
B-2 and Chapter 2 in Rogerson (p.
24) both give such rules pertaining to probability. Where do these rules come
from? At least part of the answer requires an understanding of the method of
mathematics. Modern mathematics uses a formal approach to reasoning called the axiomatic
method. This method aims to construct formal, logical systems
based solely on deductive reasoning. Notice
that here I said "systems," because the method does not lead to one
system but rather an infinite number of systems. Sometimes they are unrelated
to each other, as in the case of the systems in probability theory and those in
geometry; other times they are logically incompatible alternatives, as in the
case of Euclidean and non-Euclidean geometry. This opens up the possibility of multiple, mutually contradictory systems
that are all rigorously true on their own terms, that one can apply equally
well to a common domain of reality and scientific study, and yet that cannot be
used to disprove each other. Perhaps
the most famous result in this vein is Gödel's
incompleteness theorem, which shows that any consistent axiomatic system rich
enough to express basic arithmetic contains statements that it can neither prove
nor disprove, so no single all-encompassing system can settle all of mathematics.
The logical starting point in the axiomatic method is to identify a set of undefined terms known as primitives.
Then, using these primitives as a language, an axiomatic system proposes a set
of arbitrary propositions,
called axioms. Since they are
arbitrary assumptions, the number of axioms will ideally be kept to a minimum.
Then, using these axioms in combination with a few basic rules of
logic (i.e., the axioms of logic in general), the entire logical system is deduced. Note, however, that this
is purely the logical structure of the system. Sometimes mathematicians will
start with a proposition or domain of mathematical reasoning they wish to
establish and work backwards to identify the primitives and axioms that would
be sufficient to deduce the proposition or establish the domain.
Consider the terms and notation introduced in CyberStats Unit B-1
(Basics 1). At the start of this
section, we learn that an event is "any subset of the sample space."
Four paragraphs later, we learn
that the sample space is "the set of all possible simple outcomes that may
occur." Now if we go back to the original paragraph and look up the definition of an
outcome, we learn that an outcome is "the result of an event."
Therefore, our definition of event depends on the sample space, our definition
of sample space depends on the definition of outcome, and the latter depends on
the definition of event. Can you
say "circular reasoning," boys and girls?
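The circular chain of definitions described above can be traced mechanically. The following is a minimal sketch in Python, not anything taken from CyberStats: the `definitions` mapping is a simplified reading of the three quoted definitions, and the `grounded` mapping and the function name `find_cycle` are hypothetical, introduced only for illustration.

```python
# Each term maps to the terms its quoted definition relies on.
# This mapping is a simplified reading of the definitions quoted above.
definitions = {
    "event": ["sample space"],
    "sample space": ["outcome"],
    "outcome": ["event"],
}

# Hypothetical alternative: "outcome" is left undefined (a primitive),
# so it has no entry of its own in the mapping.
grounded = {
    "experiment": ["outcome"],
    "sample space": ["outcome", "experiment"],
}


def find_cycle(term, definitions, path=()):
    """Follow definitions from `term`; return the chain of terms if it loops back."""
    if term in path:
        # We have returned to a term already on the path: a circular definition.
        return list(path[path.index(term):]) + [term]
    for dependency in definitions.get(term, []):
        cycle = find_cycle(dependency, definitions, path + (term,))
        if cycle:
            return cycle
    return None  # Every chain ends in a term with no definition (a primitive).


cycle = find_cycle("event", definitions)
print(" -> ".join(cycle))  # event -> sample space -> outcome -> event
print(find_cycle("sample space", grounded))  # None
```

The second call finds no loop because every chain of definitions ends in a term that is deliberately left undefined, which is the role primitives play in the axiomatic method.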
The Russian mathematician Andrey Kolmogorov
first proposed the axioms that are the starting point for the logical system
that is modern probability theory. This system avoids the circularity we find
in CyberStats. Before
presenting Kolmogorov's axioms, I want to point out an unavoidable problem in
doing so. Since an axiomatic system is arbitrary, we can either use terms
borrowed from everyday usage to develop it, or invent our own terminology. Both
strategies have their pros and cons. The former strategy, borrowing from
everyday usage, has the advantage of
conveying an intuitive sense of the meaning
of a term. As we will see (Benton and Craib 2001:
Ch. 5-6, 8, and 10),
such borrowing is unavoidable. We must be careful to recognize that everyday
usage depends on our experience and ideas, which are always situated in a
particular historical, geographic, and cultural context, but mathematicians
deliberately use a method designed to (or just claimed to?) eliminate such
influences and rely purely on formal logic. One must not confuse mathematics’
borrowing of familiar language with the familiar things we refer to with that
language. For example, if a mathematician uses the word “experiment” as an
undefined primitive, the unwary student might erroneously equate this
essentially meaningless term with laboratories, test tubes, and so on. The
second strategy, inventing our own terminology, has the advantage of avoiding this
pitfall but the disadvantage of introducing a technical and difficult language (difficult
because it is unfamiliar and defined only in terms of a formal logical system).
See, for example, the Wikipedia
(Wikipedia 2005) presentation of Kolmogorov's
axioms. The discussion below uses the first strategy, but you must be careful
not to read in any other meanings than those introduced here.
Probability Axioms
Note that the following discussion uses terminology from set theory and number theory, themselves
branches of mathematics that are axiomatic. See, for example, axiomatic set theory.
Hence, probability theory is technically an extension of set and number theory.
Note also that the following uses the notation introduced in CyberStats Units B-1
and B-2, with the minor modification that it uses square brackets instead of
parentheses to indicate probabilities.
Primitives:
Outcome
Definitions:
Experiment ≡ something that yields a single
outcome from a set of outcomes
Sample space ≡ the set of all outcomes
associated with an experiment
Using the notation from CyberStats, we say, S ≡ {outcomes}
Axioms:
Axiom 1:
Let A be a symbol standing for any arbitrary outcome. In mathematical
notation, A ∈ S. Define P[A] as “the
probability of A.” (In general, we will use the notation that P[·]
stands for “the probability of” whatever is in the brackets.) Then,
0 ≤ P[A] ≤ 1.
This axiom says that probabilities can only range from zero to one.
Axiom 2:
P[S] = 1.
This axiom says that the probability that the experiment yields some outcome in the sample space is one.
Axiom 3:
Let A and B be mutually exclusive members of S. Then
P[A ∪ B] = P[A] + P[B].
This
says that if we have two completely distinct outcomes, the probability of both
of them equals the sum of their individual probabilities. Here, since we assume
A and B are mutually exclusive, “both” of them does not mean they are
combined together. Rather, it means one or the other, A or B.
You
may notice that Axioms 1 and 3 are identical to Rules 1 and 3 in CyberStats Unit B-2.
(With regard to Rule 3, Axiom 3 assumes mutual exclusivity, so P[A and B] = 0 by assumption.)
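To make the three axioms concrete, here is a minimal sketch in Python that checks them on a small finite example. Everything specific in it is an assumption made for illustration: the experiment (one roll of a fair six-sided die), the names `weights`, `P`, and `all_events`, and the choice to treat A and B as events, that is, subsets of S, so that the union in Axiom 3 is well defined.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical experiment: one roll of a fair six-sided die.
S = frozenset(range(1, 7))
weights = {outcome: Fraction(1, 6) for outcome in S}


def P(event):
    """Probability of an event (a subset of S): the sum of its outcomes' weights."""
    return sum((weights[o] for o in event), Fraction(0))


def all_events(sample_space):
    """Every subset of the sample space, including the empty set and S itself."""
    items = sorted(sample_space)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


events = all_events(S)

# Axiom 1: every probability lies between zero and one.
axiom1 = all(0 <= P(A) <= 1 for A in events)

# Axiom 2: the whole sample space has probability one.
axiom2 = P(S) == 1

# Axiom 3: for mutually exclusive A and B, P[A or B] = P[A] + P[B].
axiom3 = all(
    P(A | B) == P(A) + P(B)
    for A in events
    for B in events
    if not (A & B)  # mutually exclusive: no shared outcomes
)

print(axiom1, axiom2, axiom3)  # True True True
```

Exact fractions are used instead of floating-point numbers so that the equality tests in Axioms 2 and 3 are not disturbed by rounding. A check like this only confirms that one particular assignment of probabilities satisfies the axioms; it proves nothing about the axioms themselves, which are assumed rather than demonstrated.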
Now to underscore the
arbitrary nature of axiomatic systems, since I am the instructor I will add to
these three basic axioms by decreeing a fourth:
Axiom 4:
Taxes are purple.
Believe it or not,
everything else in probability theory can be logically derived from these four
(OK three) axioms!
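As one small illustration of such a derivation, the sketch below deduces a standard result, the complement rule, from Axioms 2 and 3. It rests on assumptions not stated above: A is treated as an event (a subset of S), and the symbol A^c, introduced here, stands for the set of all outcomes in S that are not in A.

```latex
\begin{align*}
A \cup A^{c} &= S, \qquad A \cap A^{c} = \emptyset
  && \text{(definition of the complement)} \\
P[A \cup A^{c}] &= P[A] + P[A^{c}]
  && \text{(Axiom 3, since $A$ and $A^{c}$ are mutually exclusive)} \\
P[A \cup A^{c}] &= P[S] = 1
  && \text{(Axiom 2)} \\
P[A^{c}] &= 1 - P[A]
  && \text{(combining the two preceding lines)}
\end{align*}
```

Nothing beyond the axioms and the set-theoretic definition of the complement is used, which is the sense in which the result is deduced rather than observed.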
Further Reading
Please read Wikipedia’s entry on the axiomatic
system (Wikipedia 2005). You will need to know this
to do a good job on Assignment 3.