Consider an experiment whose sample space is S. For each event E of the sample space S, we assume that a number P(E) is defined and satisfies the following three conditions:

(i) 0 ≤ P(E) ≤ 1.

(ii) P(S) = 1.

(iii) For any sequence of events E1, E2, . . . that are mutually exclusive, that is, events for which EnEm = Ø when n ≠ m, then

P(E1 ∪ E2 ∪ · · ·) = P(E1) + P(E2) + · · ·

We refer to P(E) as the probability of the event E.
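For a finite sample space, the three conditions can be checked mechanically. The following is a minimal sketch, assuming a fair six-sided die; the helper names `make_probability`, `all_events`, and `satisfies_conditions` are hypothetical and introduced only for illustration. Because the space is finite, condition (iii) is checked in its pairwise form, which is enough to imply additivity over any finite collection of mutually exclusive events.

```python
from fractions import Fraction
from itertools import combinations


def make_probability(weights):
    """Build P(event) on a finite sample space from nonnegative outcome weights."""
    total = sum(weights.values())

    def P(event):
        # Probability of an event is the sum of its outcomes' normalized weights.
        return sum((Fraction(weights[o], total) for o in event), Fraction(0))

    return P


def all_events(sample_space):
    """Every subset of the sample space, as frozensets."""
    items = list(sample_space)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def satisfies_conditions(P, sample_space):
    events = all_events(sample_space)
    cond_i = all(0 <= P(E) <= 1 for E in events)      # condition (i)
    cond_ii = P(sample_space) == 1                    # condition (ii)
    cond_iii = all(P(E | F) == P(E) + P(F)            # condition (iii), pairwise
                   for E in events for F in events if not (E & F))
    return cond_i and cond_ii and cond_iii


# Assumption: a fair die, so every face gets the same weight.
die = {face: 1 for face in range(1, 7)}
P = make_probability(die)
print(satisfies_conditions(P, frozenset(die)))  # True
```

Exact fractions are used instead of floats so that the equality tests in conditions (ii) and (iii) are not disturbed by rounding.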

Example 1.1 In the coin tossing example, if we assume that a head is equally likely to appear as a tail, then we would have

P({H}) = P({T}) = 1/2

On the other hand, if we had a biased coin and felt that a head was twice as likely to appear as a tail, then we would have

P({H}) = 2/3, P({T}) = 1/3

Example 1.2 In the die tossing example, if we supposed that all six numbers were equally likely to appear, then we would have

P({1}) = P({2}) = P({3}) = P({4}) = P({5}) = P({6}) = 1/6

Remark We have chosen to give a rather formal definition of probabilities as being functions defined on the events of a sample space. However, it turns out that these probabilities have a nice intuitive property. Namely, if our experiment is repeated over and over again then (with probability 1) the long-run proportion of repetitions in which event E occurs will be P(E).
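The long-run frequency property in the remark can be illustrated, though of course not proved, by simulation. The sketch below assumes the biased coin in which a head is twice as likely as a tail; the function `relative_frequency` and the fixed seed are illustrative choices, not part of the text.

```python
import random


def relative_frequency(event, outcomes, weights, trials, seed=0):
    """Fraction of simulated trials whose outcome lies in `event`."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = rng.choices(outcomes, weights=weights, k=trials)
    return sum(1 for d in draws if d in event) / trials


# Assumption: a head is twice as likely as a tail, so P({H}) = 2/3.
outcomes = ["H", "T"]
weights = [2, 1]

for trials in (100, 10_000, 100_000):
    print(trials, relative_frequency({"H"}, outcomes, weights, trials))
```

As the number of trials grows, the printed proportions should settle near 2/3, although any single finite run can deviate from it.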

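Since the events E and E^c (the complement of E) are always mutually exclusive and since E ∪ E^c = S, we have by (ii) and (iii) that

1 = P(S) = P(E ∪ E^c) = P(E) + P(E^c)

or

P(E^c) = 1 − P(E) (1.1)

In words, Equation (1.1) states that the probability that an event does not occur is one minus the probability that it does occur.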

We shall now derive a formula for P(E ∪ F), the probability of all outcomes either in E or in F. To do so, consider P(E) + P(F), which is the probability of all outcomes in E plus the probability of all outcomes in F. Since any outcome that is in both E and F will be counted twice in P(E) + P(F) and only once in P(E ∪ F), we must have

P(E) + P(F) = P(E ∪ F) + P(EF)

or equivalently

P(E ∪ F) = P(E) + P(F) − P(EF) (1.2)

Note that when E and F are mutually exclusive (that is, when EF = Ø), then Equation (1.2) states that

P(E ∪ F) = P(E) + P(F) − P(Ø) = P(E) + P(F)

a result which also follows from condition (iii). (Why is P(Ø) = 0?)

Example 1.3 Suppose that we toss two coins, and suppose that we assume that each of the four outcomes in the sample space

S = {(H, H), (H, T), (T, H), (T, T)}

is equally likely and hence has probability 1/4. Let

E = {(H, H), (H, T)} and F = {(H, H), (T, H)}

That is, E is the event that the first coin falls heads, and F is the event that the second coin falls heads.

By Equation (1.2) we have that P(E ∪ F), the probability that either the first or the second coin falls heads, is given by

P(E ∪ F) = P(E) + P(F) − P(EF) = 1/2 + 1/2 − P({(H, H)}) = 1 − 1/4 = 3/4
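The two-coin computation can be cross-checked by direct enumeration. This is a small sketch, assuming the four outcomes are equally likely as in the example; the function name `P` mirrors the text's notation, and the set-comprehension definitions of `E` and `F` are one possible encoding of the two events.

```python
from fractions import Fraction
from itertools import product

# Sample space of two coin tosses: ('H','H'), ('H','T'), ('T','H'), ('T','T').
S = set(product("HT", repeat=2))


def P(event):
    """Probability under the equally-likely assumption: |event| / |S|."""
    return Fraction(len(event), len(S))


E = {s for s in S if s[0] == "H"}  # first coin falls heads
F = {s for s in S if s[1] == "H"}  # second coin falls heads

direct = P(E | F)                        # count the outcomes in the union
via_formula = P(E) + P(F) - P(E & F)     # right-hand side of Equation (1.2)
print(direct, via_formula)  # 3/4 3/4
```

Both routes give 3/4, since the union consists of the three outcomes (H, H), (H, T), and (T, H).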

We may also calculate the probability that at least one of the three events E, F, or G occurs. This is done as follows:

P(E ∪ F ∪ G) = P((E ∪ F) ∪ G)

which by Equation (1.2) equals

P(E ∪ F) + P(G) − P((E ∪ F)G)

Now we leave it for you to show that the events (E ∪ F )G and EG ∪ FG are equivalent, and hence the preceding equals

P(E ∪ F ∪ G) = P(E) + P(F) − P(EF) + P(G) − P(EG ∪ FG)

= P(E) + P(F) − P(EF) + P(G) − P(EG) − P(FG) + P(EGFG)

= P(E) + P(F) + P(G) − P(EF) − P(EG) − P(FG) + P(EFG) (1.3)

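In fact, it can be shown by induction that, for any n events E1, E2, E3, . . . , En,

P(E1 ∪ E2 ∪ · · · ∪ En) = Σ_{i} P(Ei) − Σ_{i<j} P(EiEj) + Σ_{i<j<k} P(EiEjEk) − · · · + (−1)^(n+1) P(E1E2 · · · En) (1.4)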

In words, Equation (1.4), known as the inclusion-exclusion identity, states that the probability of the union of n events equals the sum of the probabilities of these events taken one at a time minus the sum of the probabilities of these events taken two at a time plus the sum of the probabilities of these events taken three at a time, and so on.
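The verbal description of the inclusion-exclusion identity translates fairly directly into code. The sketch below assumes a finite sample space with equally likely outcomes (a fair die) and events represented as frozensets; `union_probability` and the three sample events are hypothetical names chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations

SAMPLE_SPACE = frozenset(range(1, 7))  # assumption: one roll of a fair die


def P(event):
    """Probability under the equally-likely assumption."""
    return Fraction(len(event), len(SAMPLE_SPACE))


def union_probability(events):
    """Probability of a union computed by inclusion-exclusion."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = 1 if k % 2 == 1 else -1  # + for odd k, - for even k
        for group in combinations(events, k):
            # Intersection of the k events taken together.
            total += sign * P(frozenset.intersection(*group))
    return total


E1 = frozenset({2, 4, 6})   # roll is even
E2 = frozenset({1, 2, 3})   # roll is at most 3
E3 = frozenset({3, 6})      # roll is a multiple of 3

direct = P(E1 | E2 | E3)
via_formula = union_probability([E1, E2, E3])
print(direct, via_formula)  # 5/6 5/6
```

Here the single-event terms sum to 4/3, the three pairwise intersections contribute 1/6 each, and the triple intersection is empty, giving 4/3 − 1/2 + 0 = 5/6. The loop visits every nonempty subcollection of the events, so the cost grows as 2^n; the sketch is meant for checking small cases, not for large n.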