A key concept in the field of pattern recognition is that of uncertainty. It arises both through noise on measurements and through the finite size of data sets. Probability theory provides a consistent framework for the quantification and manipulation of uncertainty and forms one of the central foundations for pattern recognition. When combined with decision theory, discussed in Section 1.5, it allows us to make optimal predictions given all the information available to us, even though that information may be incomplete or ambiguous.
We will introduce the basic concepts of probability theory by considering a simple example. Imagine we have two boxes, one red and one blue, and in the red box we have 2 apples and 6 oranges, and in the blue box we have 3 apples and 1 orange. This is illustrated in Figure 1.9. Now suppose we randomly pick one of the boxes and from that box we randomly select an item of fruit, and having observed which sort of fruit it is we replace it in the box from which it came. We could imagine repeating this process many times. Let us suppose that in so doing we pick the red box 40% of the time and we pick the blue box 60% of the time, and that when we remove an item of fruit from a box we are equally likely to select any of the pieces of fruit in the box.
In this example, the identity of the box that will be chosen is a random variable, which we shall denote by B. This random variable can take one of two possible values, namely r (corresponding to the red box) or b (corresponding to the blue box). Similarly, the identity of the fruit is also a random variable and will be denoted by F. It can take either of the values a (for apple) or o (for orange).
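The experiment described above can be sketched as a short simulation. This is only an illustration under stated assumptions: the names `BOXES`, `run_trial`, and `simulate`, the seed, and the number of trials are not part of the original example. The labels r, b, a, and o follow the values of B and F introduced above.

```python
import random

# Box contents from the example; the dictionary layout is an illustrative choice.
BOXES = {
    "r": ["a"] * 2 + ["o"] * 6,  # red box: 2 apples, 6 oranges
    "b": ["a"] * 3 + ["o"] * 1,  # blue box: 3 apples, 1 orange
}

def run_trial(rng):
    """Pick a box (red 40%, blue 60%), then pick one fruit uniformly from it."""
    box = "r" if rng.random() < 0.4 else "b"
    # The fruit is put back after it is observed, so BOXES is never modified.
    fruit = rng.choice(BOXES[box])
    return box, fruit

def simulate(num_trials, seed=0):
    """Repeat the trial and count how often each (B, F) pair occurs."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(num_trials):
        outcome = run_trial(rng)
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts

num_trials = 100_000  # assumed value; any large number would do
counts = simulate(num_trials)
red_count = counts.get(("r", "a"), 0) + counts.get(("r", "o"), 0)
red_fraction = red_count / num_trials
print(f"fraction of trials with the red box: {red_fraction:.3f}")
```

With a large number of trials, the printed fraction should land close to the 40% stated in the example, although the exact value depends on the seed.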
To begin with, we shall define the probability of an event to be the fraction of times that event occurs out of the total number of trials, in the limit that the total number of trials goes to infinity. Thus the probability of selecting the red box is 4/10 and the probability of selecting the blue box is 6/10. We write these probabilities as p(B = r) = 4/10 and p(B = b) = 6/10. Note that, by definition, probabilities must lie in the interval [0, 1]. Also, if the events are mutually exclusive and if they include all possible outcomes (for instance, in this example the box must be either red or blue), then we see that the probabilities for those events must sum to one.
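The limiting-frequency definition can be written compactly as follows. The symbols N (total number of trials) and N_r, N_b (number of trials in which the red or blue box is chosen) are notation introduced here for illustration and do not appear in the text above.

```latex
\begin{align}
  p(B = r) &= \lim_{N \to \infty} \frac{N_r}{N} = \frac{4}{10}, &
  p(B = b) &= \lim_{N \to \infty} \frac{N_b}{N} = \frac{6}{10}, \\
  p(B = r) + p(B = b) &= \lim_{N \to \infty} \frac{N_r + N_b}{N} = 1 .
\end{align}
```

Because every trial selects exactly one of the two boxes, N_r + N_b = N for any N, which is why the two probabilities sum to one. Each fraction lies between 0 and 1 for the same reason, since 0 ≤ N_r ≤ N.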