A key concept in the field of pattern recognition is that of uncertainty. It arises both through noise on measurements and through the finite size of data sets. Probability theory provides a consistent framework for the quantification and manipulation of uncertainty and forms one of the central foundations for pattern recognition. When combined with decision theory, discussed in Section 1.5, it allows us to make optimal predictions given all the information available to us, even though that information may be incomplete or ambiguous.
We will introduce the basic concepts of probability theory by considering a simple example. Imagine we have two boxes, one red and one blue. The red box contains 2 apples and 6 oranges, and the blue box contains 3 apples and 1 orange. This is illustrated in Figure 1.9. Now suppose we randomly pick one of the boxes and from that box randomly select an item of fruit. Having observed which sort of fruit it is, we replace it in the box from which it came. We could imagine repeating this process many times. Let us suppose that in so doing we pick the red box 40% of the time and the blue box 60% of the time, and that when we remove an item of fruit from a box we are equally likely to select any of the pieces of fruit in that box.
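The procedure described above can be sketched in Python. This is only an illustration, not part of the original text: the names BOXES, BOX_PROBS and draw_fruit are hypothetical, and the standard-library random module is assumed as the source of randomness.

```python
import random

# Box contents from the example: the red box holds 2 apples and 6 oranges,
# the blue box holds 3 apples and 1 orange.
BOXES = {
    "r": ["a"] * 2 + ["o"] * 6,
    "b": ["a"] * 3 + ["o"] * 1,
}
# Selection probabilities stated in the example: red 40%, blue 60%.
BOX_PROBS = {"r": 0.4, "b": 0.6}


def draw_fruit(rng):
    """Run one trial: pick a box, then pick a fruit uniformly from that box."""
    box = rng.choices(list(BOX_PROBS), weights=list(BOX_PROBS.values()))[0]
    # The list is never modified, which models putting the fruit back.
    fruit = rng.choice(BOXES[box])
    return box, fruit


rng = random.Random(0)  # fixed seed so that a run is repeatable
trials = [draw_fruit(rng) for _ in range(10)]
print(trials)
```

Each trial returns a pair such as ("b", "a"), meaning the blue box was chosen and an apple was drawn.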
In this example, the identity of the box that will be chosen is a random variable, which we shall denote by B. This random variable can take one of two possible values, namely r (corresponding to the red box) or b (corresponding to the blue box). Similarly, the identity of the fruit is also a random variable and will be denoted by F. It can take either of the values a (for apple) or o (for orange).
To begin with, we shall define the probability of an event to be the fraction of times that event occurs out of the total number of trials, in the limit that the total number of trials goes to infinity. Thus the probability of selecting the red box is 4/10 and the probability of selecting the blue box is 6/10. We write these probabilities as p(B = r) = 4/10 and p(B = b) = 6/10. Note that, by definition, probabilities must lie in the interval [0, 1]. Also, if the events are mutually exclusive and if they include all possible outcomes (for instance, in this example the box must be either red or blue), then we see that the probabilities for those events must sum to one.
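The frequency definition can be checked numerically. The following is a minimal sketch, assuming the box is chosen by a pseudo-random draw that comes up red with probability 0.4. The function name estimate_red_frequency and the chosen trial counts are hypothetical. A finite run only approximates the limit in the definition.

```python
import random


def estimate_red_frequency(num_trials, seed=0):
    """Return the fraction of trials in which the red box is picked."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so it falls below 0.4 with probability 0.4.
    red_count = sum(1 for _ in range(num_trials) if rng.random() < 0.4)
    return red_count / num_trials


for n in (100, 10_000, 1_000_000):
    p_red = estimate_red_frequency(n)
    # Red and blue are mutually exclusive and exhaustive, so the two
    # frequencies sum to one by construction.
    p_blue = 1.0 - p_red
    print(n, p_red, p_blue)
```

As the number of trials grows, the printed fraction for the red box should settle near 4/10 and the fraction for the blue box near 6/10.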
...