A College Slacker's ISLR Notes (4) - Classification

In this chapter we discuss three of the most widely-used classifiers: logistic regression, linear discriminant analysis, and K-nearest neighbors.

Unfortunately, in general there is no natural way to convert a qualitative response variable with more than two levels into a quantitative response that is ready for linear regression.

Logistic Regression


To avoid this problem, we must model p(X) using a function that gives outputs between 0 and 1 for all values of X. Many functions meet this description. In logistic regression, we use the logistic function:

p(X) = e^(β0 + β1X) / (1 + e^(β0 + β1X))
We use a method called maximum likelihood to fit the model.

After a bit of manipulation, we find that:

p(X) / (1 − p(X)) = e^(β0 + β1X)

The quantity p(X) / (1 − p(X)) is called the odds.


Taking the logarithm of both sides gives:

log( p(X) / (1 − p(X)) ) = β0 + β1X

The left-hand side is called the log-odds or logit. (This is where logistic regression gets its name.)

We see that the logistic regression model has a logit that is linear in X.
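As a quick numeric illustration (my own made-up coefficients, not from the book): with β0 = −3 and β1 = 0.5, an observation with X = 4 has log-odds −1, odds e^(−1) ≈ 0.368, and probability about 0.269:

```python
import math

beta0, beta1 = -3.0, 0.5  # hypothetical coefficients, for illustration only
x = 4.0

log_odds = beta0 + beta1 * x   # the logit: beta0 + beta1 * X
odds = math.exp(log_odds)      # p(X) / (1 - p(X))
p = odds / (1 + odds)          # invert back to the logistic function

print(log_odds)  # -1.0
print(round(odds, 3), round(p, 3))  # 0.368 0.269
```

Note how the log-odds move linearly in X while the probability itself stays pinned inside (0, 1).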

Estimating the Regression Coefficients

Maximum likelihood: this intuition can be formalized using a mathematical equation called a likelihood function:

?(β0, β1) = ∏_{i: yi = 1} p(xi) · ∏_{i': yi' = 0} (1 − p(xi'))

The estimates β̂0 and β̂1 are chosen to maximize this likelihood function.

Maximum likelihood is a very general approach that is used to fit many of the non-linear models that we examine throughout this book. In the linear regression setting, the least squares approach is in fact a special case of maximum likelihood.

Making Predictions

Just plug the estimated coefficients β̂0 and β̂1 into the logistic function and compute p(X) for the new observation.
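A minimal sketch of fitting and predicting with scikit-learn, on synthetic data of my own (not the book's Default data). A very large C effectively disables sklearn's default regularization, so the fit approximates plain maximum likelihood:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic binary data: Pr(Y=1) rises with x under a true logistic model
x = rng.uniform(0, 10, size=500).reshape(-1, 1)
p_true = 1 / (1 + np.exp(-(-3 + 0.5 * x.ravel())))
y = rng.binomial(1, p_true)

# Huge C ~ no penalty, i.e. (approximately) plain maximum likelihood
model = LogisticRegression(C=1e6, max_iter=1000).fit(x, y)
print(model.intercept_[0], model.coef_[0, 0])  # estimates of beta0, beta1

# Making a prediction: plug a new X into the fitted logistic function
print(model.predict_proba([[4.0]])[0, 1])      # estimated Pr(Y=1 | X=4)
```

With 500 observations the estimates land close to the true (−3, 0.5), and the predicted probability at X = 4 is near the true value of about 0.27.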

Multiple Logistic Regression


p(X) = e^(β0 + β1X1 + ··· + βpXp) / (1 + e^(β0 + β1X1 + ··· + βpXp))

and we use the maximum likelihood method to estimate β0, β1, . . . , βp.


Logistic Regression for >2 Response Classes

The two-class logistic regression models discussed in the previous sections have multiple-class extensions, but in practice they tend not to be used all that often. One of the reasons is that the method we discuss in the next section, discriminant analysis, is popular for multiple-class classification. So we do not go into the details of multiple-class logistic regression here, but simply note that such an approach is possible, and that software for it is available in R. (Logistic regression can handle multi-class data too; it is just not the usual choice, since LDA is more commonly used for multi-class problems.)
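The book points to R, but as a sketch, scikit-learn's LogisticRegression fits the multinomial extension directly when the response has more than two classes (here the standard 3-class iris data, my choice of example):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)   # 3 response classes

# The multinomial (softmax) extension is fit by maximum likelihood
clf = LogisticRegression(max_iter=1000).fit(X, y)

# One probability per class, summing to 1
print(clf.predict_proba(X[:1]))
print(clf.predict(X[:1]))
```

Each row of `predict_proba` gives an estimated probability for each of the K classes, rather than a single Pr(Y = 1).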

Linear Discriminant Analysis

In this alternative approach, we model the distribution of the predictors X separately in each of the response classes (i.e. given Y), and then use Bayes' theorem to flip these around into estimates for Pr(Y = k | X = x). When these distributions are assumed to be normal, it turns out that the model is very similar in form to logistic regression.

Why do we need another method, when we have logistic regression?

• When the classes are well-separated, the parameter estimates for the logistic regression model are surprisingly unstable. Linear discriminant analysis does not suffer from this problem. (Look at the S-shaped curve of the logistic function and you can see why.)

• If n is small and the distribution of the predictors X is approximately normal in each of the classes, the linear discriminant model is again more stable than the logistic regression model. (When the training set is small and the predictors within each class follow a normal distribution, logistic regression does not work well.)

• As mentioned in Section 4.3.5, linear discriminant analysis is popular when we have more than two response classes. (multi-class)


Let πk represent the overall or prior probability that a randomly chosen observation comes from the kth class. In general, estimating πk is easy if we have a random sample of Ys from the population.

Let fk(x) ≡ Pr(X = x | Y = k) denote the density function of X for an observation that comes from the kth class. Then Bayes' theorem gives the posterior probability:

Pr(Y = k | X = x) = πk fk(x) / Σ_{l=1..K} πl fl(x)

Linear Discriminant Analysis for p = 1

Suppose we assume that fk(x) is normal or Gaussian. In the one-dimensional setting, the normal density takes the form:

fk(x) = (1 / (√(2π) σk)) exp( −(x − μk)² / (2σk²) )

Assuming a shared variance σ² = σ1² = ··· = σK², plugging this density into Bayes' theorem and taking logs shows that the Bayes classifier assigns an observation to the class for which δk(x) = x·μk/σ² − μk²/(2σ²) + log(πk) is largest; this is equation (4.13).

The linear discriminant analysis (LDA) method approximates the Bayes classifier by plugging estimates for πk, μk, and σ² into (4.13). In particular, the following estimates are used:

μ̂k = (1/nk) Σ_{i: yi = k} xi
σ̂² = (1/(n − K)) Σ_{k=1..K} Σ_{i: yi = k} (xi − μ̂k)²
π̂k = nk / n

The word linear in the classifier's name stems from the fact that the discriminant functions

δ̂k(x) = x · μ̂k/σ̂² − μ̂k²/(2σ̂²) + log(π̂k)

in (4.17) are linear functions of x (as opposed to a more complex function of x).


To reiterate, the LDA classifier results from assuming that the observations within each class come from a normal distribution with a class-specific mean vector and a common variance σ², and plugging estimates for these parameters into the Bayes classifier.
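The plug-in recipe above can be sketched directly. This is my own minimal implementation for one predictor, on made-up data: estimate π̂k, μ̂k, and the pooled σ̂², then pick the class with the largest discriminant δ̂k(x):

```python
import numpy as np

def lda_1d_fit(x, y):
    """Estimate pi_k, mu_k and the pooled variance sigma^2 (one predictor)."""
    classes = np.unique(y)
    n = len(x)
    pi = {k: np.mean(y == k) for k in classes}    # prior: pi_k = n_k / n
    mu = {k: x[y == k].mean() for k in classes}   # class mean mu_k
    # pooled variance: within-class squared deviations over (n - K)
    sigma2 = sum(((x[y == k] - mu[k]) ** 2).sum()
                 for k in classes) / (n - len(classes))
    return classes, pi, mu, sigma2

def lda_1d_predict(x0, classes, pi, mu, sigma2):
    """Assign x0 to the class with the largest discriminant delta_k(x0)."""
    deltas = [x0 * mu[k] / sigma2 - mu[k] ** 2 / (2 * sigma2) + np.log(pi[k])
              for k in classes]
    return classes[int(np.argmax(deltas))]

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-1, 1, 100), rng.normal(1, 1, 100)])
y = np.array([0] * 100 + [1] * 100)

params = lda_1d_fit(x, y)
print(lda_1d_predict(-2.0, *params), lda_1d_predict(2.0, *params))  # 0 1
```

With equal priors and class means near −1 and 1, the decision boundary sits near 0, so −2 goes to class 0 and 2 to class 1.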

Linear Discriminant Analysis for p > 1

We now extend the LDA classifier to the case of multiple predictors. To do this, we will assume that X = (X1, X2, . . . , Xp) is drawn from a multivariate Gaussian (or multivariate normal) distribution, with a class-specific mean vector and a common covariance matrix. We begin with a brief review of such a distribution.
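As a small sketch on my own synthetic data (two classes, different mean vectors, shared covariance), scikit-learn's LinearDiscriminantAnalysis fits exactly this model:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
cov = [[1.0, 0.3], [0.3, 1.0]]  # common covariance matrix, as LDA assumes

# Class-specific mean vectors: (0, 0) and (2, 2)
X0 = rng.multivariate_normal([0, 0], cov, 200)
X1 = rng.multivariate_normal([2, 2], cov, 200)
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[0, 0], [2, 2]]))  # points at the class means: [0 1]
```

Because the covariance matrix is shared across classes, the resulting decision boundary is a straight line in the two-predictor plane.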


Quadratic Discriminant Analysis

Like LDA, the QDA classifier results from assuming that the observations from each class are drawn from a Gaussian distribution, and plugging estimates for the parameters into Bayes’ theorem in order to perform prediction.

However, unlike LDA, QDA assumes that each class has its own covariance matrix. That is, it assumes that an observation from the kth class is of the form X ~ N(μk, Σk), where Σk is a covariance matrix for the kth class. Under this assumption, the Bayes classifier assigns an observation X = x to the class for which

δk(x) = −(1/2)(x − μk)ᵀ Σk⁻¹ (x − μk) − (1/2) log|Σk| + log πk

is largest.
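A minimal QDA sketch, again on made-up data where the two classes genuinely have different covariance matrices (so QDA's extra flexibility over LDA is warranted):

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.default_rng(4)

# Class 0: identity covariance; class 1: larger, correlated covariance
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.0], [0.0, 1.0]], 300)
X1 = rng.multivariate_normal([2, 2], [[2.0, 0.8], [0.8, 2.0]], 300)
X = np.vstack([X0, X1])
y = np.array([0] * 300 + [1] * 300)

# QDA estimates a separate covariance matrix Sigma_k per class
qda = QuadraticDiscriminantAnalysis().fit(X, y)
print(qda.predict([[0, 0], [2, 2]]))  # points at the class means: [0 1]
print(qda.score(X, y))                # training accuracy
```

Because each class gets its own Σk, the decision boundary is quadratic in x rather than linear.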


A Comparison of Classification Methods

Though their motivations differ, the logistic regression and LDA methods are closely connected. Consider the two-class setting with p = 1 predictor, and let p1(x) and p2(x) = 1 − p1(x) be the probabilities that the observation X = x belongs to class 1 and class 2, respectively. In the LDA framework, we can see from (4.12) to (4.13) (and a bit of simple algebra) that the log odds is given by:

log( p1(x) / (1 − p1(x)) ) = c0 + c1x

where c0 and c1 are functions of μ1, μ2, and σ². From (4.4), we know that in logistic regression:

log( p1(x) / (1 − p1(x)) ) = β0 + β1x


Both logistic regression and LDA produce linear decision boundaries. The only difference between the two approaches lies in the fact that β0 and β1 are estimated using maximum likelihood, whereas c0 and c1 are computed using the estimated mean and variance from a normal distribution.

This same connection between LDA and logistic regression also holds for multidimensional data with p >1.

Recall that KNN classifies an observation by a majority vote among its K nearest neighbors, making no assumptions at all about the shape of the decision boundary. Hence KNN is a completely non-parametric approach.


Finally, QDA serves as a compromise between the non-parametric KNN method and the linear LDA and logistic regression approaches.
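A quick empirical comparison of the four methods, on my own synthetic two-class data (class 1 gets a different covariance, so the true boundary is mildly non-linear):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.0], [0.0, 1.0]], 300)
X1 = rng.multivariate_normal([1.5, 1.5], [[1.5, 0.5], [0.5, 1.5]], 300)
X = np.vstack([X0, X1])
y = np.array([0] * 300 + [1] * 300)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Fit each classifier on the same split and report test accuracy
scores = {}
for name, clf in [("logistic", LogisticRegression()),
                  ("LDA", LinearDiscriminantAnalysis()),
                  ("QDA", QuadraticDiscriminantAnalysis()),
                  ("KNN (K=5)", KNeighborsClassifier(n_neighbors=5))]:
    scores[name] = clf.fit(Xtr, ytr).score(Xte, yte)
    print(name, round(scores[name], 3))
```

On a single synthetic draw like this the ranking can go any way; the point of the chapter is that which method wins depends on how linear the true boundary is and how much data you have.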
