How to Prevent Overfitting in Machine Learning Models
Very deep neural networks with a huge number of parameters are very powerful machine learning systems. But in such massive networks, overfitting is a common and serious problem. Learning how to deal with overfitting is essential to mastering machine learning. The fundamental issue in machine learning is the tension between optimization and generalization. Optimization refers to the process of adjusting a model to get the best performance possible on the training data (the *learning* in machine learning), whereas generalization refers to how well the trained model performs on data it has **never seen before** (the test set). The goal of the game is to get good generalization, but you don't control generalization directly; you can only adjust the model based on its training data.
How do you know whether a model is overfitting?
The clearest sign of overfitting is when the model's accuracy is high on the training set but drops significantly on new data or on the test set. This means the model knows the training data very well but cannot generalize, which makes it useless in production or in an A/B test in most domains.
How can you prevent overfitting?
Okay, now let’s say you have found that your model overfits. What can you do to prevent it?
Fortunately, there are many techniques you can try. Below I describe a few of the most widely used solutions for overfitting.
1. Reduce the network size
The simplest way to prevent overfitting is to reduce the size of the model: the number of learnable parameters in the model (which is determined by the number of layers and the number of units per layer).
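As a rough sketch (assuming a Keras/TensorFlow workflow and a binary-classification task, neither of which the article specifies), reducing capacity can be as simple as using fewer layers and fewer units per layer:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A larger model that may easily overfit a small dataset:
# three hidden layers with 512 units each.
bigger_model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(512, activation="relu"),
    layers.Dense(512, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# A smaller model with far fewer learnable parameters:
# two hidden layers with 16 units each.
smaller_model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

smaller_model.compile(optimizer="rmsprop",
                      loss="binary_crossentropy",
                      metrics=["accuracy"])
```

The right size is found empirically: start small, increase capacity, and watch when the validation metrics start to degrade.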
2. Cross-Validation
In cross-validation, the initial training data is split into several smaller train-test splits, which are then used to tune the model. The most popular form of cross-validation is K-fold cross-validation, where K is the number of folds. There is a short video from Udacity that explains K-fold cross-validation very well.
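A minimal sketch with scikit-learn (the estimator and the toy dataset below are placeholders; any classifier and any feature matrix `X` with labels `y` would work the same way):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)          # toy dataset as a stand-in
model = LogisticRegression(max_iter=1000)  # any estimator would do

# 5-fold cross-validation: train on 4 folds, validate on the remaining fold,
# and rotate so every fold serves as the validation set exactly once.
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kfold)

print("Per-fold accuracy:", scores)
print("Mean accuracy: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))
```

A large gap between the per-fold validation scores and the training score is another signal that the model is overfitting.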
3. Add weight regularization
Given two explanations for something, the explanation most likely to be correct is the simplest one — the one that makes fewer assumptions. This idea also applies to the models learned by neural networks: given some training data and a network architecture, multiple sets of weight values could explain the data. Simpler models are less likely to overfit than complex ones. A simple model in this context is a model where the distribution of parameter values has less entropy (or a model with fewer parameters). Thus a common way to mitigate overfitting is to put constraints on the complexity of a network by forcing its weights to take only small values, which makes the distribution of weight values more regular. This is called weight regularization, and it’s done by adding to the loss function of the network a cost associated with having large weights.
This cost comes in two flavors:
- **L1 regularization**: the cost added is proportional to the absolute value of the weight coefficients.
- **L2 regularization**: the cost added is proportional to the square of the value of the weight coefficients. L2 regularization is also called weight decay in the context of neural networks. [1]
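In Keras, for example, weight regularization is added per layer through the `kernel_regularizer` argument. The sketch below (layer sizes and the 0.001 factor are illustrative choices, not prescriptions from the article) adds an L2 penalty, so every weight coefficient contributes `0.001 * weight ** 2` to the total loss:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    # l2(0.001): each weight adds 0.001 * weight**2 to the loss,
    # pushing the network toward small weight values.
    layers.Dense(16, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(16, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(1, activation="sigmoid"),
])

# regularizers.l1(0.001) or regularizers.l1_l2(l1=0.001, l2=0.001)
# can be used instead for an L1 or combined penalty.
```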
4. Removing irrelevant features
Improve the data by removing irrelevant features. A dataset may contain many features that contribute little to the prediction. Removing those less important features can improve accuracy and reduce overfitting. You can use scikit-learn's feature selection module for this purpose.
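A hedged sketch using scikit-learn's feature selection module (the `SelectKBest` scorer and `k=10` are just one possible choice; the article does not prescribe a particular selector, and the synthetic dataset is only a stand-in):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic data as a stand-in: 25 features, only 5 of them informative.
X, y = make_classification(n_samples=500, n_features=25,
                           n_informative=5, random_state=0)

# Keep the 10 features with the highest ANOVA F-score against the target.
selector = SelectKBest(score_func=f_classif, k=10)
X_reduced = selector.fit_transform(X, y)

print("Original shape:", X.shape)         # (500, 25)
print("Reduced shape:", X_reduced.shape)  # (500, 10)
print("Selected feature indices:", selector.get_support(indices=True))
```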
5. Adding dropout
Dropout, applied to a layer, consists of randomly dropping out (setting to zero) a number of output features of the layer during training. Let's say a given layer would normally return a vector [0.2, 0.5, 1.3, 0.8, 1.1] for a given input sample during training. After applying dropout, this vector will have a few entries zeroed out at random: for example, [0, 0.5, 1.3, 0, 1.1].
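In Keras, dropout is applied by inserting a `Dropout` layer after the layer whose outputs should be dropped. The rate of 0.5 below is a common choice rather than something mandated by the article, and the surrounding layer sizes are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    # During training, randomly zero out 50% of this layer's output features;
    # at inference time dropout is disabled automatically.
    layers.Dropout(0.5),
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
```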
6. Data Augmentation
Another simple way to reduce overfitting is to increase the size of the training data; in theory, if the training set contained all possible data, the model could not overfit at all. Let's consider that we are dealing with images. In this case, there are a few ways of increasing the size of the training data: rotating the image, flipping, scaling, shifting, etc. This technique is known as data augmentation, and it usually provides a big leap in improving the accuracy of the model.
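A sketch with Keras' `ImageDataGenerator` (newer Keras versions also offer preprocessing layers such as `RandomFlip` and `RandomRotation`; the parameter values and the `x_train`/`y_train` names below are illustrative assumptions, not from the article):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generate randomly transformed variants of each training image on the fly.
datagen = ImageDataGenerator(
    rotation_range=20,       # rotate by up to 20 degrees
    width_shift_range=0.1,   # shift horizontally by up to 10% of the width
    height_shift_range=0.1,  # shift vertically by up to 10% of the height
    zoom_range=0.2,          # zoom in or out by up to 20% (scaling)
    horizontal_flip=True,    # flip images left-right
)

# Assuming x_train and y_train are NumPy arrays of images and labels
# (hypothetical names), the generator can feed model.fit directly:
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=30)
```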