Correctly understanding logits in TensorFlow

【Question】I was going through the TensorFlow API docs here. In the TensorFlow documentation, they use a keyword called logits. What is it? In a lot of methods in the API docs it is written like

tf.nn.softmax(logits, name=None)

If those logits are just Tensors, why keep a different name like logits?

Another thing is that there are two methods I could not differentiate:

tf.nn.softmax(logits, name=None)
tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

What are the differences between them? The docs are not clear to me. I know what tf.nn.softmax does, but not the other. An example would be really helpful.
Short version:

Suppose you have two tensors, where y_hat contains computed scores for each class (for example, from y = W*x + b) and y_true contains one-hot encoded true labels.

y_hat  = ... # Predicted label, e.g. y = tf.matmul(X, W) + b
y_true = ... # True label, one-hot encoded

If you interpret the scores in y_hat as unnormalized log probabilities, then they are logits.

Additionally, the total cross-entropy loss computed in this manner:

y_hat_softmax = tf.nn.softmax(y_hat)
total_loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), [1]))

is essentially equivalent to the total cross-entropy loss computed with the function softmax_cross_entropy_with_logits():

total_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_hat))

Long version:

In the output layer of your neural network, you will probably compute an array that contains the class scores for each of your training instances, such as from a computation y_hat = W*x + b. To serve as an example, below I've created a y_hat as a 2 x 3 array, where the rows correspond to the training instances and the columns correspond to classes. So here there are 2 training instances and 3 classes.

import tensorflow as tf
import numpy as np

sess = tf.Session()

# Create example y_hat.
y_hat = tf.convert_to_tensor(np.array([[0.5, 1.5, 0.1],[2.2, 1.3, 1.7]]))
sess.run(y_hat)
# array([[ 0.5,  1.5,  0.1],
#        [ 2.2,  1.3,  1.7]])
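
For context, scores like these would normally come out of a linear output layer rather than being hard-coded. Below is a minimal sketch of what that might look like; the placeholder X, the variables W and b, and their shapes are assumptions for illustration only:

num_features = 4   # assumed input dimensionality, for illustration
X = tf.placeholder(tf.float64, shape=[None, num_features])       # input batch
W = tf.Variable(tf.zeros([num_features, 3], dtype=tf.float64))   # weights
b = tf.Variable(tf.zeros([3], dtype=tf.float64))                 # biases
y_hat_from_layer = tf.matmul(X, W) + b   # unnormalized class scores, i.e. logits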

Note that the values are not normalized (i.e. the rows don't add up to 1). In order to normalize them, we can apply the softmax function, which interprets the input as unnormalized log probabilities (aka logits) and outputs normalized linear probabilities.

y_hat_softmax = tf.nn.softmax(y_hat)
sess.run(y_hat_softmax)
# array([[ 0.227863  ,  0.61939586,  0.15274114],
#        [ 0.49674623,  0.20196195,  0.30129182]])
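
As a quick cross-check (an addition here, not part of the original answer), the same numbers come straight from the definition softmax(x) = exp(x) / sum(exp(x)) applied row by row in NumPy:

# Softmax computed by hand, row-wise.
scores = np.array([[0.5, 1.5, 0.1], [2.2, 1.3, 1.7]])
np.exp(scores) / np.sum(np.exp(scores), axis=1, keepdims=True)
# array([[ 0.227863  ,  0.61939586,  0.15274114],
#        [ 0.49674623,  0.20196195,  0.30129182]])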

It's important to fully understand what the softmax output is saying. Below I've shown a table that more clearly represents the output above. It can be seen that, for example, the probability of training instance 1 being "Class 2" is 0.619. The class probabilities for each training instance are normalized, so the sum of each row is 1.0.

                      Pr(Class 1)  Pr(Class 2)  Pr(Class 3)
                    ,--------------------------------------
Training instance 1 | 0.227863   | 0.61939586 | 0.15274114
Training instance 2 | 0.49674623 | 0.20196195 | 0.30129182
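
Since the session is still open, that normalization is easy to confirm directly (a small check added for illustration; the exact printout may differ slightly due to floating-point rounding):

# Each row of the softmax output should sum to 1.
sess.run(tf.reduce_sum(y_hat_softmax, 1))
# array([ 1.,  1.])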

So now we have class probabilities for each training instance, and we can take the argmax() of each row to generate a final classification. From the above, we would predict that training instance 1 belongs to "Class 2" and training instance 2 belongs to "Class 1".
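
In code, that classification step is just tf.argmax over each row (an added illustration; remember the classes are 0-indexed, so index 1 means "Class 2"):

sess.run(tf.argmax(y_hat_softmax, 1))
# array([1, 0])   # "Class 2" for instance 1, "Class 1" for instance 2

Note that taking the argmax of the raw scores in y_hat would give the same predictions, since softmax does not change the ordering within a row.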

Are these classifications correct? We need to measure against the true labels from the training set. You will need a one-hot encoded y_true array, where again the rows are training instances and columns are classes. Below I've created an example y_true one-hot array where the true label for training instance 1 is "Class 2" and the true label for training instance 2 is "Class 3".

y_true = tf.convert_to_tensor(np.array([[0.0, 1.0, 0.0],[0.0, 0.0, 1.0]]))
sess.run(y_true)
# array([[ 0.,  1.,  0.],
#        [ 0.,  0.,  1.]])
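
As an aside (not part of the original answer), if your labels are stored as integer class indices such as [1, 2], the same one-hot array can be built with tf.one_hot:

labels = tf.constant([1, 2])   # true classes as 0-indexed integers
sess.run(tf.one_hot(labels, depth=3, dtype=tf.float64))
# array([[ 0.,  1.,  0.],
#        [ 0.,  0.,  1.]])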

Is the probability distribution in y_hat_softmax close to the probability distribution in y_true? We can use cross-entropy loss to measure the error.

We can compute the cross-entropy loss on a row-wise basis and see the results. Below we can see that training instance 1 has a loss of 0.479, while training instance 2 has a higher loss of 1.200. This result makes sense because in our example above, y_hat_softmax showed that training instance 1's highest probability was for "Class 2", which matches training instance 1 in y_true; however, the prediction for training instance 2 showed a highest probability for "Class 1", which does not match the true class "Class 3".

loss_per_instance_1 = -tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1])
sess.run(loss_per_instance_1)
# array([ 0.4790107 ,  1.19967598])
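
The same two numbers can be reproduced by hand in NumPy from the definition of cross entropy, -sum(y_true * log(p)) per row (a cross-check added for illustration):

# Per-instance cross entropy computed by hand.
p = sess.run(y_hat_softmax)
t = sess.run(y_true)
-np.sum(t * np.log(p), axis=1)
# array([ 0.4790107 ,  1.19967598])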

What we really want is the total loss over all the training instances. So we can compute:

total_loss_1 = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1]))
sess.run(total_loss_1)
# 0.83934333897877944

Using softmax_cross_entropy_with_logits()

We can instead compute the total cross entropy loss using the tf.nn.softmax_cross_entropy_with_logits() function, as shown below.

# Note: in TF 1.x this function must be called with named arguments.
loss_per_instance_2 = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_hat)
sess.run(loss_per_instance_2)
# array([ 0.4790107 ,  1.19967598])

total_loss_2 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_hat))
sess.run(total_loss_2)
# 0.83934333897877922

Note that total_loss_1 and total_loss_2 produce essentially equivalent results with some small differences in the very final digits. However, you might as well use the second approach: it takes one less line of code and accumulates less numerical error because the softmax is done for you inside of softmax_cross_entropy_with_logits().
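
A quick numerical check of that claim (added here for illustration):

np.isclose(sess.run(total_loss_1), sess.run(total_loss_2))
# True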

From Stack Overflow [https://stackoverflow.com/questions/34240703/what-is-logits-softmax-and-softmax-cross-entropy-with-logits?noredirect=1&lq=1]
