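If the output layer has four neurons, each output neuron has to learn an assertion such as "the difference between a handwritten 1 and a handwritten 2";
if the output layer has ten neurons, each output neuron only has to learn an assertion such as "decide whether this image is a 1".
Describing whether an image is a particular digit is much easier than describing the difference between two digits.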
(The question comes from Neural networks and deep learning.)
You might wonder why we use 10 output neurons. After all, the goal of the network is to tell us which digit (0, 1, 2, …, 9) corresponds to the input image. A seemingly natural way of doing that is to use just 4 output neurons, treating each neuron as taking on a binary value, depending on whether the neuron's output is closer to 0 or to 1. Four neurons are enough to encode the answer, since 2^4 = 16 is more than the 10 possible values for the input digit. Why should our network use 10 neurons instead? Isn't that inefficient? The ultimate justification is empirical: we can try out both network designs, and it turns out that, for this particular problem, the network with 10 output neurons learns to recognize digits better than the network with 4 output neurons. But that leaves us wondering why using 10 output neurons works better. Is there some heuristic that would tell us in advance that we should use the 10-output encoding instead of the 4-output encoding?
……
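To make the two output encodings concrete, here is a minimal Python sketch of the target vectors only, not of a trained network. The function names `one_hot`, `binary_code` and `digits_that_activate` are hypothetical helpers introduced for illustration, and the choice of most-significant-bit-first ordering is an assumption; neither comes from the book. The sketch lists, for each output neuron, the set of digits on which that neuron is expected to fire.

```python
from typing import Callable, List


def one_hot(digit: int, n_classes: int = 10) -> List[int]:
    """Target vector for the 10-output design: only neuron `digit` is 1."""
    return [1 if i == digit else 0 for i in range(n_classes)]


def binary_code(digit: int, n_bits: int = 4) -> List[int]:
    """Target vector for the 4-output design (assumed MSB-first ordering)."""
    return [(digit >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def digits_that_activate(encoder: Callable[[int], List[int]], neuron: int) -> List[int]:
    """Digits 0..9 whose target vector has a 1 at position `neuron`."""
    return [d for d in range(10) if encoder(d)[neuron] == 1]


if __name__ == "__main__":
    # Each of the 4 binary neurons must fire for a mixed group of digits.
    for k in range(4):
        print("binary neuron", k, "fires for", digits_that_activate(binary_code, k))
    # Each of the 10 one-hot neurons must fire for exactly one digit.
    for k in range(10):
        print("one-hot neuron", k, "fires for", digits_that_activate(one_hot, k))
```

Under this assumed ordering, the last binary neuron must fire for 1, 3, 5, 7 and 9, a group of digits with little visual similarity, whereas each one-hot neuron answers a single "is this digit k?" question. That is one way to read the answer given above.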