Last month, Andrew Ng came to give two lectures, one for the public and one for a specialized audience. He talked about the directions of recent research on dealing with data. The hero, no doubt, was deep learning, the new AI method. When he did research at Google, and now at Baidu on BaiduEye, deep learning was the main tool, though they use much more complex structures as well as hundreds of computers.
If you have heard of neural networks, then deep learning can be seen as a combination of neural networks. Most machine learning tools can be approximated by a neural network with one or two hidden layers, so you can imagine how powerful a deep learning structure, built as a combination of such networks, can be.
When using a neural network to train on data, we usually first decide how many layers and how many neurons to use, and then train this model's parameters on the given data. The trained parameters thus depend on the data. This approach has proved competent on classification problems.
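The two steps above (fix the architecture first, then fit the parameters to the data) can be sketched with a tiny one-hidden-layer network trained on the XOR problem. Everything here is illustrative: the layer sizes, learning rate, and toy data are my own choices, not anything from the lecture.

```python
import numpy as np

# Toy data: the XOR problem, a classic task that needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)

# Step 1: decide the architecture first -- here, one hidden layer of 4 neurons.
W1 = rng.normal(size=(2, 4))
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1 + b1)      # hidden-layer activations
    out = sigmoid(h @ W2 + b2)    # predicted probability
    return h, out

# Step 2: train the parameters on the given data (plain gradient descent
# with backpropagation for a mean-squared-error loss).
lr = 0.5
_, out = forward(X)
loss_before = np.mean((out - y) ** 2)
for _ in range(5000):
    h, out = forward(X)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X)
    b1 -= lr * d_h.mean(axis=0)

_, out = forward(X)
loss_after = np.mean((out - y) ** 2)
```

After training, the parameters `W1, b1, W2, b2` encode what the network learned from this particular data set, which is exactly the sense in which the trained parameters are "related to the data."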
Deep learning, however, is a more complex structure. It first divides the whole procedure into several main steps, such as preprocessing and feature transformation. It then treats each main step as a cycle, building a network to complete that step on its own. The last step is to connect those cycles one by one.
This modular structure is very useful when we want to accomplish similar targets, for example identifying a person's race and identifying a person's age. The first step for both might be identifying the human in the image. If we treat identifying the human as its own cycle, we can then share that cycle with other research.
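The idea of connected cycles with one shared first step can be sketched as plain function composition: a shared `detect_human` stage feeds two different task heads. All the stage names and return values below are hypothetical stubs standing in for trained networks, not real models.

```python
# A minimal sketch of sharing one "cycle" between two tasks.
# Every stage here is a hypothetical stub for a trained network.

def detect_human(image):
    """Shared first cycle: locate and normalize the human region (stubbed)."""
    return {"region": image["pixels"], "found": True}

def estimate_race(human):
    """Task head 1, consuming the shared cycle's output (stubbed)."""
    return "unknown" if not human["found"] else "label_A"

def estimate_age(human):
    """Task head 2, reusing the very same shared cycle (stubbed)."""
    return -1 if not human["found"] else 30

image = {"pixels": [0.1, 0.7, 0.3]}
human = detect_human(image)   # run the shared cycle once...
race = estimate_race(human)   # ...and let both task heads reuse its result
age = estimate_age(human)
```

The design point is that the expensive shared stage runs once and is maintained in one place; each new target only needs its own head bolted onto the existing cycle.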
Also, the hidden layers need not stay hidden: they can be treated as features for retrieval. For example, when we want to identify a specific person in moving video, we can recover canonical-view face images from the connected frames and then use that result for identification. The recovery step is a hidden layer in this research, but it can play an independent, useful role in other topics.
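In code, "un-hiding" a layer simply means returning its activations alongside the final output, so another system can treat them as a feature vector. The weights and input below are random placeholders just to show the shape of the idea.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(8, 4))   # input -> hidden (placeholder weights)
W2 = rng.normal(size=(4, 2))   # hidden -> output (placeholder weights)

def forward(x):
    """Return the final output AND the hidden activations."""
    hidden = np.tanh(x @ W1)   # normally an internal detail of the network
    output = hidden @ W2
    return output, hidden

x = rng.normal(size=8)         # stand-in input (e.g. pixels of a face crop)
output, features = forward(x)

# `features` can now feed an independent retrieval system, e.g. a
# nearest-neighbour search over stored feature vectors, regardless of
# what `output` was originally trained to predict.
```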
We can also use partial results to train on other data. For example, some researchers study medical ultrasound images of pregnancy, but the number of pregnant women is small relative to the size of the model. How can they do such research? One researcher used the ImageNet database to train his model first and then trained it further on the real data. Guess what? He got a good result. So if you face a similar problem of having too few samples, maybe you can try this method.
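The pretrain-then-fine-tune recipe can be sketched in a few lines: fit parameters on a large stand-in dataset, then use them as the starting point for the small one. The synthetic data below plays the role of ImageNet and the ultrasound set, and the linear model is a deliberately simplified stand-in for a deep network.

```python
import numpy as np

rng = np.random.default_rng(2)

def train(X, y, w_init, lr=0.1, steps=200):
    """Plain gradient descent on squared error for a linear model."""
    w = w_init.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(X)
        w -= lr * grad
    return w

true_w = np.array([1.0, -2.0, 0.5])   # hidden "true" relationship

# "ImageNet" stand-in: plenty of samples from a related task.
X_big = rng.normal(size=(1000, 3))
y_big = X_big @ true_w + rng.normal(scale=0.1, size=1000)

# "Ultrasound" stand-in: only a handful of samples.
X_small = rng.normal(size=(5, 3))
y_small = X_small @ true_w + rng.normal(scale=0.1, size=5)

# Pretrain on the big dataset, then fine-tune briefly on the small one.
w_pretrained = train(X_big, y_big, np.zeros(3))
w_finetuned = train(X_small, y_small, w_pretrained, steps=20)

# Baseline: the same short training, but starting from scratch.
w_scratch = train(X_small, y_small, np.zeros(3), steps=20)

err_scratch = np.linalg.norm(w_scratch - true_w)
err_transfer = np.linalg.norm(w_finetuned - true_w)
```

With this setup the fine-tuned model starts close to a good solution, so a few steps on the tiny dataset suffice, while training from scratch on five samples has much further to go.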