Supervised learning
Goal: To learn a classification model from the data that can be used to predict the classes of new cases.
A Decision Tree: Concepts
A decision tree will include decision nodes and leaf nodes.
All current decision tree algorithms are heuristic algorithms
Each path from the root to a leaf is a rule
A greedy divide-and-conquer algorithm
The tree is constructed in a top-down, recursive manner
Key: Which attribute to choose in order to branch
Objective: Reduce impurity or uncertainty in data
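The two node types and the "each path is a rule" reading described above can be sketched in a few lines of Python. This is only an illustration: the class names Leaf and Decision, the helper extract_rules, and the toy weather tree are all hypothetical and invented for this sketch, not part of the original notes.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class predicted at this leaf node


@dataclass
class Decision:
    attribute: str  # attribute tested at this decision node
    # One child per attribute value; a child is another decision node or a leaf.
    branches: Dict[str, Union["Decision", Leaf]] = field(default_factory=dict)


def extract_rules(node, conditions=()):
    """Return one (conditions, label) rule per root-to-leaf path."""
    if isinstance(node, Leaf):
        return [(list(conditions), node.label)]
    rules = []
    for value, child in node.branches.items():
        test = (node.attribute, value)
        rules.extend(extract_rules(child, conditions + (test,)))
    return rules


# Hypothetical toy tree, made up for illustration only.
tree = Decision("outlook", {
    "sunny": Decision("wind", {"strong": Leaf("stay in"), "calm": Leaf("play")}),
    "rainy": Leaf("stay in"),
})

for conds, label in extract_rules(tree):
    print(" AND ".join(f"{a} = {v}" for a, v in conds), "->", label)
```

Each printed line is one rule: the conjunction of the tests along a path, followed by the class at the leaf.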
Steps and formulas for drawing a decision tree by hand
The Entropy Formula:
The Entropy of Attribute Ai:
The Information gained by selecting Ai to branch or to partition data:
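For reference, the three quantities named above are usually written as follows. The notation is an assumption of this sketch: D is the current data set, C the set of classes, Pr(c_j) the fraction of cases in D that belong to class c_j, and D_1, ..., D_v the subsets of D produced by the v values of attribute A_i.

```latex
% Entropy of the data set D
\[
\mathrm{entropy}(D) = -\sum_{j=1}^{|C|} \Pr(c_j)\,\log_2 \Pr(c_j)
\]
% Entropy after partitioning D on attribute A_i (weighted by partition size)
\[
\mathrm{entropy}_{A_i}(D) = \sum_{k=1}^{v} \frac{|D_k|}{|D|}\,\mathrm{entropy}(D_k)
\]
% Information gained by branching on A_i
\[
\mathrm{gain}(D, A_i) = \mathrm{entropy}(D) - \mathrm{entropy}_{A_i}(D)
\]
```

A pure partition (all cases in one class) has entropy 0, so an attribute that splits the data into purer subsets yields a larger gain.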
Finally, we choose the attribute with the largest gain to split the current tree
After finding the attribute with the largest information gain, use it as the root. Repeat the process above on the remaining data.
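The hand procedure above (compute the entropy, compute each attribute's entropy, pick the largest gain, recurse on what is left) can be sketched in Python. This is a minimal ID3-style illustration under stated assumptions, not a reference implementation: the function names, the dict-based tree representation, the majority-vote fallback, and the six-row toy data set are all hypothetical choices made for this sketch.

```python
import math
from collections import Counter


def entropy(rows, target):
    """Entropy (in bits) of the class labels found in rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def attribute_entropy(rows, attribute, target):
    """Size-weighted entropy of the partitions created by splitting on attribute."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row)
    return sum(len(p) / len(rows) * entropy(p, target) for p in partitions.values())


def gain(rows, attribute, target):
    """Information gained by partitioning rows on attribute."""
    return entropy(rows, target) - attribute_entropy(rows, attribute, target)


def build_tree(rows, attributes, target):
    """Greedy, top-down, recursive construction.

    Returns a class label (leaf) or a nested dict {attribute: {value: subtree}}.
    """
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1 or not attributes:
        # Pure node, or no attribute left to test: predict the majority class.
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = build_tree(subset, remaining, target)
    return tree


# Hypothetical toy data, made up for illustration only.
data = [
    {"outlook": "sunny", "wind": "calm", "play": "yes"},
    {"outlook": "sunny", "wind": "strong", "play": "yes"},
    {"outlook": "rainy", "wind": "calm", "play": "yes"},
    {"outlook": "rainy", "wind": "strong", "play": "no"},
    {"outlook": "rainy", "wind": "strong", "play": "no"},
    {"outlook": "sunny", "wind": "strong", "play": "yes"},
]

for attr in ("outlook", "wind"):
    print(attr, round(gain(data, attr, "play"), 3))
print(build_tree(data, ["outlook", "wind"], "play"))
```

On this toy data, outlook has the larger gain, so it becomes the root; the sunny branch is already pure, and the rainy branch is split again on wind. Because build_tree finishes each branch before moving to the next, the recursion proceeds depth-first.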
Quiz related:
1. The resulting decision tree will use a subset of the attributes in S
2. It's a recursive algorithm
3. It works in a depth-first fashion
4. Its complexity is O(n log n)