ID3 and Decision tree
Thursday, February 18, 2010
, Posted by Thiên Thần CNTT at 23:03
ID3 algorithm
- An algorithm for constructing a decision tree from a set of training samples
- Uses entropy to compute the information gain of each candidate attribute
- The attribute with the highest information gain is then selected
The complete formula for entropy is:
E(S) = -(p+) * log2(p+) - (p-) * log2(p-)
where p+ is the proportion of positive samples in S,
p- is the proportion of negative samples in S,
and S is the set of training samples.
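As a quick illustration, E(S) can be computed directly from the positive/negative counts. This is a minimal sketch; the function name `entropy` and its count-based signature are my own choices, not from the post:

```python
import math

def entropy(pos, neg):
    """Binary entropy E(S) for a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(round(entropy(29, 35), 4))  # sample with 29 positives, 35 negatives -> 0.9937
```

A perfectly mixed sample (e.g. `entropy(1, 1)`) gives the maximum value 1.0, while a pure sample gives 0.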
Information Gain
Gain(Sample, Attribute) or Gain(S, A) is the expected reduction in entropy due to sorting S on attribute A:
Gain(S, A) = Entropy(S) - Σ(v ∈ values(A)) (|Sv|/|S|) * Entropy(Sv)
where values(A) is the set of possible values of attribute A, and Sv is the subset of S for which A has value v.
So, for the previous example, where S contains 29 positive and 35 negative samples and attribute A1 splits S into TRUE = [21+, 5-] and FALSE = [8+, 30-], the information gain is calculated:
Gain(S, A1) = E(S) - (21+5)/(29+35) * E(TRUE)
- (8+30)/(29+35) * E(FALSE)
= E(S) - 26/64 * E(TRUE) - 38/64 * E(FALSE)
= 0.9937 - 26/64 * 0.7063 - 38/64 * 0.7425
≈ 0.2659
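The calculation above can be checked in a few lines of Python. The helper names `entropy` and `gain`, and the count-pair representation of each subset, are my own choices for this sketch:

```python
import math

def entropy(pos, neg):
    """Binary entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    return sum(-c / total * math.log2(c / total) for c in (pos, neg) if c)

def gain(parent, subsets):
    """Gain(S, A): entropy of S minus the weighted entropy of each subset Sv."""
    total = sum(parent)
    return entropy(*parent) - sum(
        (p + n) / total * entropy(p, n) for p, n in subsets
    )

# S = [29+, 35-]; A1 = TRUE gives [21+, 5-], A1 = FALSE gives [8+, 30-]
print(round(entropy(29, 35), 4))                      # E(S)  -> 0.9937
print(round(gain((29, 35), [(21, 5), (8, 30)]), 4))   # Gain  -> 0.2659
```

ID3 would repeat this computation for every candidate attribute at a node and split on the one with the largest gain.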
ID3 and Decision Tree by Tuan Nguyen (5/06)