example
What I mean by measured accuracy is the percentage of correctly predicted outcomes at a node in my decision tree output.
Here is an example of a node I have:
Over the past 4 years (1860 games), a situation has occurred 20 times, with 18 of them resulting in a positive outcome (i.e., a correctly predicted victory). This gives a predicted outcome rate of 18/20 = 90%.
From my understanding (please correct me if I am wrong), if N > 30 in a classification tree node where the class variable has two values (i.e., win/loss), then the probability of the class outcome can safely be assumed to follow a normal distribution (provided the known distribution of the variables is also normal). But if N < 30, normality cannot be assumed and computing confidence intervals is more complicated.
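To make the concern concrete, here is a rough sketch of the comparison I have in mind for the 18-out-of-20 node. It assumes a 95% level (z = 1.96) and treats the 20 occurrences as independent win/loss trials, which may not hold for my data. The function names are my own, and the Wilson score interval is just one commonly used small-sample alternative that I picked for illustration, not something produced by the tree.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a binomial proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval, often preferred when n is small or p is near 0 or 1."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# The node from my example: 18 correct predictions out of 20 occurrences.
wald = wald_interval(18, 20)
wilson = wilson_interval(18, 20)
print("Wald:   %.3f to %.3f" % wald)
print("Wilson: %.3f to %.3f" % wilson)
```

With these numbers the upper Wald limit goes above 1, which cannot be a valid probability, while the Wilson limits stay inside [0, 1] (roughly 0.70 to 0.97). That is the kind of small-N behaviour I am worried about.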
By the way, the tree is built with 15 continuous variables, which are based on team rating, offensive/defensive scoring, and SOS. I have evaluated the variables extensively, and they are all normally distributed.