There are many methodologies for constructing decision trees, but the most well-known is the classification and regression tree (CART) algorithm proposed in Breiman (1984). A basic decision tree partitions the training data into homogeneous subgroups (i.e., groups with similar response values) and then fits a simple constant in each subgroup (e.g., the mean of the within-group response values for regression). The subgroups (also called nodes) are formed recursively using binary partitions created by asking simple yes-or-no questions about each feature (e.g., is age < 18?). This is done repeatedly until a suitable stopping criterion is satisfied (e.g., a maximum depth of the tree is reached). After all the partitioning has been done, the model predicts the output based on (1) the average response value of all observations that fall in that subgroup (regression problem), or (2) the class that has majority representation (classification problem). For classification, predicted probabilities can be obtained from the proportion of each class within the subgroup. What results is an inverted tree-like structure such as that in Figure 9.1. In essence, the tree is a set of rules that allows us to make predictions by asking simple yes-or-no questions about each feature.
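As a rough illustration of these ideas, the sketch below fits a shallow CART-style classification tree using scikit-learn (an assumption on our part; the text above does not name a library) and shows how the within-leaf class proportions become predicted probabilities. The dataset and the max_depth stopping criterion are illustrative only.

# A minimal sketch, assuming scikit-learn's CART implementation.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Recursive binary partitioning stops once the tree reaches depth 3
# (an example of a stopping criterion such as maximum tree depth).
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Each prediction is the majority class of the leaf (subgroup) an
# observation falls into; predict_proba returns the within-leaf class
# proportions, which serve as the predicted probabilities.
print(tree.predict(X[:5]))
print(tree.predict_proba(X[:5]))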