Decision Tree for Classification

an illustration of a decision tree

After spending a few good posts on Support Vector Machines, let us move on to decision trees. This is another model applicable to both classification and regression. Unlike the models we have learned so far, decision trees do not have an equation representing their decision-making process. Instead, they rely on a flow structure to derive outcomes from inputs. Trees are very interesting and are the basis of a number of ensemble models that we will discuss later on. As with SVM, we start with the easier case: decision trees for classification. With that being said, let us dive in!

An example of a decision tree

We will start with a toy example to learn about decision trees, using the small data set below. There are six rows representing six families, and three columns: Family Income, Family Size, and Financial Standing. The task is to classify whether each family's financial situation is Good or Bad.

the sample data set

A tree that classifies the given data could look as follows. In a logistic model or a support vector machine, you plug the values of each feature into an equation to calculate the labels' scores. In contrast, instances fed into a decision tree follow a flow of nodes until they reach one that provides their predicted labels.

a decision tree for the sample data set

In the tree above, a family to be predicted enters the tree through the Data in entry. The first node it encounters checks whether the family's Income is below $70,000 and redirects the instance in the appropriate direction. If the family indeed makes less than $70,000 annually, it follows the left (True) branch; otherwise, it takes the right (False) one. The node on the left branch then checks whether Size < 2, and the one on the right checks whether Size < 4. Depending on the outcome of either check, the family continues to the next and final node, which predicts whether its financial situation is Good or Bad. Below are two examples of making predictions for two instances with the tree model.

an illustration of making prediction for an instance
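
To make the flow concrete, here is a minimal Python sketch of the same traversal. The split thresholds come straight from the tree above; the Good/Bad labels at the leaves are my assumptions, since they only appear in the figure.

```python
def predict_financial_standing(income, size):
    """Follow the decision flow of the toy tree for one family.

    The thresholds ($70,000, size 2, size 4) come from the tree above;
    the labels returned at the leaves are assumptions, since the
    figure is not reproduced here.
    """
    if income < 70000:        # root node: Income < $70,000?
        if size < 2:          # left internal node: Size < 2?
            return "Good"     # assumed leaf label
        return "Bad"          # assumed leaf label
    if size < 4:              # right internal node: Size < 4?
        return "Good"         # assumed leaf label
    return "Bad"              # assumed leaf label

# A family earning $50,000 with 3 members takes the True branch at
# the root, then the False branch at Size < 2, and lands in a leaf.
print(predict_financial_standing(50000, 3))
```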

More about decision trees

Node types

There are different types of nodes in a decision tree:

  • Leaf nodes assign labels to the data instances that reach them. These are terminal nodes in that no flows come out of them. In my visualizations, leaf nodes are the ones with rounded corners.
  • Internal nodes perform checks on specific features of the instances and redirect their flows. For example, when an instance reaches the node Size < 2, if its Size is indeed below 2, the instance follows the True flow; otherwise, it follows the False flow. Internal nodes are rectangles in my visualizations.
  • Root nodes are the first nodes of the trees in terms of data flow. All instances in the data must go through the tree root before reaching any other node. Each tree has a single root node.
an illustration of different node types
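
If you want to see these node types in an actual fitted tree, here is a small sketch using scikit-learn's internal tree representation (run on the built-in iris data rather than our toy table): leaves are the nodes with no children, and node 0 is always the root.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

t = clf.tree_
for node in range(t.node_count):
    if t.children_left[node] == -1:   # leaves have no children
        print(f"node {node}: leaf")
    else:                             # internal node; node 0 is the root
        print(f"node {node}: split on feature {t.feature[node]} "
              f"at threshold {t.threshold[node]:.2f}")
```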

Nodes and splits

We have been discussing decision trees as if they assign labels to individual instances one at a time. In practice, trees work on the given data set as a whole. We also refer to the internal nodes as splits, since each of them divides the incoming data into smaller portions. Below is an example of the data arriving at each node in the tree. The root node takes the complete data set and divides it into a portion with Income < 70000 and one with Income ≥ 70000. The portion with income below 70k then goes through the next split on Size < 2 and ends up at the leaf nodes of that branch. Similarly, the data with income of 70k or above reaches the split on Size < 4 and then the leaf nodes for its predictions.

splits in a tree
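
In code, a split is just a partition of the rows that reach it. Below is a quick sketch with pandas; the six Income/Size values are placeholders I made up for illustration, since the real numbers only appear in the figure.

```python
import pandas as pd

# Placeholder values for the six families; the actual data set lives
# in the figure, so these numbers are for illustration only.
data = pd.DataFrame({
    "Income": [40000, 65000, 55000, 80000, 90000, 120000],
    "Size":   [1, 3, 2, 2, 5, 4],
})

# The root split divides the whole data set into two portions.
left = data[data["Income"] < 70000]     # goes on to the split on Size < 2
right = data[data["Income"] >= 70000]   # goes on to the split on Size < 4

# Each portion is split again, and the resulting pieces reach the leaves.
print(left[left["Size"] < 2])    # one leaf of the left branch
print(left[left["Size"] >= 2])   # the other leaf of the left branch
```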

Trees and uniqueness

Decision tree solutions are not unique. This means that, for the same data, we can have multiple tree structures that result in similar or even identical performance. Below is a different tree model for our toy data that also achieves perfect classification.

a different tree for the toy example

For this reason, tree models are more random than the other models we have discussed: you may get different trees, and different performances, from different runs on the same data. In upcoming posts, we will discuss ways to address this issue.
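
To see this in practice, here is a small sketch with scikit-learn: fitting the same data with different random seeds changes how ties between equally good splits are broken, so the resulting trees may differ in structure while performing comparably.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Same data, different seeds: the seed affects tie-breaking between
# equally good splits, so the fitted trees can differ in structure.
for seed in (0, 1, 2):
    clf = DecisionTreeClassifier(random_state=seed).fit(X, y)
    print(f"seed {seed}: {clf.tree_.node_count} nodes, "
          f"depth {clf.get_depth()}, "
          f"training accuracy {clf.score(X, y):.3f}")
```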

Conclusion

I hope this post has helped you understand more about decision trees. Again, they are very interesting models, and they are also the basis of many other powerful models that we will talk about later on. In the next post, we will get hands-on with trees in Python and SKLearn.
