What is a decision tree?
Decision trees are amongst the most intuitive and explainable machine learning algorithms.
A tree is built from the existing data to model the various decision points that describe the data. In layman’s terms, it can be seen as a bunch of chained "if-then" statements.
How do decision trees work?
Decision tree algorithms teach the machine to make decisions based on a set of conditions.
Each tree starts with a root node (the topmost node) and consists of internal nodes and leaf nodes.
The root node asks the first question: a condition on one parameter that helps classify an object. Internal nodes hold further conditions on the parameters, each one narrowing down what the machine knows about said object.
Leaf nodes have the final outcomes or results (the decision that the machine makes). The branches represent the outcome of each condition.
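To make the root / internal / leaf vocabulary concrete, here's a minimal sketch of how such a tree could be stored and walked in Python. The class names (`Node`, `Leaf`), the `predict` helper, and the toy animal tree are all made up for illustration; real libraries store trees differently.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    outcome: str  # the final decision stored at a leaf node


@dataclass
class Node:
    feature: str      # the parameter this node asks about
    threshold: float  # the condition is: sample[feature] < threshold
    if_true: Union["Node", Leaf]   # branch taken when the condition holds
    if_false: Union["Node", Leaf]  # branch taken when it does not


def predict(node, sample):
    """Start at the root and follow branches until a leaf is reached."""
    while isinstance(node, Node):
        if sample[node.feature] < node.threshold:
            node = node.if_true
        else:
            node = node.if_false
    return node.outcome


# Hypothetical toy tree: the root asks about legs,
# one internal node asks about weight, and three leaves hold outcomes.
root = Node(
    "legs", 3,
    if_true=Leaf("bird"),
    if_false=Node("weight_kg", 10, if_true=Leaf("cat"), if_false=Leaf("horse")),
)

print(predict(root, {"legs": 4, "weight_kg": 4}))
```

Each pass through the loop answers one question and follows one branch, so a prediction costs at most as many checks as the tree is deep.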
Decision tree in action
Here's an example of a decision tree that's learning to classify a person as fit or unfit based on three parameters:
- Age 👶
- Whether the person eats pizza 🍕
- Whether the person exercises in the morning 🏃
This tree was trained on data from several people who provided values for the three parameters above and also reported whether or not they were fit.
Now, whenever you get the data from a new person, you can run it down the tree and give them your verdict!
So, if you are a 23-year-old human 🙋 who doesn't eat a lot of pizza 🍕, this decision tree will classify you as a fit person (yay!).
Unlike nature (where trees grow upwards from the root), the trees in computer science grow upside down.🌳
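Since a decision tree boils down to chained "if-then" statements, the fitness example can be sketched as a plain Python function. Only one path comes from the example above (a 23-year-old who doesn't eat a lot of pizza is fit). The age threshold of 30, the other branches, and the name `classify_fitness` are assumptions made up to complete the picture.

```python
def classify_fitness(age, eats_lots_of_pizza, exercises_in_morning):
    """Hand-written stand-in for the example tree (structure partly assumed)."""
    if age < 30:  # assumed threshold, not taken from the article
        if eats_lots_of_pizza:
            return "unfit"  # assumed branch
        return "fit"        # the path described in the article
    # Assumed branch for people aged 30 and over
    if exercises_in_morning:
        return "fit"
    return "unfit"


print(classify_fitness(age=23, eats_lots_of_pizza=False, exercises_in_morning=False))
# prints "fit"
```

A trained tree learns questions and thresholds like these from the data instead of having them typed in by hand.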
How can you use decision trees?
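Linear regression models struggle when the relationship between the inputs and the target is non-linear. Enter the decision tree! 🌳 Because a tree splits the data into regions and makes a separate prediction in each one, it can capture non-linear patterns. It also works for both classification (predicting a category) and regression (predicting a number).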
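Here's a rough sketch of the non-linearity point, using made-up data where the target jumps from 10 to 50 halfway through. It compares a least-squares straight line with the smallest possible regression tree: a single split (a "stump") chosen to minimise squared error. The function names and the data are invented for this illustration, and real tree learners are more elaborate.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(xs, ys):
    """Try every split point and keep the one with the lowest squared error."""
    best = None
    for t in sorted(set(xs))[1:]:  # skip the smallest x so the left side is never empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        err = sse(left) + sse(right)
        if best is None or err < best[0]:
            best = (err, t, mean(left), mean(right))
    return best[1:]  # (threshold, prediction on the left, prediction on the right)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x < threshold else right_value


def fit_line(xs, ys):
    """Ordinary least-squares line, returned as (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Made-up data with a sudden jump: a clearly non-linear relationship
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10, 10, 10, 10, 50, 50, 50, 50]

stump = fit_stump(xs, ys)
slope, intercept = fit_line(xs, ys)

stump_error = sum((predict_stump(stump, x) - y) ** 2 for x, y in zip(xs, ys))
line_error = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
print(stump, stump_error, round(line_error, 1))
```

On this data the stump splits at x = 5 and fits every point, while the straight line cannot follow the jump and is left with a large error.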
What are ensemble models?
While individual decision trees can be used as predictive models, they're more powerful when a number of them are combined. These are called ensemble models. In simple terms, a bunch of weak learners come together to form a powerful learner💪.
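Decision trees serve as the base models for more advanced tree-based ensemble techniques such as bagging (training many trees on random resamples of the data and combining their votes) and boosting (training trees one after another, each one focusing on the errors of the trees before it).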
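As a rough illustration of the bagging idea, the sketch below trains 25 one-split trees ("stumps"), each on a random resample of some made-up data, and lets them vote. Every name here (`fit_stump`, `fit_bagging`, `predict_bagging`), the data, and the "weekly exercise hours" feature are assumptions for this example. It is not how any particular library implements bagging, and it does not cover boosting.

```python
import random
from collections import Counter


def majority(labels, default=None):
    """Most common label, or `default` when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(rows):
    """Weak learner: a one-split tree on (x, label) rows with the fewest mistakes."""
    overall = majority([label for _, label in rows])
    best = None
    for threshold in sorted({x for x, _ in rows}):
        left = [label for x, label in rows if x < threshold]
        right = [label for x, label in rows if x >= threshold]
        left_label = majority(left, overall)
        right_label = majority(right, overall)
        mistakes = (sum(label != left_label for label in left)
                    + sum(label != right_label for label in right))
        if best is None or mistakes < best[0]:
            best = (mistakes, threshold, left_label, right_label)
    return best[1:]  # (threshold, label on the left, label on the right)


def predict_stump(stump, x):
    threshold, left_label, right_label = stump
    return left_label if x < threshold else right_label


def fit_bagging(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(rows, k=len(rows))) for _ in range(n_trees)]


def predict_bagging(stumps, x):
    """Combine the weak learners by majority vote."""
    return majority([predict_stump(stump, x) for stump in stumps])


# Made-up data: (weekly exercise hours, label)
rows = [(0, "unfit"), (1, "unfit"), (2, "unfit"), (3, "unfit"),
        (4, "fit"), (5, "fit"), (6, "fit"), (7, "fit")]

ensemble = fit_bagging(rows)
print(predict_bagging(ensemble, 0), predict_bagging(ensemble, 7))
```

Each stump sees slightly different data, so the stumps can disagree near the boundary. The vote smooths out those individual quirks.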
Why should you use decision tree-based models?
The major selling point of decision trees is their interpretability.
Whereas the predictions of the most complicated deep learning algorithms leave practitioners scratching their heads 🤔, understanding why a decision tree predicts what it does is a breeze ✌!
This is especially handy when you have to explain your decisions based on the model to a business team.
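To show what that kind of explanation can look like, here's a small sketch that returns not just a decision but also the list of questions answered along the way. The tree is a hand-written stand-in for the fitness example: the age threshold of 30, most of the branches, and the `explain` helper are assumptions, not output from a trained model.

```python
# Hand-written stand-in tree; thresholds and branches are assumed for illustration.
TREE = {
    "question": "age < 30?",
    "test": lambda person: person["age"] < 30,
    "yes": {
        "question": "eats a lot of pizza?",
        "test": lambda person: person["eats_lots_of_pizza"],
        "yes": {"decision": "unfit"},
        "no": {"decision": "fit"},
    },
    "no": {
        "question": "exercises in the morning?",
        "test": lambda person: person["exercises_in_morning"],
        "yes": {"decision": "fit"},
        "no": {"decision": "unfit"},
    },
}


def explain(tree, person):
    """Walk the tree and record every question together with its answer."""
    steps = []
    node = tree
    while "decision" not in node:
        answer = node["test"](person)
        steps.append(f"{node['question']} -> {'yes' if answer else 'no'}")
        node = node["yes"] if answer else node["no"]
    return node["decision"], steps


person = {"age": 23, "eats_lots_of_pizza": False, "exercises_in_morning": True}
decision, steps = explain(TREE, person)
print(decision)
for step in steps:
    print(" ", step)
```

The recorded steps read like a short justification ("age < 30? -> yes", "eats a lot of pizza? -> no"), which is the sort of thing you can show to a business team.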