PRUNING in Decision Trees

Need of Pruning is to reduce overfitting of the Decision tree and make a happy place for test data. Let’s see how we can do this.

Pruning can be done in two ways :

Pre Pruning

  • Minimum no. of sample present in nodes
  • Maximum Depth
  • Maximum no. of nodes
  • Preset Gini Index, Information gain is fixed, which if violated the tree isn’t split further

Post Pruning

  • Reduced Error Pruning :
    1. Holdout some instances from training data
    2. Calculate misclassification for each of holdout set using the decision tree created
    3. Pruning is done if parent node has errors lesser than child node
  • Cost Complexity or Weakest Link Pruning:
  1. After the full grown tree, we make trees out of it by pruning at different levels such that we have tree rolled up to the level of root node also.
  2. We calculate misclassification rate(or Sum of Square residuals for Regression Tree) from test data.
  3. Now that we have misclassification(or SSR) for each tree, we add a penalty term to each tree which is penalty coefficient(alpha) times total number of leaves.

C(T) is Cost Complexity Function parametrized by alpha, R(T) is loss function considered(Misclassification Rate or Sum of Square of Residuals), |T| is no. of leaf nodes.

Reason for adding |T|: The most grown tree will most of the times provide lower R(T) as compared to say tree with only root node but that is simply due to large splits.

4. Our task is to find the tree with least cost-complexity function. We do this iteratively by removing subtree which minimizes the reduction in Cost complexity function each time staring from the fully grown tree..

5. Choose alpha by K- cross validation method. Increase of alpha moves our choice of tree to just root node. Average alpha is the final alpha considered for selecting the desired pruned tree.

Just some additional points

  • Both are Regularization methods in Decision Trees.
  • Pre pruning is faster then Post pruning
  • Pre pruning goes top to bottom, while post pruning goes bottom up approach


  1. Ranji Raj
  2. StatQuest

Problem Solver, Data Science, Actuarial Science, Knowledge Sharer, Hardcore Googler