How to Balance Simplicity and Complexity in Machine Learning

A tutorial on bias-variance tradeoff

Benjamin Obi Tayo Ph.D.
5 min read · Dec 11, 2021
Photo by Vicky Sim on Unsplash

In statistics and machine learning, the bias-variance tradeoff is the property of a set of predictive models whereby models with a lower bias in parameter estimation have a higher variance of the parameter estimates across samples and vice versa. The bias-variance dilemma or problem is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set:

  • The bias is an error from erroneous assumptions in the learning algorithm. High bias (high simplicity) can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
  • The variance is an error from sensitivity to small fluctuations in the training set. High variance (high complexity) can cause an algorithm to model the random noise in the training data, rather than the intended outputs (overfitting).
Figure 1. Illustrating the bias-variance problem. Image from author.
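
To make the tradeoff concrete, here is a minimal sketch that estimates both sources of error numerically. It is an illustration under stated assumptions, not the procedure behind Figure 1: the ground truth sin(2πx), the Gaussian noise level, the sample sizes, and the function names true_function and bias_variance are all hypothetical choices. The idea is to draw many training sets, fit the same polynomial model to each, and then compare the average prediction with the truth (squared bias) and measure the spread of the predictions around that average (variance).

```python
import numpy as np


def true_function(x):
    # Hypothetical ground truth, chosen only for illustration.
    return np.sin(2 * np.pi * x)


def bias_variance(degree, n_datasets=200, n_train=30, noise_std=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit of a given degree."""
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.05, 0.95, 50)
    predictions = np.empty((n_datasets, x_test.size))

    for i in range(n_datasets):
        # Draw a fresh noisy training set from the same underlying process.
        x_train = rng.uniform(0.0, 1.0, n_train)
        y_train = true_function(x_train) + rng.normal(0.0, noise_std, n_train)

        # Fit a polynomial of the requested degree and predict on the test grid.
        coeffs = np.polyfit(x_train, y_train, degree)
        predictions[i] = np.polyval(coeffs, x_test)

    # Squared bias: gap between the average prediction and the truth.
    mean_prediction = predictions.mean(axis=0)
    bias_sq = np.mean((mean_prediction - true_function(x_test)) ** 2)

    # Variance: how much predictions change from one training set to another.
    variance = np.mean(predictions.var(axis=0))
    return bias_sq, variance


if __name__ == "__main__":
    for degree in (1, 4, 9):
        b, v = bias_variance(degree)
        print(f"degree={degree}: bias^2={b:.4f}, variance={v:.4f}")
```

Under these assumptions, the degree-1 model should show the larger squared bias (underfitting), while the degree-9 model should show the larger variance (overfitting). An intermediate degree typically sits between the two.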

3 Reasons why a simple model is preferred over a complex model

  1. Prevents Overfitting: A…
