
Math for Data Science Beginners: Probability and Statistics

Mathematics remains a major hindrance for beginners trying to get into data science

6 min read · Aug 17, 2021

Probability distribution of male and female heights. Image by Benjamin O. Tayo

Mathematics remains a major hindrance for beginners trying to get into data science, and many of them worry about the math requirements. Data science is a highly quantitative field, and advanced work in it draws on advanced mathematics. But to get started, you only need to master a few math topics. In this series of articles, we discuss the essential math topics to review before embarking on a data science journey. The topics to be covered in the series are:

This article will focus on Statistics and Probability. Please see links above for other articles in the series.

Statistics and Probability

Statistics and probability are used for visualization of features, data preprocessing, feature transformation, data imputation, dimensionality reduction, feature engineering, model evaluation, and more. This article focuses on the fundamental statistics and probability concepts for beginners in the field, namely: mean or expectation value, variance and standard deviation, confidence interval, central limit theorem, correlation and covariance, probability distribution, and Bayes’ theorem.

1) Mean or Expectation Value

Let X be a random variable with N observations x_1, x_2, …, x_N. Then the mean value of X is given by

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The mean or expectation value is a measure of central tendency.
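As a small illustration of the definition above, the sketch below computes the mean of a handful of made-up height values, first directly from the formula and then with Python's standard `statistics` module. The list `heights_cm` and its values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical observations (heights in cm); made-up values for illustration only
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]

# Mean straight from the definition: sum of the observations divided by N
n = len(heights_cm)
mean_by_hand = sum(heights_cm) / n

# The same quantity using the standard library
mean_stdlib = statistics.mean(heights_cm)

print(mean_by_hand)
print(mean_stdlib)
```

Both approaches should agree; in practice a library function is usually preferable because it also validates the input, for example by rejecting an empty list.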

2) Variance and Standard Deviation

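Let X be a random variable with N observations x_1, x_2, …, x_N and mean μ. Then the variance of X is given by:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

(When the N observations are a sample drawn from a larger population, the sum is commonly divided by N − 1 instead of N.)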

The standard deviation is the square root of the variance and is a measure of uncertainty or volatility.
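To make this concrete, here is a minimal sketch that computes the variance and standard deviation of a few made-up height values. Because the choice of divisor matters, it shows both common conventions: dividing by N (population variance) and dividing by N − 1 (sample variance). The list `heights_cm` is hypothetical, and which convention applies to a given dataset is an assumption you need to make explicit.

```python
import math
import statistics

# Hypothetical observations (heights in cm); made-up values for illustration only
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]

n = len(heights_cm)
mu = sum(heights_cm) / n

# Squared deviations of each observation from the mean
squared_devs = [(x - mu) ** 2 for x in heights_cm]

# Population convention: divide by N
population_variance = sum(squared_devs) / n
# Sample convention: divide by N - 1
sample_variance = sum(squared_devs) / (n - 1)

# The standard deviation is the square root of the variance
population_std = math.sqrt(population_variance)
sample_std = math.sqrt(sample_variance)

# Cross-check against the standard library
assert math.isclose(population_variance, statistics.pvariance(heights_cm))
assert math.isclose(sample_variance, statistics.variance(heights_cm))

print(population_variance, population_std)
print(sample_variance, sample_std)
```

The standard deviation is often easier to interpret than the variance because it is expressed in the same units as the data (here, centimeters).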


Written by Benjamin Obi Tayo Ph.D.
