# Math for Data Science Beginners: Probability and Statistics

## Mathematics remains a major hindrance for beginners trying to get into data science

Probability distribution of male and female heights. Image by Benjamin O. Tayo

Mathematics remains a major hindrance for beginners trying to get into data science. Most beginners interested in the field are concerned about the math requirements. Data science is a highly quantitative field that draws on advanced mathematics, but to get started you only need to master a few math topics. In this series of articles, we will discuss the essential math topics to review before embarking on a data science journey. The topics to be covered in the series are:

# Statistics and Probability

Statistics and probability are used for feature visualization, data preprocessing, feature transformation, data imputation, dimensionality reduction, feature engineering, model evaluation, and more. This article focuses on the fundamental statistics and probability concepts for beginners in the field, namely: Mean or Expectation Value, Variance and Standard Deviation, Confidence Interval, Central Limit Theorem, Correlation and Covariance, Probability Distribution, and Bayes’ Theorem.

## 1) Mean or Expectation Value

Let X be a random variable with N observations $x_1, x_2, \dots, x_N$. Then the mean value of X is given by

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The mean or expectation value is a measure of central tendency.
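As a minimal sketch of the mean, the snippet below sums the observations and divides by their count, then compares the result with Python's standard `statistics.fmean`. The `heights` values are made up purely for illustration and are not taken from the article's data.

```python
import statistics

# Hypothetical observations (illustrative values only, in cm)
heights = [160.0, 165.0, 170.0, 175.0, 180.0]

# Mean computed by hand: sum of the observations divided by N
n = len(heights)
mean_manual = sum(heights) / n

# Same quantity from the standard library
mean_library = statistics.fmean(heights)

print(f"manual mean:  {mean_manual}")
print(f"library mean: {mean_library}")
```

Both approaches should agree; the manual version is shown only to make the arithmetic explicit.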

## 2) Variance and Standard Deviation

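Let X be a random variable with N observations $x_1, x_2, \dots, x_N$. Then the variance of X is given by:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2$$

where $\mu$ is the mean of X.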

The standard deviation is the square root of the variance. It measures how widely the observations are spread around the mean and is often read as a measure of uncertainty or volatility.
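The following sketch computes the variance and standard deviation by hand, assuming the population form of the variance (dividing by N). The `data` values are made up for illustration. For comparison, it also calls `statistics.variance`, which divides by N - 1 (the sample variance) and therefore gives a slightly larger number on the same data.

```python
import math
import statistics

# Hypothetical observations (illustrative values only)
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mu = sum(data) / n  # mean of the observations

# Population variance: average squared deviation from the mean
variance = sum((x - mu) ** 2 for x in data) / n

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

# Sample variance from the standard library (divides by N - 1)
sample_variance = statistics.variance(data)

print(f"mean:                {mu}")
print(f"population variance: {variance}")
print(f"standard deviation:  {std_dev}")
print(f"sample variance:     {sample_variance:.4f}")
```

Which divisor to use depends on whether the observations are treated as the whole population or as a sample drawn from it; for large N the two values are nearly identical.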

## 3) Confidence Interval


Dr. Tayo is a data science educator, tutor, coach, mentor, and consultant. Contact me for more information about our services and pricing: benjaminobi@gmail.com