Essential Statistics for Data Science

Learn basic statistical concepts used in data science and machine learning

Benjamin Obi Tayo Ph.D.
4 min read · Oct 20, 2023
Image by Unsplash

About the Author

Benjamin O. Tayo is a data science educator, tutor, coach, mentor, and consultant. Contact him for more information about services and pricing: benjaminobi@gmail.com

Dr. Tayo has written close to 300 articles and tutorials on data science to educate the general public. Support Dr. Tayo’s educational mission using the links below:

PayPal: https://www.paypal.me/BenjaminTayo

CashApp: https://cash.app/$BenjaminTayo

INTRODUCTION

Statistical concepts are used widely to extract useful information from data. This article will review essential statistical concepts applicable in data science and machine learning.

Probability Distribution

A probability distribution describes how the values of a feature are spread over their possible range, including how they cluster around the mean. Using the iris dataset, the probability distributions of the sepal length, sepal width, petal length, and petal width can be generated using the code below.

import numpy as np

import matplotlib.pyplot as plt

from sklearn import datasets

import seaborn as…
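The listing above is cut off in this copy, so the following is only a minimal sketch of one way to produce the four distributions, not the author's code. It assumes a histogram-based density estimate drawn with matplotlib in place of whatever seaborn call the original used. The helper names empirical_density and plot_distributions, and the choice of 20 bins, are hypothetical and chosen for illustration.

```python
import numpy as np
from sklearn import datasets

# Load the iris data: 150 samples x 4 features, all measured in cm.
iris = datasets.load_iris()
X = iris.data
feature_names = iris.feature_names


def empirical_density(values, bins=20):
    """Histogram-based density estimate (hypothetical helper).

    Returns bin centers, density per bin, and bin edges. With
    density=True the bar areas sum to 1, so the result can be read
    as an approximate probability density.
    """
    density, edges = np.histogram(values, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density, edges


def plot_distributions(X, feature_names, bins=20):
    """Draw one normalized histogram per feature in a 2 x 2 grid."""
    # Imported here so the numeric helper above works without matplotlib.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(2, 2, figsize=(8, 6))
    for ax, column, name in zip(axes.ravel(), X.T, feature_names):
        ax.hist(column, bins=bins, density=True, alpha=0.7)
        # Mark the mean so the spread around it is easy to see.
        ax.axvline(column.mean(), color="k", linestyle="--", label="mean")
        ax.set_title(name)
        ax.set_ylabel("density")
        ax.legend()
    fig.tight_layout()
    return fig
```

Calling plot_distributions(X, feature_names) and then matplotlib's plt.show() (or fig.savefig with a file name) displays the four panels. A kernel density estimate would give smoother curves than a histogram; the histogram is used here only because it keeps the sketch short.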

