5-Minute Tutorial on How to Compute and Visualize the Covariance Matrix

A tutorial on how to compute and visualize the covariance matrix using the Python seaborn package

Benjamin Obi Tayo Ph.D.
3 min read · Dec 21, 2021
Image by author.

The covariance matrix is one of the most important matrices in data science and machine learning. Computed on standardized features, it gives the correlation coefficients between the features in a dataset, which makes it very useful for feature selection and dimensionality reduction. Plotting the covariance matrix displays these coefficients visually.
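The link between covariance and correlation can be checked numerically. The sketch below is a minimal illustration on synthetic data (the random features are an assumption and are not the cruise ship data): after standardizing each feature, the covariance matrix matches the matrix of Pearson correlation coefficients.

```python
import numpy as np

# Synthetic data for illustration only: 200 samples, 3 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 1] = 2.0 * X[:, 0] + 0.5 * X[:, 1]  # make feature 1 depend on feature 0

# Standardize each column: zero mean, unit sample standard deviation
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Covariance of the standardized features (columns are variables)
cov_std = np.cov(X_std, rowvar=False)

# Pearson correlation coefficients of the raw features
corr = np.corrcoef(X, rowvar=False)

print(np.allclose(cov_std, corr))
```

Without the standardization step, the covariance values depend on the units of each feature and are not directly comparable across feature pairs.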

In this tutorial, we illustrate how the covariance matrix can be computed and visualized using the cruise ship dataset cruise_ship_info.csv. We also demonstrate how the resultant covariance matrix plot can then be used for feature selection and dimensionality reduction.
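One way to carry out these two steps is sketched below, using pandas, NumPy, and a seaborn heatmap. The helper names `covariance_matrix` and `plot_covariance_matrix` are hypothetical, and the column names (including a `crew` target column) are assumed to match the feature names listed in this tutorial; they may differ in the actual file. So that the snippet runs on its own, it uses random placeholder data in place of cruise_ship_info.csv.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Assumed column names; adjust them to match the actual CSV header
COLS = ["age", "tonnage", "passengers", "length",
        "cabins", "passenger_density", "crew"]

def covariance_matrix(df, cols=COLS):
    """Return the covariance matrix of the standardized columns."""
    X = df[cols].to_numpy(dtype=float)
    X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return pd.DataFrame(np.cov(X_std, rowvar=False), index=cols, columns=cols)

def plot_covariance_matrix(cov):
    """Draw the matrix as an annotated heatmap and return the axes."""
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(cov, annot=True, fmt=".2f", square=True, cbar=True, ax=ax)
    ax.set_title("Covariance matrix of standardized features")
    fig.tight_layout()
    return ax

# With the real data, one would load: df = pd.read_csv("cruise_ship_info.csv")
# Placeholder data (random numbers, NOT the cruise ship dataset):
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, len(COLS))), columns=COLS)

cov = covariance_matrix(df)
ax = plot_covariance_matrix(cov)
# plt.show()  # uncomment to display the figure interactively
```

Each cell of the heatmap shows the coefficient for one pair of columns, so the row for the target variable can be read off directly when comparing candidate features.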

Suppose we want to build a regression model to predict cruise ship crew size based on the following features: [‘age’, ‘tonnage’, ‘passengers’, ‘length’, ‘cabins’, ‘passenger_density’]. Our model can be expressed as:

$$\hat{y} = Xw$$

where X is the feature matrix and w is the vector of weights to be learned during training. The question we would like to address is the following:
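To make the roles of X and w concrete, here is a minimal sketch that assumes the model is linear in the weights (predictions of the form Xw) and that the weights are fitted by ordinary least squares. The source does not specify the training method, so that choice is an assumption, and the data are synthetic and noise-free rather than the cruise ship data.

```python
import numpy as np

# Synthetic, noise-free data for illustration only: 50 samples, 6 features
rng = np.random.default_rng(0)
n_samples, n_features = 50, 6
X = rng.normal(size=(n_samples, n_features))

# Hypothetical "true" weights used to generate the target
w_true = np.array([0.5, -1.0, 2.0, 0.0, 1.5, -0.3])
y = X @ w_true

# Learn the weights by ordinary least squares
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predictions from the fitted model
y_pred = X @ w_hat
print(np.round(w_hat, 3))
```

Because the synthetic target contains no noise, the fitted weights recover the generating weights up to numerical precision. With real data, the fit would only approximate the underlying relationship.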

Out of the 6 features [‘age’, ‘tonnage’, ‘passengers’, ‘length’, ‘cabins’, ‘passenger_density’], which…


Written by Benjamin Obi Tayo Ph.D.

Dr. Tayo is a data science educator, tutor, coach, mentor, and consultant. Contact me for more information about our services and pricing: benjaminobi@gmail.com
