Math for Data Science Beginners: Linear Regression

Mathematics remains a major hindrance for beginners trying to get into data science

Benjamin Obi Tayo Ph.D.


Many beginners interested in data science are concerned about the math requirements. Data science is a highly quantitative field that draws on advanced mathematics, but to get started you only need to master a few math topics. In this series of articles, we discuss the essential math topics to review before embarking on a data science journey. The topics to be covered in the series are:


Linear Regression (Continuous Target Variable)

Regression models are among the most popular machine learning models. They are used to predict target variables on a continuous scale, and they find applications in almost every field of study. This article discusses the basics of linear regression and is intended for beginners in the field of data science.
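To make "predicting a target variable on a continuous scale" concrete, here is a minimal sketch of simple linear regression using NumPy. The data are synthetic and invented purely for illustration (they are not the dataset used in this article), and the true slope and intercept are assumptions chosen so the fit can be checked by eye.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2.5 * x + 1.0 plus noise.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.5 * x + 1.0 + rng.normal(0.0, 0.5, size=200)

# Fit a straight line (degree-1 polynomial) by ordinary least squares.
slope, intercept = np.polyfit(x, y, deg=1)

# Predict the continuous target for a new input value.
x_new = 4.0
y_pred = slope * x_new + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={y_pred:.2f}")
```

The fitted slope and intercept should land close to the values used to generate the data, which is a quick sanity check that the fit behaves as expected.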

Using the cruise ship dataset cruise_ship_info.csv, we will demonstrate simple and multiple regression analysis using NumPy, Pylab, and scikit-learn. Because this is an introductory tutorial, no distinction is made between inliers and outliers (outliers can be handled with more robust methods such as RANSAC regression).
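As a rough sketch of what a multiple regression workflow with scikit-learn can look like, the example below fits a model with two predictors. It uses synthetic data rather than cruise_ship_info.csv, so the features, coefficients, and noise level are all assumptions made for illustration. It also shows scikit-learn's RANSACRegressor as one possible robust alternative, although this tutorial itself does not separate inliers from outliers.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RANSACRegressor
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption): two predictors and one continuous target.
rng = np.random.default_rng(seed=1)
X = rng.uniform(0.0, 10.0, size=(300, 2))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + 2.0 + rng.normal(0.0, 0.3, size=300)

# Hold out 30% of the samples to evaluate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Ordinary least squares multiple regression.
model = LinearRegression().fit(X_train, y_train)
r2_test = model.score(X_test, y_test)  # R^2 on the held-out data

# Optional robust alternative: RANSAC fits the model on an inlier subset.
ransac = RANSACRegressor(LinearRegression(), random_state=0).fit(X_train, y_train)

print("OLS coefficients:   ", model.coef_, "intercept:", model.intercept_)
print("RANSAC coefficients:", ransac.estimator_.coef_)
print(f"OLS test R^2: {r2_test:.3f}")
```

On clean data like this, the two fits should give similar coefficients. The robust fit is expected to matter mainly when the data contain outliers.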

Data Analysis and Exploration



Dr. Tayo is a data science educator, tutor, coach, mentor, and consultant. Contact me for more information about our services and pricing: benjaminobi@gmail.com