# Data Scaling for Beginners

## How to scale your data to render it suitable for model building

Benjamin O. Tayo is a data science educator, tutor, coach, mentor, and consultant. Contact me for more information about our services and pricing: benjaminobi@gmail.com

Dr. Tayo has written close to 300 articles and tutorials in data science for educating the general public. Support Dr. Tayo’s educational mission using the links below:

CashApp: https://cash.app/\$BenjaminTayo

# Introduction

In the machine learning process, data scaling falls under data preprocessing, or feature engineering. Scaling your data before using it for model building can accomplish the following:

• Scaling ensures that features have values in the same range
• Scaling ensures that the features used in model building are dimensionless
• Scaling can help detect outliers, since extreme values stand out once every feature is on a common scale

There are several methods for scaling data. The two most important scaling techniques are Normalization and Standardization.
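To make the outlier point concrete, here is a minimal sketch that assumes standardized values (z-scores) and a cutoff of 2 standard deviations. The sample values, the `threshold` name, and the cutoff itself are illustrative assumptions, not part of the original discussion.

```python
from statistics import mean, pstdev

# Hypothetical measurements; the last value is deliberately extreme.
values = [50, 52, 49, 51, 48, 50, 53, 47, 51, 120]

mu = mean(values)
sigma = pstdev(values)  # population standard deviation

# Standardize: how many standard deviations each value lies from the mean.
z_scores = [(v - mu) / sigma for v in values]

# Assumed cutoff; common choices are 2 or 3, depending on the data.
threshold = 2.0
outlier_indices = [i for i, z in enumerate(z_scores) if abs(z) > threshold]

print(outlier_indices)
```

On the raw scale, "120" is only large relative to the other values; after standardization, the same judgment becomes a unit-free comparison against the threshold.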

# Data Scaling Using Normalization

When data is scaled using normalization, the transformed data is calculated using this equation:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

where Xmin and Xmax are the minimum and maximum values of the data, respectively. The scaled data obtained is in the range [0, 1].
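The rule amounts to subtracting the minimum and dividing by the range (Xmax minus Xmin). A minimal pure-Python sketch follows; the function name `min_max_normalize`, the sample `ages` list, and the handling of constant data are assumptions made for illustration.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1]."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # All values are identical, so the formula would divide by zero.
        # Returning zeros is one possible convention (an assumption here).
        return [0.0 for _ in values]
    return [(v - x_min) / span for v in values]


# Hypothetical feature values.
ages = [18, 25, 40, 62]
scaled = min_max_normalize(ages)
print(scaled)
```

The smallest value always maps to 0 and the largest to 1, with everything else placed proportionally in between.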

# Python Implementation of Normalization

Scaling using normalization can be implemented in Python using the code below:

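    from sklearn.preprocessing import MinMaxScaler

    # MinMaxScaler applies the min-max formula to each column (feature).
    # Normalizer is a different transform: it rescales each row to unit norm.
    scaler = MinMaxScaler()
    X_norm = scaler.fit_transform(data)  # data: 2D array of shape (n_samples, n_features)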
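For a fuller picture, the sketch below assumes scikit-learn's `MinMaxScaler`, the transformer that applies the min-max formula to each feature column, and checks its output against the formula computed by hand with NumPy. The `data` array is hypothetical.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: 4 samples (rows) and 2 features (columns).
data = np.array([
    [18.0, 20000.0],
    [25.0, 35000.0],
    [40.0, 50000.0],
    [62.0, 80000.0],
])

scaler = MinMaxScaler()              # default feature_range is (0, 1)
X_norm = scaler.fit_transform(data)  # learns each column's min and max, then scales

# Same result computed directly from the min-max formula, column by column.
manual = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))

print(X_norm)
print(np.allclose(X_norm, manual))
```

Note that scikit-learn's `Normalizer` does something different: it rescales each sample (row) to unit norm, so it does not generally map a feature's minimum to 0 and maximum to 1.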

Let X be a dataset with Xmin = 17.7 and Xmax = 71.4. The data X is shown in the figure below:

The normalized X is shown in the figure below:
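With the stated Xmin = 17.7 and Xmax = 71.4, the normalization rule can be written out explicitly. The midpoint value 44.55 is a hypothetical input chosen for illustration; it is not taken from the data in the figure.

$$
X_{norm} = \frac{X - 17.7}{71.4 - 17.7} = \frac{X - 17.7}{53.7}
$$

$$
X = 17.7 \mapsto 0, \qquad X = 44.55 \mapsto \frac{26.85}{53.7} = 0.5, \qquad X = 71.4 \mapsto 1
$$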
