Data Visualization — Scatter Plot

Learn the basics of data visualization using scatter plots

Image by Author

About the Author

Benjamin O. Tayo is a data science educator, tutor, coach, mentor, and consultant. Contact me for more information about our services and pricing: benjaminobi@gmail.com

Dr. Tayo has written close to 300 articles and tutorials in data science for educating the general public. Support Dr. Tayo’s educational mission using the links below:

PayPal: https://www.paypal.me/BenjaminTayo

CashApp: https://cash.app/$BenjaminTayo

INTRODUCTION

After reading this article, the reader will be able to:

  • Define a scatter plot
  • Generate a scatter plot using Python
  • Interpret a scatter plot


A scatter plot is one of the most widely used types of data visualization in data science and machine learning. It is a simple two-dimensional plot in which each observation is drawn as a single point, with the x-axis representing the independent variable and the y-axis representing the dependent variable.

From a scatter plot, one can judge whether there appears to be a relationship between the independent variable x and the dependent variable y. For example, if y tends to increase as x increases, then x and y are said to be positively correlated; if y tends to decrease as x increases, they are negatively correlated.
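To make the idea of positive and negative correlation concrete, here is a minimal sketch that uses synthetic data, not the stock dataset discussed in this article. The slope, intercept, noise level, and random seed are arbitrary assumptions chosen for illustration. It computes the Pearson correlation coefficient with NumPy's `np.corrcoef`.

```python
import numpy as np

# Synthetic data: every number here is an arbitrary choice for illustration.
rng = np.random.default_rng(seed=0)
x = np.linspace(0, 10, 50)
noise = rng.normal(scale=0.5, size=x.size)

y_up = 2.0 * x + 1.0 + noise     # y tends to increase as x increases
y_down = -2.0 * x + 1.0 + noise  # y tends to decrease as x increases

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry
# is the Pearson correlation coefficient between the two inputs.
r_up = np.corrcoef(x, y_up)[0, 1]
r_down = np.corrcoef(x, y_down)[0, 1]

print(f"increasing trend: r = {r_up:.3f}")
print(f"decreasing trend: r = {r_down:.3f}")
```

A coefficient close to +1 corresponds to points that rise from left to right in a scatter plot, and a coefficient close to -1 to points that fall from left to right. Values near 0 suggest no clear linear trend.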

Python Implementation of a Scatter Plot

As an illustration, we will generate a scatter plot for the stock prices of some technology companies.

# import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# read dataset
url = 'https://raw.githubusercontent.com/bot13956/datasets/master/tech-stocks-04-2021.csv'
data = pd.read_csv(url)

# example 1: scatter plot for Apple and Tesla stock prices
x =…
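As a rough sketch of how two price columns of a DataFrame can be plotted against each other, the example below uses a small synthetic table rather than the dataset loaded above. The column names `AAPL` and `TSLA`, the starting levels, and all values are assumptions made for illustration; they are not taken from the CSV file and are not real prices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for a price table. Column names and values are
# hypothetical, generated as random walks around arbitrary levels.
rng = np.random.default_rng(seed=1)
n_days = 30
prices = pd.DataFrame({
    "AAPL": 130 + np.cumsum(rng.normal(size=n_days)),
    "TSLA": 700 + np.cumsum(rng.normal(scale=5, size=n_days)),
})

# One point per row: x is the first column, y is the second.
fig, ax = plt.subplots(figsize=(6, 4))
points = ax.scatter(prices["AAPL"], prices["TSLA"])
ax.set_xlabel("AAPL price (synthetic)")
ax.set_ylabel("TSLA price (synthetic)")
ax.set_title("Scatter plot of two price series")
```

To view the result, call `fig.savefig("scatter.png")`, or remove the `matplotlib.use("Agg")` line and call `plt.show()` in an interactive session.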