How to frame the right questions to be answered using data

You can’t fully understand your data until you know the right questions to ask of it

Photo by You X Ventures on Unsplash.
  1. Determine the type of problem to be addressed using the data: Depending on your dataset, the problem to be solved could fall under one of the following categories: descriptive analytics, predictive analytics, or prescriptive analytics. These three categories of data science tasks are discussed in greater detail in the sections below.
  2. Understand your limitations as a data scientist: A data scientist might not have domain knowledge about the system of interest. Depending on the organization you work for, you may have to work with a team of engineers (industrial dataset), doctors (healthcare dataset), or other domain experts to decide which features to use as predictors and which as the target in your model. For example, an industrial warehouse system may have sensors that generate real-time data to track operations in the warehouse. As a data scientist, you may not have technical knowledge of such a system, so you would rely on engineers and technicians to guide you on which features are of interest, which are the predictor variables, and which is the target variable. Teamwork is therefore essential for piecing together the different aspects of the project. In my own experience on an industrial data science project, my team worked with system engineers, electrical engineers, mechanical engineers, field engineers, and technicians over a period of 3 months just to understand how to frame the right questions to answer with the available data. Such a multidisciplinary approach to problem-solving is essential in real-world data science projects.

Descriptive Data Analysis

In descriptive analytics, you are interested in studying relationships between features in your dataset. Data visualization plays an essential role here, so you need to decide which type of visualization suits the project at hand: a scatter plot, bar plot, line graph, density plot, heat map, and so on. Some examples of data visualization projects are presented below.

Figure 1. 2016 Market share of electric vehicles in selected countries. Image by Benjamin O. Tayo.
Figure 2. Quantity of advertising emails from Best Buy (BBY), Walgreens (WGN) and Walmart (WMT) in 2018. Image by Benjamin O. Tayo.
Figure 3. 2020 Worldwide number of jobs by skill using the LinkedIn search tool. Image by Benjamin O. Tayo.
Figure 4. Covariance matrix plot showing correlation coefficients between features in the dataset. Image source: Benjamin O. Tayo.
Figure 5. Record temperatures for different months between 2005 and 2014. Image by Benjamin O. Tayo.
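
Before choosing a plot, it can help to quantify how strongly pairs of features are related, in the spirit of the correlation coefficients summarized in Figure 4. The sketch below is only an illustration, not the code behind that figure: the feature names and values are hypothetical toy data, and the helper functions `pearson` and `correlation_matrix` are made up for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(features):
    """Nested dict of pairwise correlations between all named features."""
    names = list(features)
    return {a: {b: pearson(features[a], features[b]) for b in names}
            for a in names}


# Hypothetical toy data (not taken from the article's datasets).
features = {
    "temperature": [20.0, 22.0, 24.0, 26.0, 28.0],
    "energy_use": [30.0, 34.0, 39.0, 43.0, 50.0],
    "idle_time": [9.0, 8.0, 6.0, 5.0, 3.0],
}

matrix = correlation_matrix(features)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Values close to +1 or -1 suggest a strong linear relationship that a scatter plot or heat map would make visible, while values near 0 suggest there is little linear structure to show.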

Predictive Data Analysis

In predictive analysis, the goal is to build a model using available data that can then be used for making predictions on unseen data. The type of model to build depends on the type of target variable. If the target variable is continuous, a regression model such as linear regression can be used; if the target variable is discrete, a classification model can be used. The framework for predictive data analysis is illustrated in Figure 6 below.

Figure 6. Illustrating the Machine Learning Project Workflow. Image by Benjamin O. Tayo.
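
The choice between regression and classification can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: `suggest_task` is a hypothetical helper that uses a rough heuristic (a small number of distinct non-float values is treated as discrete), and `fit_line` is a plain least-squares fit of a single predictor on made-up numbers. A real project would follow the full workflow with proper validation.

```python
def suggest_task(target, max_classes=10):
    """Rough heuristic (an assumption, not a rule): labels or a few
    distinct integer values suggest classification; otherwise regression."""
    distinct = set(target)
    if all(isinstance(v, (str, bool)) for v in distinct):
        return "classification"
    if all(isinstance(v, int) for v in distinct) and len(distinct) <= max_classes:
        return "classification"
    return "regression"


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


# Hypothetical training data with a continuous target.
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [2.1, 3.9, 6.2, 7.8]

print(suggest_task(y_train))              # continuous target
print(suggest_task(["paid", "default"]))  # discrete target

# Fit on the available data, then predict for an unseen input.
slope, intercept = fit_line(x_train, y_train)
x_unseen = 5.0
prediction = slope * x_unseen + intercept
print(round(prediction, 2))
```

The same pattern applies whatever the model: fit on the data you have, then evaluate the fitted model on inputs it has not seen.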

Prescriptive Data Analysis

Sometimes the available data serve only as a sample from which more data can be generated, for example by simulation. The generated data can then be used in prescriptive analysis to recommend a course of action. An example of this is the loan status forecasting problem: Predictive loan status using Monte Carlo simulation.
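
To make the idea of generating more data from a sample concrete, here is a minimal Monte Carlo sketch. It is not the method used in the loan status article mentioned above: it simply resamples a small set of loan outcomes with replacement to estimate a range of plausible default rates, then compares the upper end of that range with a risk threshold. The sample values, the portfolio size, the threshold, and the suggested actions are all hypothetical.

```python
import random


def simulate_default_rates(sample, n_simulations=1000, portfolio_size=500, seed=0):
    """Resample observed outcomes with replacement to simulate many portfolios.

    Returns the simulated default rate of each portfolio.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rates = []
    for _ in range(n_simulations):
        draws = rng.choices(sample, k=portfolio_size)
        rates.append(sum(draws) / portfolio_size)
    return rates


# Hypothetical sample of past loans: 1 = default, 0 = repaid.
sample = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

rates = sorted(simulate_default_rates(sample))
mean_rate = sum(rates) / len(rates)
# Approximate 95% interval taken from the sorted simulated rates.
low = rates[int(0.025 * len(rates))]
high = rates[int(0.975 * len(rates))]

threshold = 0.20  # hypothetical risk tolerance
action = "tighten lending criteria" if high > threshold else "keep current criteria"
print(f"mean={mean_rate:.3f}, interval=({low:.3f}, {high:.3f}), action: {action}")
```

The prescriptive step is the final comparison: the simulated data feed a decision rule that recommends an action rather than only reporting a prediction.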

Physicist, Data Science Educator, Writer. Interests: Data Science, Machine Learning, AI, Python & R, Predictive Analytics, Materials Sciences, Biophysics
