5. Data Exploration

Now that we have our data, the next step is to familiarize ourselves with it.
In this section, we will look at different ways to explore a dataset. We will also cover some visualization techniques and demonstrate how to write more complex SQL queries to retrieve relationships between data points.
What is Data Exploration?
As the name suggests, data exploration means examining the data itself before drawing conclusions from it. Data scientists are often advised to “look to the data.” It sounds basic, but this step is skipped surprisingly often.
This step is often referred to as Exploratory Data Analysis (EDA). Data scientists perform EDA to become familiar with the dataset, uncover patterns, and generate statistical summaries or visualizations that help represent the data more effectively.
Exploring your data helps you:
Get familiar with the structure and contents
Build intuition about trends and distributions
Spot missing values or inconsistencies
Discover issues requiring cleaning or transformation
Taking the time to explore thoroughly sets the foundation for reliable analysis later.
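As a rough illustration of the checks listed above, here is a minimal sketch of a first-pass summary using only the Python standard library. The sample rows, the column names (`city`, `temp_c`), and the `summarize` helper are all hypothetical and are not taken from the dataset used in this book; they only show what counting missing values and computing basic statistics per column might look like.

```python
import statistics

# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"city": "Oslo", "temp_c": 4.0},
    {"city": "Lima", "temp_c": 19.0},
    {"city": None, "temp_c": 22.0},
    {"city": "Pune", "temp_c": None},
]


def summarize(rows):
    """Count present and missing values per column, with basic stats for numeric columns."""
    # Collect every column name that appears in any row (structure check).
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        info = {"present": len(present), "missing": len(values) - len(present)}
        # Treat a column as numeric only if every present value is an int or float.
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        if numeric and len(numeric) == len(present):
            info["min"] = min(numeric)
            info["max"] = max(numeric)
            info["mean"] = statistics.mean(numeric)
        summary[col] = info
    return summary


summary = summarize(rows)
for col, info in summary.items():
    print(col, info)
```

Even a summary this small surfaces the kinds of issues named above: each column here has one missing value, and the minimum, maximum, and mean give a first sense of the numeric column's range. In practice a dataframe library would likely do the same job in fewer lines; the standard-library version is used here only to keep the sketch self-contained.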
This chapter is structured as follows: