Data Visualization in Python

What is data visualization?

Data visualization is the practice of representing data visually, through graphs, charts, icons, presentations and more. It is most commonly used to translate complex data into digestible insights for a non-technical audience.

What is Matplotlib?

Matplotlib is one of the most powerful tools for data visualization in Python. It tries to make easy things easy and hard things possible. You can generate line plots, histograms, power spectra, bar charts, error charts, scatter plots, and more with just a few lines of code.

Installation

You can install Matplotlib from your terminal using:

pip install matplotlib

Importing the library

To get Matplotlib up and running in our environment, we need to import it. By convention, its pyplot module is imported under the alias plt.

import matplotlib.pyplot as plt

Whenever you plot with Matplotlib, the code comes down to two main steps:

  1. Type of graph — this is where you define a bar chart, line chart, etc.
  2. Show the graph — this is to display the graph
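
As a minimal sketch of those two steps, the snippet below draws a line chart and then displays it. The sample numbers and the helper name draw_line_chart are made up for illustration; they are not part of Matplotlib.

```python
import matplotlib.pyplot as plt


def draw_line_chart(x_values, y_values):
    """Step 1: pick the type of graph (here, a line chart)."""
    # plt.plot draws a line chart; plt.bar or plt.scatter would
    # draw a bar chart or a scatter plot instead.
    lines = plt.plot(x_values, y_values)
    plt.xlabel("x")
    plt.ylabel("y")
    plt.title("A simple line chart")
    return lines


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    draw_line_chart([1, 2, 3, 4], [10, 20, 25, 30])
    # Step 2: show the graph.
    plt.show()
```

Swapping plt.plot for another plotting function changes the type of graph, while the plt.show() call stays the same.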

Introduction to Seaborn

Seaborn provides a high-level interface to Matplotlib, a powerful but sometimes unwieldy Python visualization library.

On Seaborn’s official website, they state:

If matplotlib “tries to make easy things easy and hard things possible”, seaborn tries to make a well-defined set of hard things easy too.

We’ve found this to be a pretty good summary of Seaborn’s strengths. In practice, the “well-defined set of hard things” includes:

  • Using default themes that are aesthetically pleasing.
  • Setting custom color palettes.
  • Making attractive statistical plots.
  • Easily and flexibly displaying distributions.
  • Visualizing information from matrices and DataFrames.
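
The first two points can be sketched in a few lines. This is only an illustration, assuming seaborn 0.11 or later (where set_theme is available); the helper name styled_bar_chart, the "whitegrid" style, the "husl" palette and the data are arbitrary choices made for this example.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def styled_bar_chart(categories, values):
    """Draw a bar chart using a seaborn theme and a custom palette."""
    # Apply one of seaborn's built-in themes to every later plot.
    sns.set_theme(style="whitegrid")
    # Build a custom palette with one color per bar.
    palette = sns.color_palette("husl", len(categories))
    fig, ax = plt.subplots()
    # The palette is a list of RGB tuples, so Matplotlib accepts it as colors.
    ax.bar(categories, values, color=palette)
    ax.set_title("Themed bar chart")
    return ax, palette


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    styled_bar_chart(["A", "B", "C"], [3, 7, 5])
    plt.show()
```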

Installation

pip install seaborn

Importing the library

import seaborn as sns

Different categories of plot in Seaborn

Plots are used to visualize the relationships between variables. Those variables can be either purely numerical or categorical, such as a group, class or division. Seaborn divides its plots into the following categories:

  • Relational plots: These plots are used to understand the relationship between two variables.
  • Categorical plots: These plots deal with categorical variables and how they can be visualized.
  • Distribution plots: These plots are used for examining univariate and bivariate distributions.
  • Regression plots: The regression plots in seaborn are primarily intended to add a visual guide that helps to emphasize patterns in a dataset during exploratory data analysis.
  • Matrix plots: These plots display matrix-shaped data, such as a correlation table, as a color-encoded grid (for example, a heatmap).
  • Multi-plot grids: These are useful for drawing multiple instances of the same plot on different subsets of the dataset.

Thank you for reading! I would appreciate any comments, notes, corrections, questions or suggestions — if there’s anything you’d like me to write about, please don’t hesitate to let me know.