Matplotlib: Beyond the basics

Hopefully after this notebook you will:

  • Know how to polish matplotlib figures to the point where they can go to a journal.
  • Understand matplotlib's internal model enough to:
    • know where to look for knobs to fine-tune
    • better understand the help and examples online
    • use it as a development platform for complex visualization


  • A detailed tutorial by Nicolas Rougier, similar in style to the ones we saw for Numpy.
  • The fantastic Python Graph Gallery, which provides a large collection of plots with emphasis on statistical visualizations. It uses Seaborn extensively.
  • In this tutorial we'll focus on "raw" matplotlib, but for a wide variety of statistical visualization tasks, using Seaborn makes life much easier. We'll dive into its tutorial later on.

Matplotlib's main APIs: pyplot and object-oriented

Matplotlib can be used in two main ways:

  • via pyplot calls, as a high-level, MATLAB-like library that automatically manages details like figure creation.

  • via its internal object-oriented structure, which offers full control over all aspects of the figure, at the cost of slightly more verbose calls for the common case.

The pyplot API:

  • Easiest to use.
  • Sufficient for simple and moderately complex plots.
  • Does not offer complete control over all details.

Before we look at our first simple example, we must activate matplotlib support in the notebook:

In [1]:
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np
# a few widely used tools from numpy
from numpy import sin, cos, exp, sqrt, pi, linspace, arange
In [2]:
x = linspace(0, 2 * pi)
y = sin(x)
plt.plot(x, y, label='sin(x)')

# Add one line to that plot
z = cos(x)
plt.plot(x, z, label='cos(x)')

# Make a second figure with a simple plot
plt.figure()
plt.plot(x, sin(2*x), label='sin(2x)')

Here is how to create the same two plots, using explicit management of the figure and axis objects:

In [3]:
f, ax = plt.subplots()  # we manually make a figure and axis
ax.plot(x,y, label='sin(x)')  # the axis object does the plotting
ax.set_title('Harmonic')  # we set the title on the axis
ax.set_xlabel('x')  # same with labels

# Make a second figure with a simple plot.  We can give the figure and its
# axes their own variable names, and then control each separately
f1, ax1 = plt.subplots()
ax1.plot(x, sin(2*x), label='sin(2x)')

# Since we now have variables for each axis, we can add back to the first
# figure even after making the second
ax.plot(x, z, label='cos(x)');

It's important to understand that these objects exist, even if you mostly use the top-level pyplot calls. Much can be accomplished in matplotlib with pyplot plus a little tweaking of the underlying objects. We'll revisit the object-oriented API later.

Important commands to know about, and which matplotlib uses internally a lot:

gcf()  # get current figure
gca()  # get current axis
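
To make the link between pyplot's notion of "current" and the explicit objects concrete, here is a minimal sketch. It assumes nothing beyond pyplot itself; the variable names are illustrative. Every comparison should print True.

```python
import matplotlib.pyplot as plt

# Create two figures; the most recently created one becomes "current"
fig_a, ax_a = plt.subplots()
fig_b, ax_b = plt.subplots()

print(plt.gcf() is fig_b)  # the current figure is the last one created
print(plt.gca() is ax_b)   # and its axes are the current axes

# A pyplot call such as plt.plot draws on the current axes
line, = plt.plot([1, 2, 3])
print(line.axes is ax_b)

# plt.figure(num) makes an existing figure current again
plt.figure(fig_a.number)
print(plt.gca() is ax_a)
```

This is why mixing the two styles works: pyplot calls simply look up the current figure and axes before delegating to them.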

Making subplots

The simplest command is:

f, ax = plt.subplots()

which is equivalent to:

f = plt.figure()
ax = f.add_subplot(111)

By passing arguments to subplots, you can easily create a regular plot grid:

In [4]:
x = linspace(0, 2*pi, 400)
y = sin(x**2)

# Just a figure and one subplot
f, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('Simple plot')

# Two subplots, unpack the output array immediately
f, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot(x, y)
ax2.scatter(x, y)

# Put a figure-level title
f.suptitle('Two plots');
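
For grids with more than one row and column, subplots returns a 2D numpy array of axes. The following sketch shows one way to work with that array; the sharex/sharey options and the plotted functions are illustrative choices, not part of the cell above.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 400)

# A 2x2 grid: subplots returns a 2D numpy array of axes objects
fig, axs = plt.subplots(2, 2, sharex=True, sharey=True)
print(axs.shape)

# .flat iterates over all axes regardless of the grid shape
for k, ax in enumerate(axs.flat, start=1):
    ax.plot(x, np.sin(k * x))
    ax.set_title(f'sin({k}x)')

# Individual axes are reached by [row, column] indexing
axs[1, 0].set_xlabel('x')
```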

And finally, an arbitrarily complex grid can be made with subplot2grid:

In [5]:
f = plt.figure()
ax1 = plt.subplot2grid((3,3), (0,0), colspan=3)
ax2 = plt.subplot2grid((3,3), (1,0), colspan=2)
ax3 = plt.subplot2grid((3,3), (1, 2), rowspan=2)
ax4 = plt.subplot2grid((3,3), (2, 0))
ax5 = plt.subplot2grid((3,3), (2, 1))

# Let's turn off visibility of all tick labels here
for ax in f.axes:
    for t in ax.get_xticklabels() + ax.get_yticklabels():
        t.set_visible(False)

# And add a figure-level title at the top
f.suptitle('subplot2grid')  # placeholder title text

# Plot something at the bottom right
ax3.plot([1, 2, 3])
[<matplotlib.lines.Line2D at 0x1135ec2e8>]
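
The same layout can also be written in the object-oriented style. The sketch below swaps subplot2grid for a GridSpec created with fig.add_gridspec (this assumes matplotlib 3.1 or newer); slices of the grid play the role of colspan and rowspan.

```python
import matplotlib.pyplot as plt

fig = plt.figure()
gs = fig.add_gridspec(3, 3)  # 3x3 grid attached to this figure

ax1 = fig.add_subplot(gs[0, :])    # top row, all three columns
ax2 = fig.add_subplot(gs[1, :2])   # middle row, first two columns
ax3 = fig.add_subplot(gs[1:, 2])   # right column, bottom two rows
ax4 = fig.add_subplot(gs[2, 0])
ax5 = fig.add_subplot(gs[2, 1])

# Plot something at the bottom right, as in the cell above
ax3.plot([1, 2, 3])
```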

Manipulating properties across matplotlib

In matplotlib, most properties for lines, colors, etc, can be set directly in the call:

In [6]:
plt.plot([1,2,3], linestyle='--', color='r')
[<matplotlib.lines.Line2D at 0x1136ed978>]

But for finer control you can get a hold of the returned line object (more on these objects later):

In [1]: line, = plot([1,2,3])

These line objects have a lot of properties you can control. Tab-completing in IPython shows the full list:

In [2]: line.set
line.set                     line.set_drawstyle           line.set_mec
line.set_aa                  line.set_figure              line.set_mew
line.set_agg_filter          line.set_fillstyle           line.set_mfc
line.set_alpha               line.set_gid                 line.set_mfcalt
line.set_animated            line.set_label               line.set_ms
line.set_antialiased         line.set_linestyle           line.set_picker
line.set_axes                line.set_linewidth           line.set_pickradius
line.set_c                   line.set_lod                 line.set_rasterized
line.set_clip_box            line.set_ls                  line.set_snap
line.set_clip_on             line.set_lw                  line.set_solid_capstyle
line.set_clip_path           line.set_marker              line.set_solid_joinstyle
line.set_color               line.set_markeredgecolor     line.set_transform
line.set_contains            line.set_markeredgewidth     line.set_url
line.set_dash_capstyle       line.set_markerfacecolor     line.set_visible
line.set_dashes              line.set_markerfacecoloralt  line.set_xdata
line.set_dash_joinstyle      line.set_markersize          line.set_ydata
line.set_data                line.set_markevery           line.set_zorder

The setp call (short for "set property") is especially useful when working interactively, because it supports introspection: you can learn about the valid calls as you work:

In [7]: line, = plot([1,2,3])

In [8]: setp(line, 'linestyle')
  linestyle: [ ``'-'`` | ``'--'`` | ``'-.'`` | ``':'`` | ``'None'`` | ``' '`` | ``''`` ]         and any drawstyle in combination with a linestyle, e.g. ``'steps--'``.         

In [9]: setp(line)
  agg_filter: unknown
  alpha: float (0.0 transparent through 1.0 opaque)         
  animated: [True | False]         
  antialiased or aa: [True | False]
  ... much more output elided

In the first form, it shows you the valid values for the 'linestyle' property, and in the second it shows you all the acceptable properties you can set on the line object. This makes it very easy to discover how to customize your figures to get the visual results you need.

Furthermore, setp can manipulate multiple objects at a time:

In [7]:
x = linspace(0, 2*pi)
y1 = sin(x)
y2 = sin(2*x)
lines = plt.plot(x, y1, x, y2)

# We will set the width and color of all lines in the figure at once:
plt.setp(lines, linewidth=2, color='r')
[None, None, None, None]
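
To confirm that a setp call took effect, the values can be read back. This sketch uses plt.getp and the get_* methods on the line objects; the data mirrors the cell above.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi)
fig, ax = plt.subplots()
lines = ax.plot(x, np.sin(x), x, np.sin(2 * x))

# Set two properties on both lines in a single call
plt.setp(lines, linewidth=2, color='r')

# Read the values back: getp(obj, name) mirrors setp, and each property
# also has a get_* method on the object itself
widths = [plt.getp(line, 'linewidth') for line in lines]
colors = [line.get_color() for line in lines]
print(widths, colors)
```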

Finally, if you know what properties you want to set on a specific object, a plain set call is typically the simplest form:

In [8]:
line, = plt.plot([1,2,3])
line.set(lw=2, c='red',ls='--')
[None, None, None]

Understanding what matplotlib returns: lines, axes and figures


In a simple plot:

In [9]:
plt.plot([1,2,3])
[<matplotlib.lines.Line2D at 0x1139c15c0>]

The return value of the plot call is a list of lines, which can be manipulated further. Since this plot has a single line, you can capture the line object by unpacking the one-element list (note the trailing comma):

In [10]:
line, = plt.plot([1,2,3])

One line method that is particularly useful to know is set_data:

In [11]:
# Create a plot and hold the line object
line, = plt.plot([1,2,3], label='my data')
plt.title('My title')

# ... later, we may want to modify the x/y data but keeping the rest of the
# figure intact, with our new data:
x = linspace(0, 1)
y = x**2

# This can be done by operating on the line object itself
line.set_data(x, y)

# Now we must set the axis limits manually. Note that we can also use xlim
# and ylim to set the x/y limits separately.
plt.axis([0, 1, 0, 1])

# Note, alternatively this can be done with:
ax = plt.gca()  # get currently active axis object
ax.relim()  # recompute the data limits from the current artists
ax.autoscale_view()  # rescale the view to those limits

# as well as requesting matplotlib to draw
plt.draw()
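
The reason the limits need attention is that set_data replaces the line's data without touching the view. The following sketch, written with the object-oriented calls, shows the x limits before and after asking the axes to rescale; the variable names limits_before and limits_after are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot([1, 2, 3])
print(ax.get_xlim())  # limits computed from the original data

# Swap in new data: the line changes, the view limits do not
x = np.linspace(0, 1)
line.set_data(x, x**2)
limits_before = ax.get_xlim()

# Recompute the data limits from the artists, then rescale the view
ax.relim()
ax.autoscale_view()
limits_after = ax.get_xlim()
print(limits_before, limits_after)
```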

The next important component, axes

The axis call above was used to set the x/y limits of the axis, and in previous examples we called .plot directly on axis objects. The axes object holds much of matplotlib's user-facing functionality:

In [15]: f = plt.figure()

In [16]: ax = f.add_subplot(111)

In [17]: ax.
Display all 299 possibilities? (y or n)
ax.acorr                                 ax.hitlist
ax.add_artist                            ax.hlines
ax.add_callback                          ax.hold
ax.add_collection                        ax.ignore_existing_data_limits
ax.add_line                              ax.images
ax.add_patch                             ax.imshow

... etc.

Many of the commands in plt.<command> are nothing but wrappers around axis calls, with machinery to automatically create a figure and add an axis to it if there wasn't one to begin with. The output of most axis actions that draw something is a collection of lines (or other more complex geometric objects).

Enclosing it all, the figure

The enclosing object is the figure, which holds all axes:

In [17]: f = plt.figure()

In [18]: f.add_subplot(211)
Out[18]: <matplotlib.axes.AxesSubplot object at 0x9d0060c>

In [19]: f.axes
Out[19]: [<matplotlib.axes.AxesSubplot object at 0x9d0060c>]

In [20]: f.add_subplot(212)
Out[20]: <matplotlib.axes.AxesSubplot object at 0x9eacf0c>

In [21]: f.axes
Out[21]:
[<matplotlib.axes.AxesSubplot object at 0x9d0060c>,
 <matplotlib.axes.AxesSubplot object at 0x9eacf0c>]

The basic view of matplotlib is: a figure contains one or more axes, axes draw and return collections of one or more geometric objects (lines, patches, etc).
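
This containment can be walked in both directions. A small sketch, with illustrative variable names, in which every comparison should print True and the count should be zero:

```python
import matplotlib.pyplot as plt

fig = plt.figure()
ax_top = fig.add_subplot(211)
ax_bottom = fig.add_subplot(212)
line, = ax_bottom.plot([1, 2, 3])

# Downwards: figure -> axes -> artists
print(fig.axes == [ax_top, ax_bottom])
print(line in ax_bottom.lines)
print(len(ax_top.lines))  # nothing was drawn on the top axes

# Upwards: every artist knows its axes and its figure
print(line.axes is ax_bottom)
print(line.figure is fig)
```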

For all the gory details on this topic, see the matplotlib artist tutorial.

Anatomy of a common plot

Let's make a simple plot that contains a few commonly used decorations:

In [12]:
f, ax = plt.subplots()

# Three simple polynomials
x = linspace(-1, 1)
y1,y2,y3 = [x**i for i in [1,2,3]]

# Plot each with a label (for a legend)
ax.plot(x, y1, label='linear')
ax.plot(x, y2, label='quadratic')
ax.plot(x, y3, label='cubic')
# Make all lines drawn so far thicker
plt.setp(ax.lines, linewidth=2)

# Add a grid and a legend that doesn't overlap the lines
ax.grid(True)
ax.legend(loc='lower right')

# Add black horizontal and vertical lines through the origin
ax.axhline(0, color='black')
ax.axvline(0, color='black')

# Set main text elements of the plot
ax.set_title('Some polynomials')

Common plot types

Error plots

First, a very simple error plot:

In [13]:
# example data
x = arange(0.1, 4, 0.5)
y = exp(-x)

# example variable error bar values
yerr = 0.1 + 0.2*sqrt(x)
xerr = 0.1 + yerr

# First illustrate basic pyplot interface, using defaults where possible.
plt.errorbar(x, y, xerr=0.2, yerr=0.4)
plt.title("Simplest errorbars, 0.2 in x, 0.4 in y")
Text(0.5,1,'Simplest errorbars, 0.2 in x, 0.4 in y')

Now a more elaborate one, using the OO interface to exercise more features.

In [14]:
# same data/errors as before
x = arange(0.1, 4, 0.5)
y = exp(-x)
yerr = 0.1 + 0.2*sqrt(x)
xerr = 0.1 + yerr

fig, axs = plt.subplots(nrows=2, ncols=2)
ax = axs[0,0]
ax.errorbar(x, y, yerr=yerr, fmt='o')
ax.set_title('Vert. symmetric')

# With 4 subplots, reduce the number of axis ticks to avoid crowding.
ax.locator_params(nbins=4)

ax = axs[0,1]
ax.errorbar(x, y, xerr=xerr, fmt='o')
ax.set_title('Hor. symmetric')

ax = axs[1,0]
ax.errorbar(x, y, yerr=[yerr, 2*yerr], xerr=[xerr, 2*xerr], fmt='--o', label='foo')
ax.set_title('H, V asymmetric')

ax = axs[1,1]
ax.set_yscale('log')
# Here we have to be careful to keep all y values positive:
ylower = np.maximum(1e-2, y - yerr)
yerr_lower = y - ylower

ax.errorbar(x, y, yerr=[yerr_lower, 2*yerr], xerr=xerr,
                             fmt='o', ecolor='g')
ax.set_title('Mixed sym., log y')

# Fix layout to minimize overlap between titles and marks
plt.tight_layout()
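
Asymmetric error bars are passed as a pair of sequences, lower errors first and upper errors second. The sketch below checks this convention by inspecting the artists that errorbar returns; the error sizes in lower and upper are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0.1, 4, 0.5)
y = np.exp(-x)
lower = 0.5 * y                  # illustrative error sizes, not from the text
upper = 0.1 + 0.2 * np.sqrt(x)

fig, ax = plt.subplots()
# Asymmetric errors: a (2, N) sequence, first row below, second row above
container = ax.errorbar(x, y, yerr=[lower, upper], fmt='o')

# The container unpacks into the marker line, the caps and the bar collections
data_line, caplines, barlinecols = container
segments = barlinecols[0].get_segments()
print(segments[0])  # end points of the first bar: (x0, y0 - lower0), (x0, y0 + upper0)
```

Each segment runs from y - lower to y + upper at a fixed x, which is a handy check when the two rows might have been swapped by accident.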

Logarithmic plots

A simple log plot:

In [15]:
x = linspace(-5, 5)
y = exp(-x**2)

f, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(x, y)
ax2.semilogy(x, y)
[<matplotlib.lines.Line2D at 0x1144b9208>]

A more elaborate log plot using 'symlog', that treats a specified range as linear (thus handling values near zero) and symmetrizes negative values:

In [16]:
x = linspace(-50, 50, 100)
y = linspace(0, 100, 100)

# Create the figure and axes
f, (ax1, ax2, ax3) = plt.subplots(3, 1)

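# Symlog on the x axis
ax1.plot(x, y)
ax1.set_xscale('symlog')
ax1.set_ylabel('symlog x')
# Grid for both axes
ax1.grid(True)
# Minor grid on too for x
ax1.xaxis.grid(True, which='minor')

# Symlog on the y axis
ax2.plot(y, x)
ax2.set_yscale('symlog')
ax2.set_ylabel('symlog y')

# Symlog on both
ax3.plot(x, sin(x / 3.0))
ax3.set_xscale('symlog')
# linthresh (matplotlib >= 3.3) sets the half-width of the linear range
# around zero; 0.01 is an assumed value
ax3.set_yscale('symlog', linthresh=0.01)
ax3.set_ylabel('symlog both')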
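
To see what symlog buys us, it helps to compare it against a linear axis on data that spans several decades on both sides of zero. This is a minimal sketch with made-up data (a cubic); it assumes matplotlib 3.3 or newer, where the keyword is spelled linthresh.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-50, 50, 101)
y = x**3  # spans several decades on both sides of zero

fig, (ax_lin, ax_sym) = plt.subplots(2, 1)
ax_lin.plot(x, y)
ax_lin.set_ylabel('linear')

ax_sym.plot(x, y)
# Values within (-linthresh, linthresh) are plotted on a linear scale,
# everything outside it on a logarithmic one
ax_sym.set_yscale('symlog', linthresh=1.0)
ax_sym.set_ylabel('symlog')
```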

Bar plots

In [17]:
N = 5
catMeans = (20, 35, 30, 31, 27)
catStd =   (2, 3, 4, 1, 2)

ind = arange(N)  # the x locations for the groups
width = 0.35       # the width of the bars

fig, ax = plt.subplots()
rects1 = ax.bar(ind, catMeans, width, color='r', yerr=catStd, label='Cats')

dogMeans = (25, 32, 34, 21, 29)
dogStd =   (3, 5, 2, 3, 3)
rects2 = ax.bar(ind + width, dogMeans, width, color='y', yerr=dogStd, label='Dogs')

# add a title, centered group tick labels, and a legend
ax.set_title('Scores by group and species')
ax.set_xticks(ind + width / 2)
ax.set_xticklabels( ('G1', 'G2', 'G3', 'G4', 'G5') )
ax.legend()
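
The object returned by bar is a container of Rectangle patches, one per bar, which is convenient for decorating individual bars. As a sketch (the 0.5 vertical offset is an arbitrary choice), this writes each bar's height just above it:

```python
import numpy as np
import matplotlib.pyplot as plt

means = (20, 35, 30, 31, 27)
ind = np.arange(len(means))
width = 0.35

fig, ax = plt.subplots()
rects = ax.bar(ind, means, width, color='r')

# Each bar is a Rectangle patch; write its height just above its top edge
for rect in rects:
    height = float(rect.get_height())
    ax.text(rect.get_x() + rect.get_width() / 2, height + 0.5, f'{height:g}',
            ha='center', va='bottom')
```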

Scatter plots

The scatter command produces scatter plots with arbitrary markers.

In [18]:
from matplotlib import cm

t = linspace(0.0, 6*pi, 100)
y = exp(-0.1*t)*cos(t)
phase = t % (2*pi)  # parentheses needed: % and * have equal precedence
f, ax = plt.subplots()
ax.scatter(t, y, s=100*abs(y), c=phase, cmap=cm.viridis)
ax.axhline(0, color='k');
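
Since the marker colors encode the phase, a colorbar makes the plot easier to read. A minimal sketch, using the object that scatter returns; the label text is an illustrative choice.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0.0, 6 * np.pi, 100)
y = np.exp(-0.1 * t) * np.cos(t)
phase = t % (2 * np.pi)

fig, ax = plt.subplots()
# scatter returns a PathCollection, which carries the color mapping
points = ax.scatter(t, y, s=100 * np.abs(y), c=phase, cmap='viridis')

# The colorbar is drawn in a new axes that is added to the same figure
cbar = fig.colorbar(points, ax=ax)
cbar.set_label('phase (rad)')
```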


Histograms

Matplotlib has a built-in command for histograms.

In [19]:
# Some normally-distributed data
mu, sigma = 60, 10
x = np.random.normal(mu, sigma, 10000)

# the histogram of the data
# density=True replaces the normed=True keyword, removed in matplotlib 3.1
n, bins, patches = plt.hist(x, bins=50, density=True, facecolor='g', alpha=0.75)

plt.title('Histogram of Test Scores')
plt.text(75, .032, rf'$\mu={mu},\ \sigma={sigma}$')
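
The density normalization means that the bars integrate to one, so the histogram can be compared directly with a probability density. The following sketch checks the area and overlays the normal density the samples were drawn from; the seeded generator is only there to make the run repeatable.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # seeded generator, so the run is repeatable
mu, sigma = 60, 10
x = rng.normal(mu, sigma, 10000)

fig, ax = plt.subplots()
n, bins, patches = ax.hist(x, bins=50, density=True, facecolor='g', alpha=0.75)

# With density=True, bar height times bar width sums to 1 over all bins
area = np.sum(n * np.diff(bins))
print(area)

# Overlay the normal density the samples were drawn from
grid = np.linspace(bins[0], bins[-1], 200)
pdf = np.exp(-0.5 * ((grid - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
ax.plot(grid, pdf, 'k--')
```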

Arbitrary text and LaTeX support

In matplotlib, text can be added either relative to an individual axis object or to the whole figure.

These commands add text to the Axes:

  • title() - add a title
  • xlabel() - add an axis label to the x-axis
  • ylabel() - add an axis label to the y-axis
  • text() - add text at an arbitrary location
  • annotate() - add an annotation, with optional arrow

And these act on the whole figure:

  • figtext() - add text at an arbitrary location
  • suptitle() - add a title

And any text field can contain LaTeX expressions for mathematics, as long as they are enclosed in $ signs.

This example illustrates all of them:

In [20]:
fig = plt.figure()
fig.suptitle('bold figure suptitle', fontsize=14, fontweight='bold')

ax = fig.add_subplot(111)
ax.set_title('axes title')

ax.set_xlabel('xlabel')
ax.set_ylabel('ylabel')

ax.text(3, 8, 'boxed italics text in data coords', style='italic',
        bbox={'facecolor':'red', 'alpha':0.5, 'pad':10})

ax.text(2, 6, r'an equation: $E=mc^2$', fontsize=15)

ax.text(3, 2, 'unicode: Institut für Festkörperphysik')

ax.text(0.95, 0.01, 'colored text in axes coords',
        verticalalignment='bottom', horizontalalignment='right',
        transform=ax.transAxes,
        color='green', fontsize=15)

ax.plot([2], [1], 'o')
ax.annotate('annotate', xy=(2, 1), xytext=(3, 4),
            arrowprops=dict(facecolor='black', shrink=0.05))

ax.axis([0, 10, 0, 10])
[0, 10, 0, 10]
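
Figure-level text works the same way, except that positions are given in figure coordinates. Here is a short sketch using fig.text, the object-oriented counterpart of figtext, together with LaTeX expressions in the title and labels; the strings themselves are placeholders.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

# Raw strings keep backslashes intact for the math parser
ax.set_title(r'Quadratic growth: $y = x^2$')
ax.set_xlabel(r'$x$')
ax.set_ylabel(r'$\alpha x^2$ with $\alpha = 1$')

# Figure-level text uses figure coordinates: (0, 0) bottom left, (1, 1) top right
note = fig.text(0.99, 0.01, 'figure-level note', ha='right', va='bottom',
                fontsize=8)
```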

Statistical plotting - some builtin capabilities

Some statistically-oriented plots to visualize data distributions: boxplots and violin plots (this paper by Hadley Wickham is a good overview of boxplots).

Note that often Seaborn will have a simpler API for rich statistical plots, atop matplotlib's engine. This shows how to do plots of this type without seaborn:

In [21]:
# Random test data
all_data = [np.random.normal(0, std, 100) for std in range(1, 4)]

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(11, 5))

# Box plots
bplots = []
for ax, notch in zip(axes[:2], (False, True)):
    b = ax.boxplot(all_data,
                   notch=notch,   # plain boxes first, then notched
                   vert=True,   # vertical box alignment
                   patch_artist=True)   # fill with color
    bplots.append(b)

axes[0].set_title('box plot')
axes[1].set_title('notched box plot')

# Violin plot
vplot = axes[2].violinplot(all_data)
axes[2].set_title('violin plot')

# fill with colors
colors = ['pink', 'lightblue', 'lightgreen']
for bplot in bplots:
    for patch, color in zip(bplot['boxes'], colors):
        patch.set_facecolor(color)

# adding horizontal grid lines
for i, ax in enumerate(axes):
    ax.yaxis.grid(True)
    ax.set_xticks([y+1 for y in range(len(all_data))])
    if i:
        ax.set_yticklabels([])

# add x-tick labels
plt.setp(axes, xticks=[y+1 for y in range(len(all_data))],
         xticklabels=['x1', 'x2', 'x3']);
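
As with lines and bars, the value returned by boxplot gives access to the individual artists, here as a dict keyed by component name. A small sketch, with a seeded generator for repeatability, that reads the medians back from the drawn lines and compares them with numpy's:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
all_data = [rng.normal(0, std, 100) for std in range(1, 4)]

fig, ax = plt.subplots()
bplot = ax.boxplot(all_data)

# The return value is a dict mapping component names to lists of artists
print(sorted(bplot.keys()))

# Each median is a horizontal Line2D, so its y-data holds the median value
medians = [line.get_ydata()[0] for line in bplot['medians']]
print(np.allclose(medians, [np.median(d) for d in all_data]))
```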