Combining Matplotlib and Pandas
The matplotlib framework integrates well with the popular pandas data processing library.
If you would like to learn more about pandas, check out our other workshop!
For the following examples we will use our usual data:
months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]
water_levels_2010 = [
    5.77, 6.04, 6.52, 6.48, 6.54, 5.92,
    5.64, 5.21, 5.01, 5.18, 5.45, 5.59
]
water_levels_2020 = [
    5.48, 5.82, 6.31, 6.26, 6.09, 5.87,
    5.72, 5.54, 5.22, 4.86, 5.12, 5.40
]
Plotting Pandas Series
A pandas Series can be passed directly to the pyplot.plot(…) function.
from matplotlib import pyplot
from pandas import Series
# … Raw data as above
# We turn it into a series
measurements = Series(
    data=water_levels_2010,
    index=months,
    name="Water levels in 2010"
)
# And plot it
pyplot.plot(measurements, label=measurements.name) # (1)
pyplot.legend()
pyplot.show()
Explanation
- Note how we can make use of the fact that the series has a name to also assign it as a plot label.
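The same pattern extends to several series. The following is only a sketch, assuming the raw data from above: it plots both years into the same figure and lets each series' name supply its legend entry. The variable name all_measurements is chosen here for illustration.

```python
from matplotlib import pyplot
from pandas import Series

# Raw data as above
months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]
water_levels_2010 = [
    5.77, 6.04, 6.52, 6.48, 6.54, 5.92,
    5.64, 5.21, 5.01, 5.18, 5.45, 5.59
]
water_levels_2020 = [
    5.48, 5.82, 6.31, 6.26, 6.09, 5.87,
    5.72, 5.54, 5.22, 4.86, 5.12, 5.40
]

# One named series per year (illustrative variable name)
all_measurements = [
    Series(data=water_levels_2010, index=months, name="Water levels in 2010"),
    Series(data=water_levels_2020, index=months, name="Water levels in 2020"),
]

# Each call adds one line to the current axes, labelled with the series name
for series in all_measurements:
    pyplot.plot(series, label=series.name)

pyplot.legend()
pyplot.show()
```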
Working with DataFrames
Matplotlib can also handle pandas DataFrames as input:
from pandas import DataFrame
from matplotlib import pyplot
# … Raw data as above
measurements = DataFrame(
    data={
        "Water levels in 2010": water_levels_2010,
        "Water levels in 2020": water_levels_2020
    },
    index=months
)
pyplot.plot(measurements)
pyplot.legend(measurements.columns.values)
pyplot.show()
pandas DataFrames also come with their own plotting method, DataFrame.plot, so you could also write it this way:
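A minimal sketch of that variant, assuming the same raw data as above. It relies on the default behaviour of DataFrame.plot(), which draws one line per column and builds the legend from the column names, so no separate pyplot.legend(…) call is needed. The variable name axes is chosen here for illustration.

```python
from matplotlib import pyplot
from pandas import DataFrame

# Raw data as above
months = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
]
water_levels_2010 = [
    5.77, 6.04, 6.52, 6.48, 6.54, 5.92,
    5.64, 5.21, 5.01, 5.18, 5.45, 5.59
]
water_levels_2020 = [
    5.48, 5.82, 6.31, 6.26, 6.09, 5.87,
    5.72, 5.54, 5.22, 4.86, 5.12, 5.40
]

measurements = DataFrame(
    data={
        "Water levels in 2010": water_levels_2010,
        "Water levels in 2020": water_levels_2020
    },
    index=months
)

# DataFrame.plot() returns the matplotlib Axes it drew on
axes = measurements.plot()
pyplot.show()
```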
Isn’t that convenient?
You can find more information on the DataFrame.plot documentation page.