Visualization

We use the standard convention for referencing the matplotlib API:

In [1]: import matplotlib.pyplot as plt

In [2]: plt.close('all')

We provide the basics in pandas to easily create decent-looking plots. See the ecosystem section for visualization libraries that go beyond the basics documented here.

Note

All calls to np.random are seeded with 123456.

Basic plotting: plot

We will demonstrate the basics; see the cookbook for some advanced strategies.

The plot method on Series and DataFrame is just a simple wrapper around plt.plot():

In [3]: ts = pd.Series(np.random.randn(1000),
   ...:                index=pd.date_range('1/1/2000', periods=1000))
   ...: 

In [4]: ts = ts.cumsum()

In [5]: ts.plot()
Out[5]: <matplotlib.axes._subplots.AxesSubplot at 0x7f65d8c0ac50>
../_images/series_plot_basic.png

If the index consists of dates, plot() calls gcf().autofmt_xdate() to try to format the x-axis nicely, as shown above.
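As a minimal, self-contained sketch of the Series example above: the Agg backend, the `rng` generator, and the variable names are assumptions made so the snippet runs without a display; they are not part of the original session. It only checks that a single line with one point per observation ends up on the Axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(123456)  # illustrative seed
ts = pd.Series(rng.standard_normal(1000),
               index=pd.date_range("2000-01-01", periods=1000)).cumsum()

fig, ax = plt.subplots()
ts.plot(ax=ax)  # draw the cumulative series on an explicit Axes

n_lines = len(ax.get_lines())                   # one line for one Series
n_points = len(ax.get_lines()[0].get_ydata())   # one point per observation
plt.close(fig)
```

Passing `ax=` explicitly is a design choice for the sketch; it makes it easy to inspect what was drawn afterwards.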

On DataFrame, plot() is a convenience to plot all of the columns with labels:

In [6]: df = pd.DataFrame(np.random.randn(1000, 4),
   ...:                   index=ts.index, columns=list('ABCD'))
   ...: 

In [7]: df = df.cumsum()

In [8]: plt.figure();

In [9]: df.plot();
../_images/frame_plot_basic.png
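The labelling behaviour described above can be checked with a short sketch. The headless backend, the `rng` generator, and the variable names are assumptions for illustration only. The snippet plots a four-column DataFrame and reads the legend entries back from the Axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(123456)  # illustrative seed
df = pd.DataFrame(rng.standard_normal((1000, 4)),
                  index=pd.date_range("2000-01-01", periods=1000),
                  columns=list("ABCD")).cumsum()

fig, ax = plt.subplots()
df.plot(ax=ax)  # one line per column, labelled with the column name

n_lines = len(ax.get_lines())
labels = [text.get_text() for text in ax.get_legend().get_texts()]
plt.close(fig)
```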

You can plot one column versus another using the x and y keywords in plot():

In [10]: df3 = pd.DataFrame(np.random.randn(1000, 2), columns=['B', 'C']).cumsum()

In [11]: df3['A'] = pd.Series(list(range(len(df))))

In [12]: df3.plot(x='A', y='B')
Out[12]: <matplotlib.axes._subplots.AxesSubplot at 0x7f65d97c1668>
../_images/df_plot_xy.png
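A small sketch of the x and y keywords, under the assumption of a headless backend and illustrative names (`rng`, `x_matches`, `y_matches`): it confirms that the plotted line takes its x values from column A and its y values from column B.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(123456)  # illustrative seed
df3 = pd.DataFrame(rng.standard_normal((1000, 2)), columns=["B", "C"]).cumsum()
df3["A"] = np.arange(len(df3))  # monotonically increasing x values

fig, ax = plt.subplots()
df3.plot(x="A", y="B", ax=ax)  # column A on the x-axis, column B on the y-axis

line = ax.get_lines()[0]
x_matches = np.array_equal(np.asarray(line.get_xdata()), df3["A"].to_numpy())
y_matches = np.allclose(np.asarray(line.get_ydata(), dtype=float), df3["B"].to_numpy())
n_lines = len(ax.get_lines())  # only B is drawn; C is ignored
plt.close(fig)
```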

Note

For more formatting and styling options, see formatting below.

Other plots

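Plotting methods allow for a handful of plot styles other than the default line plot. These styles are selected with the kind keyword argument to plot(), and include 'bar' or 'barh' for bar plots, 'hist' for histograms, 'box' for boxplots, 'kde' or 'density' for density plots, 'area' for area plots, 'scatter' for scatter plots, 'hexbin' for hexagonal bin plots, and 'pie' for pie plots.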

For example, a bar plot can be created the following way:

In [13]: plt.figure();

In [14]: df.iloc[5].plot(kind='bar');
../_images/bar_plot_ex.png

You can also create these other plots using the methods DataFrame.plot.<kind> instead of providing the kind keyword argument. This makes it easier to discover plot methods and the specific arguments they use:

In [15]: df = pd.DataFrame()

In [16]: df.plot.<TAB>  # noqa: E225, E999
df.plot.area     df.plot.barh     df.plot.density  df.plot.hist     df.plot.line     df.plot.scatter
df.plot.bar      df.plot.box      df.plot.hexbin   df.plot.kde      df.plot.pie
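To illustrate that the kind keyword and the accessor methods are two spellings of the same call, here is a hedged sketch. The sample values, the headless backend, and the variable names are assumptions. It draws the same Series both ways and compares the resulting bar heights.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 0.5], index=list("ABCD"))  # illustrative data

fig, (ax_kind, ax_method) = plt.subplots(1, 2)
s.plot(kind="bar", ax=ax_kind)   # style chosen through the kind keyword
s.plot.bar(ax=ax_method)         # same style through the accessor method

heights_kind = [patch.get_height() for patch in ax_kind.patches]
heights_method = [patch.get_height() for patch in ax_method.patches]
plt.close(fig)
```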

In addition to these kinds, there are the DataFrame.hist() and DataFrame.boxplot() methods, which use a separate interface.

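Finally, there are several plotting functions in pandas.plotting that take a Series or DataFrame as an argument. These include scatter matrix, Andrews curves, parallel coordinates, lag plot, autocorrelation plot, bootstrap plot, and RadViz.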

Plots may also be adorned with errorbars or tables.
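Unlike the plot accessor, the functions in pandas.plotting receive the data as their first argument. A minimal sketch using `pandas.plotting.scatter_matrix` follows; the random data, the headless backend, and the variable names are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(123456)  # illustrative seed
frame = pd.DataFrame(rng.standard_normal((100, 3)), columns=list("abc"))

# The DataFrame is passed in as an argument; the result is a grid of Axes,
# one row and one column per DataFrame column.
axes = scatter_matrix(frame, diagonal="hist")
grid_shape = axes.shape
plt.close("all")
```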

Bar plots

For labeled, non-time series data, you may wish to produce a bar plot:

In [17]: plt.figure();

In [18]: df.iloc[5].plot.bar()
Out[18]: <matplotlib.axes._subplots.AxesSubplot at 0x7f65da446a90>

In [19]: plt.axhline(0, color='k');
../_images/bar_plot_ex.png

Calling a DataFrame’s plot.bar() method produces a multiple bar plot:

In [20]: df2 = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])

In [21]: df2.plot.bar();
../_images/bar_plot_multi_ex.png

To produce a stacked bar plot, pass stacked=True:

In [22]: df2.plot.bar(stacked=True);
../_images/bar_plot_stacked_ex.png

To get horizontal bar plots, use the barh method:

In [23]: df2.plot.barh(stacked=True);
../_images/barh_plot_stacked_ex.png
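The effect of stacked=True can be verified numerically. In the sketch below (headless backend, `rng`, and all variable names are assumptions), the top of the last stacked segment in each row should equal that row's sum, and the same holds for the right edge in the horizontal variant.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(123456)  # illustrative seed
df2 = pd.DataFrame(rng.random((10, 4)), columns=["a", "b", "c", "d"])

fig, (ax_v, ax_h) = plt.subplots(1, 2)
df2.plot.bar(stacked=True, ax=ax_v)    # vertical stacked bars
df2.plot.barh(stacked=True, ax=ax_h)   # horizontal stacked bars

n_bars = len(ax_v.patches)  # 10 rows x 4 columns
# Rectangles are added column by column, so the last 10 belong to column 'd'.
tops = [p.get_y() + p.get_height() for p in ax_v.patches[-10:]]
rights = [p.get_x() + p.get_width() for p in ax_h.patches[-10:]]
row_sums = df2.sum(axis=1).to_numpy()
plt.close(fig)
```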

Histograms

Histograms can be drawn by using the DataFrame.plot.hist() and Series.plot.hist() methods.

In [24]: df4 = pd.DataFrame({'a': np.random.randn(1000) + 1, 'b': np.random.randn(1000),
   ....:                     'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])
   ....: 

In [25]: plt.figure();

In [26]: df4.plot.hist(alpha=0.5)
Out[26]: <matplotlib.axes._subplots.AxesSubplot at 0x7f65da345e48>
../_images/hist_new.png

A histogram can be stacked using stacked=True. The number of bins can be changed using the bins keyword.

In [27]: plt.figure();

In [28]: df4.plot.hist(stacked=True, bins=20)
Out[28]: <matplotlib.axes._subplots.AxesSubplot at 0x7f65da30b9b0>
../_images/hist_new_stacked.png
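A short sketch of what stacked=True and bins=20 produce, assuming a headless backend and illustrative names: each column contributes 20 rectangles, and the tops of the last column's rectangles add up to the total number of observations.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(123456)  # illustrative seed
df4 = pd.DataFrame({"a": rng.standard_normal(1000) + 1,
                    "b": rng.standard_normal(1000),
                    "c": rng.standard_normal(1000) - 1})

fig, ax = plt.subplots()
df4.plot.hist(stacked=True, bins=20, ax=ax)

n_patches = len(ax.patches)  # 3 columns x 20 shared bins
# The last 20 rectangles belong to column 'c' and sit on top of 'a' and 'b'.
stack_total = sum(p.get_y() + p.get_height() for p in ax.patches[-20:])
plt.close(fig)
```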

You can pass other keywords supported by matplotlib hist. For example, horizontal and cumulative histograms can be drawn by passing orientation='horizontal' and cumulative=True.

In [29]: plt.figure();

In [30]: df4['a'].plot.hist(orientation='horizontal', cumulative=True)
Out[30]: <matplotlib.axes._subplots.AxesSubplot at 0x7f65da69fd68>
../_images/hist_new_kwargs.png
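The following sketch shows how these two keywords change the drawn rectangles. The data, the headless backend, and the variable names are assumptions. With orientation='horizontal' the counts become bar widths, and with cumulative=True they never decrease and end at the number of observations.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(123456)  # illustrative seed
s = pd.Series(rng.standard_normal(1000))

fig, ax = plt.subplots()
s.plot.hist(orientation="horizontal", cumulative=True, bins=10, ax=ax)

# Horizontal bars: the (cumulative) count is the width of each rectangle.
widths = [p.get_width() for p in ax.patches]
is_non_decreasing = all(a <= b for a, b in zip(widths, widths[1:]))
plt.close(fig)
```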

See the hist method and the matplotlib hist documentation for more.

The existing DataFrame.hist interface for plotting histograms can still be used.

In [31]: plt.figure();

In [32]: df['A'].diff().hist()
Out[32]: <matplotlib.axes._subplots.AxesSubplot at 0x7f65dac9d240>
../_images/hist_plot_ex.png

DataFrame.hist() plots the histograms of the columns on multiple subplots:

In [33]: plt.figure()
Out[33]: <Figure size 640x480 with 0 Axes>

In [34]: df.diff().hist(color='k', alpha=0.5, bins=50)
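As a hedged sketch of the multi-subplot behaviour (headless backend, `rng`, and variable names are assumptions): DataFrame.hist() returns an array of Axes, one per column, each titled with the column name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(123456)  # illustrative seed
df = pd.DataFrame(rng.standard_normal((1000, 4)),
                  columns=list("ABCD")).cumsum()

# One histogram per column, laid out on a grid of subplots.
axes = df.diff().hist(color="k", alpha=0.5, bins=50)

grid_shape = axes.shape
titles = sorted(ax.get_title() for ax in axes.ravel())
bars_per_axes = [len(ax.patches) for ax in axes.ravel()]
plt.close("all")
```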

© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.25.0/user_guide/visualization.html