pandas.Series.plot.kde

Series.plot.kde(self, bw_method=None, ind=None, **kwargs)

Generate a kernel density estimate plot using Gaussian kernels.

In statistics, kernel density estimation (KDE) is a non-parametric way to estimate the probability density function (PDF) of a random variable. This function uses Gaussian kernels and includes automatic bandwidth determination.
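
To make the definition concrete, the sketch below evaluates a one-dimensional Gaussian KDE in plain Python, without any plotting. The helper names `bandwidth_factor` and `gaussian_kde_1d` are hypothetical, and the 'scott' and 'silverman' factors follow the one-dimensional rules described for `scipy.stats.gaussian_kde`. Callable bandwidth methods are left out. Treat it as an illustration of the idea rather than the code pandas actually runs.

```python
import math
import statistics


def bandwidth_factor(n, bw_method=None):
    """Return the factor that scales the sample standard deviation (1-D case)."""
    if bw_method is None or bw_method == "scott":
        return n ** (-1.0 / 5.0)
    if bw_method == "silverman":
        return (n * 3.0 / 4.0) ** (-1.0 / 5.0)
    return float(bw_method)  # assumption: a scalar is used as the factor directly


def gaussian_kde_1d(samples, bw_method=None):
    """Build a function that returns the estimated density at a point x."""
    n = len(samples)
    # Bandwidth = factor * sample standard deviation (n - 1 in the denominator).
    h = bandwidth_factor(n, bw_method) * statistics.stdev(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each observation.
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)

    return density


samples = [1, 2, 2.5, 3, 3.5, 4, 5]
pdf = gaussian_kde_1d(samples)
print(pdf(3.0) > pdf(0.0))  # compares density near the data with density far from it
```

The returned `pdf` is an ordinary function, so it can be evaluated at any set of points, which is what the plot does along its x-axis.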

Parameters:

bw_method : str, scalar or callable, optional: The method used to calculate the estimator bandwidth. This can be ‘scott’, ‘silverman’, a scalar constant or a callable. If None (default), ‘scott’ is used. See scipy.stats.gaussian_kde for more information.
ind : NumPy array or integer, optional: Evaluation points for the estimated PDF. If None (default), 1000 equally spaced points are used. If ind is a NumPy array, the KDE is evaluated at the points passed. If ind is an integer, ind number of equally spaced points are used.
**kwargs : optional: Additional keyword arguments are documented in pandas.Series.plot().

Returns:

matplotlib.axes.Axes or numpy.ndarray of them

Examples

Given a Series of points randomly sampled from an unknown distribution, estimate its PDF using KDE with automatic bandwidth determination and plot the result, evaluated at 1000 equally spaced points (the default):

>>> import pandas as pd
>>> s = pd.Series([1, 2, 2.5, 3, 3.5, 4, 5])
>>> ax = s.plot.kde()

A scalar bandwidth can be specified. Using a small bandwidth value can lead to over-fitting, while using a large bandwidth value may result in under-fitting:

>>> ax = s.plot.kde(bw_method=0.3)

>>> ax = s.plot.kde(bw_method=3)
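
The effect of the scalar can also be checked numerically, without a plot. The sketch below is a plain-Python illustration, not the pandas code path. It assumes the scalar multiplies the sample standard deviation, as described for `scipy.stats.gaussian_kde`, and counts the local maxima of each estimate on a grid. `kde_on_grid` and `count_local_maxima` are hypothetical helpers.

```python
import math
import statistics


def kde_on_grid(samples, factor, grid):
    """Evaluate a 1-D Gaussian KDE with bandwidth factor * sample std on grid."""
    n = len(samples)
    h = factor * statistics.stdev(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)
        for x in grid
    ]


def count_local_maxima(values):
    """Count grid points that are strictly higher than both neighbours."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


samples = [1, 2, 2.5, 3, 3.5, 4, 5]
grid = [i / 20.0 - 7.0 for i in range(401)]  # -7.0 to 13.0 in steps of 0.05

narrow = kde_on_grid(samples, 0.3, grid)  # small bandwidth factor
wide = kde_on_grid(samples, 3, grid)      # large bandwidth factor

print("modes with bw_method=0.3:", count_local_maxima(narrow))
print("modes with bw_method=3:  ", count_local_maxima(wide))
print("peak heights:", max(narrow), max(wide))
```

More modes and a taller peak for the small factor correspond to the wiggly, over-fitted curve. The large factor smooths the same sample into a single broad bump.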

Finally, the ind parameter determines the evaluation points for the plot of the estimated PDF:

>>> ax = s.plot.kde(ind=[1, 2, 3, 4, 5])
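
How the three accepted forms of ind could translate into evaluation points is sketched below. `resolve_ind` is a hypothetical helper, and padding the grid by half the sample range on each side is an assumption made for illustration; this page does not state which interval the equally spaced points cover.

```python
def resolve_ind(samples, ind=None, pad=0.5):
    """Turn ind (None, an integer, or a sequence) into a list of x values."""
    if ind is None:
        ind = 1000  # documented default number of points
    if isinstance(ind, int):
        lo, hi = min(samples), max(samples)
        span = hi - lo
        # Assumed interval: the sample range widened by pad * span on each side.
        start, stop = lo - pad * span, hi + pad * span
        if ind == 1:
            return [start]
        step = (stop - start) / (ind - 1)
        return [start + i * step for i in range(ind)]
    # Anything else is treated as explicit evaluation points.
    return list(ind)


samples = [1, 2, 2.5, 3, 3.5, 4, 5]
default_points = resolve_ind(samples)            # ind=None
fifty_points = resolve_ind(samples, 50)          # ind as an integer
explicit_points = resolve_ind(samples, [1, 2, 3, 4, 5])  # ind as given points
print(len(default_points), len(fifty_points), explicit_points)
```

Passing only a handful of explicit points, as in the example above, yields a coarse polyline because the density is evaluated at those positions only.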

For DataFrame, it works in the same way:

>>> df = pd.DataFrame({
...     'x': [1, 2, 2.5, 3, 3.5, 4, 5],
...     'y': [4, 4, 4.5, 5, 5.5, 6, 6],
... })
>>> ax = df.plot.kde()

A scalar bandwidth can be specified. Using a small bandwidth value can lead to over-fitting, while using a large bandwidth value may result in under-fitting:

>>> ax = df.plot.kde(bw_method=0.3)

>>> ax = df.plot.kde(bw_method=3)

Finally, the ind parameter determines the evaluation points for the plot of the estimated PDF:

>>> ax = df.plot.kde(ind=[1, 2, 3, 4, 5, 6])

© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.Series.plot.kde.html