W3cubDocs

/pandas 0.25

pandas.Series.rank

Series.rank(self, axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False) [source]

Compute numerical data ranks (1 through n) along axis.

By default, equal values are assigned a rank that is the average of the ranks of those values.

Parameters:
axis : {0 or ‘index’, 1 or ‘columns’}, default 0

Index to direct ranking.

method : {‘average’, ‘min’, ‘max’, ‘first’, ‘dense’}, default ‘average’

How to rank the group of records that have the same value (i.e. ties):

  • average: average rank of the group
  • min: lowest rank in the group
  • max: highest rank in the group
  • first: ranks assigned in order they appear in the array
  • dense: like ‘min’, but rank always increases by 1 between groups
numeric_only : bool, optional

For DataFrame objects, rank only numeric columns if set to True.

na_option : {‘keep’, ‘top’, ‘bottom’}, default ‘keep’

How to rank NaN values:

  • keep: assign NaN rank to NaN values
  • top: assign smallest rank to NaN values if ascending
  • bottom: assign highest rank to NaN values if ascending
ascending : bool, default True

Whether or not the elements should be ranked in ascending order.

pct : bool, default False

Whether or not to display the returned rankings in percentile form.

Returns:
same type as caller

Return a Series or DataFrame with data ranks as values.

See also

core.groupby.GroupBy.rank
Rank of values within each group.

Examples

>>> df = pd.DataFrame(data={'Animal': ['cat', 'penguin', 'dog',
...                                    'spider', 'snake'],
...                         'Number_legs': [4, 2, 4, 8, np.nan]})
>>> df
    Animal  Number_legs
0      cat          4.0
1  penguin          2.0
2      dog          4.0
3   spider          8.0
4    snake          NaN

The following example shows how the method behaves with the above parameters:

  • default_rank: this is the default behaviour obtained without using any parameter.
  • max_rank: setting method = 'max' the records that have the same values are ranked using the highest rank (e.g.: since ‘cat’ and ‘dog’ are both in the 2nd and 3rd position, rank 3 is assigned.)
  • NA_bottom: choosing na_option = 'bottom', if there are records with NaN values they are placed at the bottom of the ranking.
  • pct_rank: when setting pct = True, the ranking is expressed as percentile rank.
>>> df['default_rank'] = df['Number_legs'].rank()
>>> df['max_rank'] = df['Number_legs'].rank(method='max')
>>> df['NA_bottom'] = df['Number_legs'].rank(na_option='bottom')
>>> df['pct_rank'] = df['Number_legs'].rank(pct=True)
>>> df
    Animal  Number_legs  default_rank  max_rank  NA_bottom  pct_rank
0      cat          4.0           2.5       3.0        2.5     0.625
1  penguin          2.0           1.0       1.0        1.0     0.250
2      dog          4.0           2.5       3.0        2.5     0.625
3   spider          8.0           4.0       4.0        4.0     1.000
4    snake          NaN           NaN       NaN        5.0       NaN

© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.Series.rank.html