Series.replace(self, to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')
[source]
Replace values given in to_replace
with value
.
Values of the Series are replaced with other values dynamically. This differs from updating with .loc
or .iloc
, which require you to specify a location to update with some value.
Parameters: |
|
---|---|
Returns: |
|
Raises: |
|
See also
Series.fillna
Series.where
Series.str.replace
re.sub
. The rules for substitution for re.sub
are the same.to_replace
value, it is like key(s) in the dict are the to_replace part and value(s) in the dict are the value parameter.Scalar `to_replace` and `value`
>>> s = pd.Series([0, 1, 2, 3, 4]) >>> s.replace(0, 5) 0 5 1 1 2 2 3 3 4 4 dtype: int64
>>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4], ... 'B': [5, 6, 7, 8, 9], ... 'C': ['a', 'b', 'c', 'd', 'e']}) >>> df.replace(0, 5) A B C 0 5 5 a 1 1 6 b 2 2 7 c 3 3 8 d 4 4 9 e
List-like `to_replace`
>>> df.replace([0, 1, 2, 3], 4) A B C 0 4 5 a 1 4 6 b 2 4 7 c 3 4 8 d 4 4 9 e
>>> df.replace([0, 1, 2, 3], [4, 3, 2, 1]) A B C 0 4 5 a 1 3 6 b 2 2 7 c 3 1 8 d 4 4 9 e
>>> s.replace([1, 2], method='bfill') 0 0 1 3 2 3 3 3 4 4 dtype: int64
dict-like `to_replace`
>>> df.replace({0: 10, 1: 100}) A B C 0 10 5 a 1 100 6 b 2 2 7 c 3 3 8 d 4 4 9 e
>>> df.replace({'A': 0, 'B': 5}, 100) A B C 0 100 100 a 1 1 6 b 2 2 7 c 3 3 8 d 4 4 9 e
>>> df.replace({'A': {0: 100, 4: 400}}) A B C 0 100 5 a 1 1 6 b 2 2 7 c 3 3 8 d 4 400 9 e
Regular expression `to_replace`
>>> df = pd.DataFrame({'A': ['bat', 'foo', 'bait'], ... 'B': ['abc', 'bar', 'xyz']}) >>> df.replace(to_replace=r'^ba.$', value='new', regex=True) A B 0 new abc 1 foo new 2 bait xyz
>>> df.replace({'A': r'^ba.$'}, {'A': 'new'}, regex=True) A B 0 new abc 1 foo bar 2 bait xyz
>>> df.replace(regex=r'^ba.$', value='new') A B 0 new abc 1 foo new 2 bait xyz
>>> df.replace(regex={r'^ba.$': 'new', 'foo': 'xyz'}) A B 0 new abc 1 xyz new 2 bait xyz
>>> df.replace(regex=[r'^ba.$', 'foo'], value='new') A B 0 new abc 1 new new 2 bait xyz
Note that when replacing multiple bool
or datetime64
objects, the data types in the to_replace
parameter must match the data type of the value being replaced:
>>> df = pd.DataFrame({'A': [True, False, True], ... 'B': [False, True, False]}) >>> df.replace({'a string': 'new value', True: False}) # raises Traceback (most recent call last): ... TypeError: Cannot compare types 'ndarray(dtype=bool)' and 'str'
This raises a TypeError
because one of the dict
keys is not of the correct type for replacement.
Compare the behavior of s.replace({'a': None})
and s.replace('a', None)
to understand the peculiarities of the to_replace
parameter:
>>> s = pd.Series([10, 'a', 'a', 'b', 'a'])
When one uses a dict as the to_replace
value, it is like the value(s) in the dict are equal to the value
parameter. s.replace({'a': None})
is equivalent to s.replace(to_replace={'a': None}, value=None, method=None)
:
>>> s.replace({'a': None}) 0 10 1 None 2 None 3 b 4 None dtype: object
When value=None
and to_replace
is a scalar, list or tuple, replace
uses the method parameter (default ‘pad’) to do the replacement. So this is why the ‘a’ values are being replaced by 10 in rows 1 and 2 and ‘b’ in row 4 in this case. The command s.replace('a', None)
is actually equivalent to s.replace(to_replace='a', value=None, method='pad')
:
>>> s.replace('a', None) 0 10 1 10 2 10 3 b 4 b dtype: object
© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.Series.replace.html