class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
[source]
Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
Parameters: |
|
---|
See also
DataFrame.from_records
DataFrame.from_dict
DataFrame.from_items
Constructing DataFrame from a dictionary.
>>> d = {'col1': [1, 2], 'col2': [3, 4]} >>> df = pd.DataFrame(data=d) >>> df col1 col2 0 1 3 1 2 4
Notice that the inferred dtype is int64.
>>> df.dtypes col1 int64 col2 int64 dtype: object
To enforce a single dtype:
>>> df = pd.DataFrame(data=d, dtype=np.int8) >>> df.dtypes col1 int8 col2 int8 dtype: object
Constructing DataFrame from numpy ndarray:
>>> df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), ... columns=['a', 'b', 'c']) >>> df2 a b c 0 1 2 3 1 4 5 6 2 7 8 9
T | Transpose index and columns. |
at | Access a single value for a row/column label pair. |
axes | Return a list representing the axes of the DataFrame. |
blocks | (DEPRECATED) Internal property, property synonym for as_blocks(). |
columns | The column labels of the DataFrame. |
dtypes | Return the dtypes in the DataFrame. |
empty | Indicator whether DataFrame is empty. |
ftypes | (DEPRECATED) Return the ftypes (indication of sparse/dense and dtype) in DataFrame. |
iat | Access a single value for a row/column pair by integer position. |
iloc | Purely integer-location based indexing for selection by position. |
index | The index (row labels) of the DataFrame. |
is_copy | Return the copy. |
ix | A primarily label-location based indexer, with integer position fallback. |
loc | Access a group of rows and columns by label(s) or a boolean array. |
ndim | Return an int representing the number of axes / array dimensions. |
shape | Return a tuple representing the dimensionality of the DataFrame. |
size | Return an int representing the number of elements in this object. |
style | Property returning a Styler object containing methods for building a styled HTML representation fo the DataFrame. |
values | Return a Numpy representation of the DataFrame. |
abs (self) | Return a Series/DataFrame with absolute numeric value of each element. |
add (self, other[, axis, level, fill_value]) | Get Addition of dataframe and other, element-wise (binary operator add ). |
add_prefix (self, prefix) | Prefix labels with string prefix . |
add_suffix (self, suffix) | Suffix labels with string suffix . |
agg (self, func[, axis]) | Aggregate using one or more operations over the specified axis. |
aggregate (self, func[, axis]) | Aggregate using one or more operations over the specified axis. |
align (self, other[, join, axis, level, …]) | Align two objects on their axes with the specified join method for each axis Index. |
all (self[, axis, bool_only, skipna, level]) | Return whether all elements are True, potentially over an axis. |
any (self[, axis, bool_only, skipna, level]) | Return whether any element is True, potentially over an axis. |
append (self, other[, ignore_index, …]) | Append rows of other to the end of caller, returning a new object. |
apply (self, func[, axis, broadcast, raw, …]) | Apply a function along an axis of the DataFrame. |
applymap (self, func) | Apply a function to a Dataframe elementwise. |
as_blocks (self[, copy]) | (DEPRECATED) Convert the frame to a dict of dtype -> Constructor Types that each has a homogeneous dtype. |
as_matrix (self[, columns]) | (DEPRECATED) Convert the frame to its Numpy-array representation. |
asfreq (self, freq[, method, how, normalize, …]) | Convert TimeSeries to specified frequency. |
asof (self, where[, subset]) | Return the last row(s) without any NaNs before where . |
assign (self, \*\*kwargs) | Assign new columns to a DataFrame. |
astype (self, dtype[, copy, errors]) | Cast a pandas object to a specified dtype dtype . |
at_time (self, time[, asof, axis]) | Select values at particular time of day (e.g. |
between_time (self, start_time, end_time[, …]) | Select values between particular times of the day (e.g., 9:00-9:30 AM). |
bfill (self[, axis, inplace, limit, downcast]) | Synonym for DataFrame.fillna() with method='bfill' . |
bool (self) | Return the bool of a single element PandasObject. |
boxplot (self[, column, by, ax, fontsize, …]) | Make a box plot from DataFrame columns. |
clip (self[, lower, upper, axis, inplace]) | Trim values at input threshold(s). |
clip_lower (self, threshold[, axis, inplace]) | (DEPRECATED) Trim values below a given threshold. |
clip_upper (self, threshold[, axis, inplace]) | (DEPRECATED) Trim values above a given threshold. |
combine (self, other, func[, fill_value, …]) | Perform column-wise combine with another DataFrame. |
combine_first (self, other) | Update null elements with value in the same location in other . |
compound (self[, axis, skipna, level]) | (DEPRECATED) Return the compound percentage of the values for the requested axis. |
copy (self[, deep]) | Make a copy of this object’s indices and data. |
corr (self[, method, min_periods]) | Compute pairwise correlation of columns, excluding NA/null values. |
corrwith (self, other[, axis, drop, method]) | Compute pairwise correlation between rows or columns of DataFrame with rows or columns of Series or DataFrame. |
count (self[, axis, level, numeric_only]) | Count non-NA cells for each column or row. |
cov (self[, min_periods]) | Compute pairwise covariance of columns, excluding NA/null values. |
cummax (self[, axis, skipna]) | Return cumulative maximum over a DataFrame or Series axis. |
cummin (self[, axis, skipna]) | Return cumulative minimum over a DataFrame or Series axis. |
cumprod (self[, axis, skipna]) | Return cumulative product over a DataFrame or Series axis. |
cumsum (self[, axis, skipna]) | Return cumulative sum over a DataFrame or Series axis. |
describe (self[, percentiles, include, exclude]) | Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values. |
diff (self[, periods, axis]) | First discrete difference of element. |
div (self, other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv ). |
divide (self, other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv ). |
dot (self, other) | Compute the matrix multiplication between the DataFrame and other. |
drop (self[, labels, axis, index, columns, …]) | Drop specified labels from rows or columns. |
drop_duplicates (self[, subset, keep, inplace]) | Return DataFrame with duplicate rows removed, optionally only considering certain columns. |
droplevel (self, level[, axis]) | Return DataFrame with requested index / column level(s) removed. |
dropna (self[, axis, how, thresh, subset, …]) | Remove missing values. |
duplicated (self[, subset, keep]) | Return boolean Series denoting duplicate rows, optionally only considering certain columns. |
eq (self, other[, axis, level]) | Get Equal to of dataframe and other, element-wise (binary operator eq ). |
equals (self, other) | Test whether two objects contain the same elements. |
eval (self, expr[, inplace]) | Evaluate a string describing operations on DataFrame columns. |
ewm (self[, com, span, halflife, alpha, …]) | Provide exponential weighted functions. |
expanding (self[, min_periods, center, axis]) | Provide expanding transformations. |
explode (self, column, Tuple]) | Transform each element of a list-like to a row, replicating the index values. |
ffill (self[, axis, inplace, limit, downcast]) | Synonym for DataFrame.fillna() with method='ffill' . |
fillna (self[, value, method, axis, inplace, …]) | Fill NA/NaN values using the specified method. |
filter (self[, items, like, regex, axis]) | Subset rows or columns of dataframe according to labels in the specified index. |
first (self, offset) | Convenience method for subsetting initial periods of time series data based on a date offset. |
first_valid_index (self) | Return index for first non-NA/null value. |
floordiv (self, other[, axis, level, fill_value]) | Get Integer division of dataframe and other, element-wise (binary operator floordiv ). |
from_dict (data[, orient, dtype, columns]) | Construct DataFrame from dict of array-like or dicts. |
from_items (items[, columns, orient]) | (DEPRECATED) Construct a DataFrame from a list of tuples. |
from_records (data[, index, exclude, …]) | Convert structured or record ndarray to DataFrame. |
ge (self, other[, axis, level]) | Get Greater than or equal to of dataframe and other, element-wise (binary operator ge ). |
get (self, key[, default]) | Get item from object for given key (ex: DataFrame column). |
get_dtype_counts (self) | (DEPRECATED) Return counts of unique dtypes in this object. |
get_ftype_counts (self) | (DEPRECATED) Return counts of unique ftypes in this object. |
get_value (self, index, col[, takeable]) | (DEPRECATED) Quickly retrieve single value at passed column and index. |
get_values (self) | (DEPRECATED) Return an ndarray after converting sparse values to dense. |
groupby (self[, by, axis, level, as_index, …]) | Group DataFrame or Series using a mapper or by a Series of columns. |
gt (self, other[, axis, level]) | Get Greater than of dataframe and other, element-wise (binary operator gt ). |
head (self[, n]) | Return the first n rows. |
hist (data[, column, by, grid, xlabelsize, …]) | Make a histogram of the DataFrame’s. |
idxmax (self[, axis, skipna]) | Return index of first occurrence of maximum over requested axis. |
idxmin (self[, axis, skipna]) | Return index of first occurrence of minimum over requested axis. |
infer_objects (self) | Attempt to infer better dtypes for object columns. |
info (self[, verbose, buf, max_cols, …]) | Print a concise summary of a DataFrame. |
insert (self, loc, column, value[, …]) | Insert column into DataFrame at specified location. |
interpolate (self[, method, axis, limit, …]) | Interpolate values according to different methods. |
isin (self, values) | Whether each element in the DataFrame is contained in values. |
isna (self) | Detect missing values. |
isnull (self) | Detect missing values. |
items (self) | Iterator over (column name, Series) pairs. |
iteritems (self) | Iterator over (column name, Series) pairs. |
iterrows (self) | Iterate over DataFrame rows as (index, Series) pairs. |
itertuples (self[, index, name]) | Iterate over DataFrame rows as namedtuples. |
join (self, other[, on, how, lsuffix, …]) | Join columns of another DataFrame. |
keys (self) | Get the ‘info axis’ (see Indexing for more) |
kurt (self[, axis, skipna, level, numeric_only]) | Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). |
kurtosis (self[, axis, skipna, level, …]) | Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). |
last (self, offset) | Convenience method for subsetting final periods of time series data based on a date offset. |
last_valid_index (self) | Return index for last non-NA/null value. |
le (self, other[, axis, level]) | Get Less than or equal to of dataframe and other, element-wise (binary operator le ). |
lookup (self, row_labels, col_labels) | Label-based “fancy indexing” function for DataFrame. |
lt (self, other[, axis, level]) | Get Less than of dataframe and other, element-wise (binary operator lt ). |
mad (self[, axis, skipna, level]) | Return the mean absolute deviation of the values for the requested axis. |
mask (self, cond[, other, inplace, axis, …]) | Replace values where the condition is True. |
max (self[, axis, skipna, level, numeric_only]) | Return the maximum of the values for the requested axis. |
mean (self[, axis, skipna, level, numeric_only]) | Return the mean of the values for the requested axis. |
median (self[, axis, skipna, level, numeric_only]) | Return the median of the values for the requested axis. |
melt (self[, id_vars, value_vars, var_name, …]) | Unpivot a DataFrame from wide format to long format, optionally leaving identifier variables set. |
memory_usage (self[, index, deep]) | Return the memory usage of each column in bytes. |
merge (self, right[, how, on, left_on, …]) | Merge DataFrame or named Series objects with a database-style join. |
min (self[, axis, skipna, level, numeric_only]) | Return the minimum of the values for the requested axis. |
mod (self, other[, axis, level, fill_value]) | Get Modulo of dataframe and other, element-wise (binary operator mod ). |
mode (self[, axis, numeric_only, dropna]) | Get the mode(s) of each element along the selected axis. |
mul (self, other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator mul ). |
multiply (self, other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator mul ). |
ne (self, other[, axis, level]) | Get Not equal to of dataframe and other, element-wise (binary operator ne ). |
nlargest (self, n, columns[, keep]) | Return the first n rows ordered by columns in descending order. |
notna (self) | Detect existing (non-missing) values. |
notnull (self) | Detect existing (non-missing) values. |
nsmallest (self, n, columns[, keep]) | Return the first n rows ordered by columns in ascending order. |
nunique (self[, axis, dropna]) | Count distinct observations over requested axis. |
pct_change (self[, periods, fill_method, …]) | Percentage change between the current and a prior element. |
pipe (self, func, \*args, \*\*kwargs) | Apply func(self, *args, **kwargs). |
pivot (self[, index, columns, values]) | Return reshaped DataFrame organized by given index / column values. |
pivot_table (self[, values, index, columns, …]) | Create a spreadsheet-style pivot table as a DataFrame. |
plot | alias of pandas.plotting._core.PlotAccessor
|
pop (self, item) | Return item and drop from frame. |
pow (self, other[, axis, level, fill_value]) | Get Exponential power of dataframe and other, element-wise (binary operator pow ). |
prod (self[, axis, skipna, level, …]) | Return the product of the values for the requested axis. |
product (self[, axis, skipna, level, …]) | Return the product of the values for the requested axis. |
quantile (self[, q, axis, numeric_only, …]) | Return values at the given quantile over requested axis. |
query (self, expr[, inplace]) | Query the columns of a DataFrame with a boolean expression. |
radd (self, other[, axis, level, fill_value]) | Get Addition of dataframe and other, element-wise (binary operator radd ). |
rank (self[, axis, method, numeric_only, …]) | Compute numerical data ranks (1 through n) along axis. |
rdiv (self, other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator rtruediv ). |
reindex (self[, labels, index, columns, …]) | Conform DataFrame to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index. |
reindex_like (self, other[, method, copy, …]) | Return an object with matching indices as other object. |
rename (self[, mapper, index, columns, axis, …]) | Alter axes labels. |
rename_axis (self[, mapper, index, columns, …]) | Set the name of the axis for the index or columns. |
reorder_levels (self, order[, axis]) | Rearrange index levels using input order. |
replace (self[, to_replace, value, inplace, …]) | Replace values given in to_replace with value . |
resample (self, rule[, how, axis, …]) | Resample time-series data. |
reset_index (self[, level, drop, inplace, …]) | Reset the index, or a level of it. |
rfloordiv (self, other[, axis, level, fill_value]) | Get Integer division of dataframe and other, element-wise (binary operator rfloordiv ). |
rmod (self, other[, axis, level, fill_value]) | Get Modulo of dataframe and other, element-wise (binary operator rmod ). |
rmul (self, other[, axis, level, fill_value]) | Get Multiplication of dataframe and other, element-wise (binary operator rmul ). |
rolling (self, window[, min_periods, center, …]) | Provide rolling window calculations. |
round (self[, decimals]) | Round a DataFrame to a variable number of decimal places. |
rpow (self, other[, axis, level, fill_value]) | Get Exponential power of dataframe and other, element-wise (binary operator rpow ). |
rsub (self, other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator rsub ). |
rtruediv (self, other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator rtruediv ). |
sample (self[, n, frac, replace, weights, …]) | Return a random sample of items from an axis of object. |
select_dtypes (self[, include, exclude]) | Return a subset of the DataFrame’s columns based on the column dtypes. |
sem (self[, axis, skipna, level, ddof, …]) | Return unbiased standard error of the mean over requested axis. |
set_axis (self, labels[, axis, inplace]) | Assign desired index to given axis. |
set_index (self, keys[, drop, append, …]) | Set the DataFrame index using existing columns. |
set_value (self, index, col, value[, takeable]) | (DEPRECATED) Put single value at passed column and index. |
shift (self[, periods, freq, axis, fill_value]) | Shift index by desired number of periods with an optional time freq . |
skew (self[, axis, skipna, level, numeric_only]) | Return unbiased skew over requested axis Normalized by N-1. |
slice_shift (self[, periods, axis]) | Equivalent to shift without copying data. |
sort_index (self[, axis, level, ascending, …]) | Sort object by labels (along an axis). |
sort_values (self, by[, axis, ascending, …]) | Sort by the values along either axis. |
sparse | alias of pandas.core.arrays.sparse.SparseFrameAccessor
|
squeeze (self[, axis]) | Squeeze 1 dimensional axis objects into scalars. |
stack (self[, level, dropna]) | Stack the prescribed level(s) from columns to index. |
std (self[, axis, skipna, level, ddof, …]) | Return sample standard deviation over requested axis. |
sub (self, other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator sub ). |
subtract (self, other[, axis, level, fill_value]) | Get Subtraction of dataframe and other, element-wise (binary operator sub ). |
sum (self[, axis, skipna, level, …]) | Return the sum of the values for the requested axis. |
swapaxes (self, axis1, axis2[, copy]) | Interchange axes and swap values axes appropriately. |
swaplevel (self[, i, j, axis]) | Swap levels i and j in a MultiIndex on a particular axis. |
tail (self[, n]) | Return the last n rows. |
take (self, indices[, axis, is_copy]) | Return the elements in the given positional indices along an axis. |
to_clipboard (self[, excel, sep]) | Copy object to the system clipboard. |
to_csv (self[, path_or_buf, sep, na_rep, …]) | Write object to a comma-separated values (csv) file. |
to_dense (self) | (DEPRECATED) Return dense representation of Series/DataFrame (as opposed to sparse). |
to_dict (self[, orient, into]) | Convert the DataFrame to a dictionary. |
to_excel (self, excel_writer[, sheet_name, …]) | Write object to an Excel sheet. |
to_feather (self, fname) | Write out the binary feather-format for DataFrames. |
to_gbq (self, destination_table[, …]) | Write a DataFrame to a Google BigQuery table. |
to_hdf (self, path_or_buf, key, \*\*kwargs) | Write the contained data to an HDF5 file using HDFStore. |
to_html (self[, buf, columns, col_space, …]) | Render a DataFrame as an HTML table. |
to_json (self[, path_or_buf, orient, …]) | Convert the object to a JSON string. |
to_latex (self[, buf, columns, col_space, …]) | Render an object to a LaTeX tabular environment table. |
to_msgpack (self[, path_or_buf, encoding]) | (DEPRECATED) Serialize object to input file path using msgpack format. |
to_numpy (self[, dtype, copy]) | Convert the DataFrame to a NumPy array. |
to_parquet (self, fname[, engine, …]) | Write a DataFrame to the binary parquet format. |
to_period (self[, freq, axis, copy]) | Convert DataFrame from DatetimeIndex to PeriodIndex with desired frequency (inferred from index if not passed). |
to_pickle (self, path[, compression, protocol]) | Pickle (serialize) object to file. |
to_records (self[, index, …]) | Convert DataFrame to a NumPy record array. |
to_sparse (self[, fill_value, kind]) | (DEPRECATED) Convert to SparseDataFrame. |
to_sql (self, name, con[, schema, if_exists, …]) | Write records stored in a DataFrame to a SQL database. |
to_stata (self, fname[, convert_dates, …]) | Export DataFrame object to Stata dta format. |
to_string (self[, buf, columns, col_space, …]) | Render a DataFrame to a console-friendly tabular output. |
to_timestamp (self[, freq, how, axis, copy]) | Cast to DatetimeIndex of timestamps, at beginning of period. |
to_xarray (self) | Return an xarray object from the pandas object. |
transform (self, func[, axis]) | Call func on self producing a DataFrame with transformed values and that has the same axis length as self. |
transpose (self, \*args, \*\*kwargs) | Transpose index and columns. |
truediv (self, other[, axis, level, fill_value]) | Get Floating division of dataframe and other, element-wise (binary operator truediv ). |
truncate (self[, before, after, axis, copy]) | Truncate a Series or DataFrame before and after some index value. |
tshift (self[, periods, freq, axis]) | Shift the time index, using the index’s frequency if available. |
tz_convert (self, tz[, axis, level, copy]) | Convert tz-aware axis to target time zone. |
tz_localize (self, tz[, axis, level, copy, …]) | Localize tz-naive index of a Series or DataFrame to target time zone. |
unstack (self[, level, fill_value]) | Pivot a level of the (necessarily hierarchical) index labels, returning a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels. |
update (self, other[, join, overwrite, …]) | Modify in place using non-NA values from another DataFrame. |
var (self[, axis, skipna, level, ddof, …]) | Return unbiased variance over requested axis. |
where (self, cond[, other, inplace, axis, …]) | Replace values where the condition is False. |
xs (self, key[, axis, level, drop_level]) | Return cross-section from the Series/DataFrame. |
© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.DataFrame.html