API
DataFrame
|
Parallel Pandas DataFrame |
Return a Series/DataFrame with absolute numeric value of each element. |
|
|
Get Addition of dataframe and other, element-wise (binary operator add). |
|
Align two objects on their axes with the specified join method. |
|
Return whether all elements are True, potentially over an axis. |
|
Return whether any element is True, potentially over an axis. |
|
Append rows of other to the end of caller, returning a new object. |
|
Parallel version of pandas.DataFrame.apply |
|
Apply a function to a DataFrame element-wise. |
|
Assign new columns to a DataFrame. |
|
Cast a pandas object to a specified dtype |
|
|
|
Convert columns of the DataFrame to category dtype. |
|
Compute this dask collection |
|
Make a copy of the dataframe |
|
Compute pairwise correlation of columns, excluding NA/null values. |
|
Count non-NA cells for each column or row. |
|
Compute pairwise covariance of columns, excluding NA/null values. |
|
Return cumulative maximum over a DataFrame or Series axis. |
|
Return cumulative minimum over a DataFrame or Series axis. |
|
Return cumulative product over a DataFrame or Series axis. |
|
Return cumulative sum over a DataFrame or Series axis. |
|
Generate descriptive statistics. |
|
First discrete difference of element. |
|
Get Floating division of dataframe and other, element-wise (binary operator truediv). |
|
Get Floating division of dataframe and other, element-wise (binary operator truediv). |
|
Drop specified labels from rows or columns. |
|
Return DataFrame with duplicate rows removed. |
|
Remove missing values. |
Return data types |
|
|
Get Equal to of dataframe and other, element-wise (binary operator eq). |
|
Evaluate a string describing operations on DataFrame columns. |
|
Transform each element of a list-like to a row, replicating index values. |
|
|
|
Fill NA/NaN values using the specified method. |
|
Select initial periods of time series data based on a date offset. |
|
Get Integer division of dataframe and other, element-wise (binary operator floordiv). |
|
Get Greater than or equal to of dataframe and other, element-wise (binary operator ge). |
Get a dask DataFrame/Series representing the nth partition. |
|
|
Group DataFrame using a mapper or by a Series of columns. |
|
Get Greater than of dataframe and other, element-wise (binary operator gt). |
|
First n rows of the dataset |
|
Return index of first occurrence of maximum over requested axis. |
|
Return index of first occurrence of minimum over requested axis. |
Purely integer-location based indexing for selection by position. |
|
Return dask Index instance |
|
|
Concise summary of a Dask DataFrame. |
|
Whether each element in the DataFrame is contained in values. |
Detect missing values. |
|
Detect missing values. |
|
Iterate over (column name, Series) pairs. |
|
Iterate over DataFrame rows as (index, Series) pairs. |
|
|
Iterate over DataFrame rows as namedtuples. |
|
Join columns of another DataFrame. |
Whether divisions are already known |
|
|
Select final periods of time series data based on a date offset. |
|
Get Less than or equal to of dataframe and other, element-wise (binary operator le). |
Purely label-location based indexer for selection by label. |
|
|
Get Less than of dataframe and other, element-wise (binary operator lt). |
|
Apply Python function on each DataFrame partition. |
|
|
|
Return the maximum of the values over the requested axis. |
|
Return the mean of the values over the requested axis. |
|
Unpivots a DataFrame from wide format to long format, optionally leaving identifier variables set. |
|
Return the memory usage of each column in bytes. |
Return the memory usage of each partition |
|
|
Merge the DataFrame with another DataFrame |
|
Return the minimum of the values over the requested axis. |
|
Get Modulo of dataframe and other, element-wise (binary operator mod). |
|
Get the mode(s) of each element along the selected axis. |
|
Get Multiplication of dataframe and other, element-wise (binary operator mul). |
Return dimensionality |
|
|
Get Not equal to of dataframe and other, element-wise (binary operator ne). |
|
Return the first n rows ordered by columns in descending order. |
Return number of partitions |
|
|
Return the first n rows ordered by columns in ascending order. |
Slice dataframe by partitions |
|
|
Create a spreadsheet-style pivot table as a DataFrame. |
|
Return item and drop from frame. |
|
Get Exponential power of dataframe and other, element-wise (binary operator pow). |
|
Return the product of the values over the requested axis. |
|
Approximate row-wise and precise column-wise quantiles of DataFrame |
|
Filter dataframe with complex expression |
|
Get Addition of dataframe and other, element-wise (binary operator radd). |
|
Pseudorandomly split dataframe into different pieces row-wise |
|
Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
|
Alter axes labels. |
|
Repartition dataframe along new divisions |
|
Replace values given in to_replace with value. |
|
Resample time-series data. |
|
Reset the index to the default index. |
|
Get Integer division of dataframe and other, element-wise (binary operator rfloordiv). |
|
Get Modulo of dataframe and other, element-wise (binary operator rmod). |
|
Get Multiplication of dataframe and other, element-wise (binary operator rmul). |
|
Round a DataFrame to a variable number of decimal places. |
|
Get Exponential power of dataframe and other, element-wise (binary operator rpow). |
|
Get Subtraction of dataframe and other, element-wise (binary operator rsub). |
|
Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
|
Random sample of items |
|
Return a subset of the DataFrame's columns based on the column dtypes. |
|
Return unbiased standard error of the mean over requested axis. |
|
Set the DataFrame index (row labels) using an existing column. |
Return a tuple representing the dimensionality of the DataFrame. |
|
|
Rearrange DataFrame into new partitions |
Size of the Series or DataFrame as a Delayed object. |
|
|
Sort the dataset by a single column. |
|
Squeeze 1 dimensional axis objects into scalars. |
|
Return sample standard deviation over requested axis. |
|
Get Subtraction of dataframe and other, element-wise (binary operator sub). |
|
Return the sum of the values over the requested axis. |
|
Last n rows of the dataset |
|
Create Dask Bag from a Dask DataFrame |
|
Store Dask DataFrame to CSV files |
|
Convert a dask DataFrame to a dask array. |
|
Convert into a list of dask.delayed objects, one per partition. |
|
Store Dask Dataframe to Hierarchical Data Format (HDF) files |
|
Render a DataFrame as an HTML table. |
|
See dd.to_json docstring for more information |
|
Store Dask.dataframe to Parquet files |
|
Create Dask Array from a Dask Dataframe |
|
Render a DataFrame to a console-friendly tabular output. |
|
See dd.to_sql docstring for more information |
|
Cast to DatetimeIndex of timestamps, at beginning of period. |
|
Get Floating division of dataframe and other, element-wise (binary operator truediv). |
Return a dask.array of the values of this dataframe |
|
|
Return unbiased variance over requested axis. |
|
Render the computation of this object's task graph using graphviz. |
|
Series
|
Parallel Pandas Series |
|
Return Addition of series and other, element-wise (binary operator add). |
|
Align two objects on their axes with the specified join method. |
|
Return whether all elements are True, potentially over an axis. |
|
Return whether any element is True, potentially over an axis. |
|
Concatenate two or more Series. |
|
Parallel version of pandas.Series.apply |
|
Cast a pandas object to a specified dtype |
|
Compute the lag-N autocorrelation. |
|
Return boolean Series equivalent to left <= series <= right. |
|
|
Forget division information |
|
|
|
|
|
|
|
|
Compute this dask collection |
|
Make a copy of the dataframe |
|
Compute correlation with other Series, excluding missing values. |
|
Return number of non-NA/null observations in the Series. |
|
Compute covariance with Series, excluding missing values. |
|
Return cumulative maximum over a DataFrame or Series axis. |
|
Return cumulative minimum over a DataFrame or Series axis. |
|
Return cumulative product over a DataFrame or Series axis. |
|
Return cumulative sum over a DataFrame or Series axis. |
|
Generate descriptive statistics. |
|
First discrete difference of element. |
|
Return Floating division of series and other, element-wise (binary operator truediv). |
|
Return DataFrame with duplicate rows removed. |
Return a new Series with missing values removed. |
|
Namespace of datetime methods |
|
Return data type |
|
|
Return Equal to of series and other, element-wise (binary operator eq). |
Transform each element of a list-like to a row. |
|
|
|
|
Fill NA/NaN values using the specified method. |
|
Select initial periods of time series data based on a date offset. |
|
Return Integer division of series and other, element-wise (binary operator floordiv). |
|
Return Greater than or equal to of series and other, element-wise (binary operator ge). |
Get a dask DataFrame/Series representing the nth partition. |
|
|
Group Series using a mapper or by a Series of columns. |
|
Return Greater than of series and other, element-wise (binary operator gt). |
|
First n rows of the dataset |
|
Return index of first occurrence of maximum over requested axis. |
|
Return index of first occurrence of minimum over requested axis. |
|
Whether elements in Series are contained in values. |
Detect missing values. |
|
Detect missing values. |
|
Lazily iterate over (index, value) tuples. |
|
Whether divisions are already known |
|
|
Select final periods of time series data based on a date offset. |
|
Return Less than or equal to of series and other, element-wise (binary operator le). |
Purely label-location based indexer for selection by label. |
|
|
Return Less than of series and other, element-wise (binary operator lt). |
|
Map values of Series according to input correspondence. |
|
Apply a function to each partition, sharing rows with adjacent partitions. |
|
Apply Python function on each DataFrame partition. |
|
|
|
Return the maximum of the values over the requested axis. |
|
Return the mean of the values over the requested axis. |
|
Return the memory usage of the Series. |
|
Return the memory usage of each partition |
|
Return the minimum of the values over the requested axis. |
|
Return Modulo of series and other, element-wise (binary operator mod). |
|
Return Multiplication of series and other, element-wise (binary operator mul). |
Number of bytes |
|
Return dimensionality |
|
|
Return Not equal to of series and other, element-wise (binary operator ne). |
|
Return the largest n elements. |
Detect existing (non-missing) values. |
|
|
Return the smallest n elements. |
|
Return number of unique elements in the object. |
|
Approximate number of unique rows. |
|
Persist this dask collection into memory |
|
Apply func(self, *args, **kwargs). |
|
Return Exponential power of series and other, element-wise (binary operator pow). |
|
Return the product of the values over the requested axis. |
|
Approximate quantiles of Series |
|
Return Addition of series and other, element-wise (binary operator radd). |
|
Pseudorandomly split dataframe into different pieces row-wise |
|
Return Floating division of series and other, element-wise (binary operator rtruediv). |
|
Generic row-wise reductions. |
|
Repartition dataframe along new divisions |
|
Replace values given in to_replace with value. |
|
Alter Series index labels or name |
|
Resample time-series data. |
|
Reset the index to the default index. |
|
Provides rolling transformations. |
|
Round each value in a Series to the given number of decimals. |
|
Random sample of items |
|
Return unbiased standard error of the mean over requested axis. |
Return a tuple representing the dimensionality of a Series. |
|
|
Shift index by desired number of periods with an optional time freq. |
Size of the Series or DataFrame as a Delayed object. |
|
|
Return sample standard deviation over requested axis. |
Namespace for string methods |
|
|
Return Subtraction of series and other, element-wise (binary operator sub). |
|
Return the sum of the values over the requested axis. |
|
Create a Dask Bag from a Series |
|
Store Dask DataFrame to CSV files |
|
Convert a dask DataFrame to a dask array. |
|
Convert into a list of dask.delayed objects, one per partition. |
|
Convert Series to DataFrame. |
|
Store Dask Dataframe to Hierarchical Data Format (HDF) files |
|
Render a string representation of the Series. |
|
Cast to DatetimeIndex of timestamps, at beginning of period. |
|
Return Floating division of series and other, element-wise (binary operator truediv). |
|
Return Series of unique values in the object. |
|
Return a Series containing counts of unique values. |
Return a dask.array of the values of this dataframe |
|
|
Return unbiased variance over requested axis. |
|
Render the computation of this object's task graph using graphviz. |
|
Groupby Operations
DataFrame Groupby
|
Aggregate using one or more operations over the specified axis. |
|
Parallel version of pandas GroupBy.apply |
|
Compute count of group, excluding missing values. |
|
Number each item in each group from 0 to the length of that group - 1. |
|
Cumulative product for each group. |
|
Cumulative sum for each group. |
Construct DataFrame from group with provided name. |
|
|
Compute max of group values. |
|
Compute mean of groups, excluding missing values. |
|
Compute min of group values. |
|
Compute group sizes. |
|
Compute standard deviation of groups, excluding missing values. |
|
Compute sum of group values. |
|
Compute variance of groups, excluding missing values. |
|
Compute pairwise covariance of columns, excluding NA/null values. |
|
Compute pairwise correlation of columns, excluding NA/null values. |
|
Compute first of group values. |
|
Compute last of group values. |
|
Return index of first occurrence of minimum over requested axis. |
|
Return index of first occurrence of maximum over requested axis. |
Series Groupby
|
Aggregate using one or more operations over the specified axis. |
|
Parallel version of pandas GroupBy.apply |
|
Compute count of group, excluding missing values. |
|
Number each item in each group from 0 to the length of that group - 1. |
|
Cumulative product for each group. |
|
Cumulative sum for each group. |
Construct DataFrame from group with provided name. |
|
|
Compute max of group values. |
|
Compute mean of groups, excluding missing values. |
|
Compute min of group values. |
|
Return number of unique elements in the group. |
|
Compute group sizes. |
|
Compute standard deviation of groups, excluding missing values. |
|
Compute sum of group values. |
|
Compute variance of groups, excluding missing values. |
|
Compute first of group values. |
|
Compute last of group values. |
|
Return index of first occurrence of minimum over requested axis. |
|
Return index of first occurrence of maximum over requested axis. |
Custom Aggregation
|
User defined groupby-aggregation. |
Rolling Operations
|
Apply a function to each partition, sharing rows with adjacent partitions. |
|
Provides rolling transformations. |
|
Provides rolling transformations. |
|
Calculate the rolling custom aggregation function. |
Calculate the rolling count of non NaN observations. |
|
Calculate the rolling Fisher's definition of kurtosis without bias. |
|
Calculate the rolling maximum. |
|
Calculate the rolling mean. |
|
Calculate the rolling median. |
|
Calculate the rolling minimum. |
|
|
Calculate the rolling quantile. |
Calculate the rolling unbiased skewness. |
|
|
Calculate the rolling standard deviation. |
Calculate the rolling sum. |
|
|
Calculate the rolling variance. |
Create DataFrames
|
Read CSV files into a Dask.DataFrame |
|
Read delimited files into a Dask.DataFrame |
|
Read fixed-width files into a Dask.DataFrame |
|
Read a Parquet file into a Dask DataFrame |
|
Read HDF files into a Dask DataFrame |
|
Create a dataframe from a set of JSON files |
|
Read dataframe from ORC file(s) |
|
Create dataframe from an SQL table. |
|
Read any sliceable array into a Dask Dataframe |
|
Read BColz CTable into a Dask Dataframe |
|
Create a Dask DataFrame from a Dask Array. |
|
Create Dask DataFrame from many Dask Delayed objects |
|
Construct a Dask DataFrame from a Pandas DataFrame |
|
Create Dask Dataframe from a Dask Bag. |
Store DataFrames
|
Store Dask DataFrame to CSV files |
|
Store Dask.dataframe to Parquet files |
|
Store Dask Dataframe to Hierarchical Data Format (HDF) files |
|
Create Dask Array from a Dask Dataframe |
|
Store Dask Dataframe to a SQL table |
|
Write dataframe into JSON text files |
Convert DataFrames
|
Create Dask Bag from a Dask DataFrame |
|
Convert a dask DataFrame to a dask array. |
|
Convert into a list of dask.delayed objects, one per partition. |
Reshape DataFrames
|
Convert categorical variable into dummy/indicator variables. |
|
Create a spreadsheet-style pivot table as a DataFrame. |
|
Unpivots a DataFrame from wide format to long format, optionally leaving identifier variables set. |
Concatenate DataFrames
|
Merge the DataFrame with another DataFrame |
|
Concatenate DataFrames along rows. |
|
Merge DataFrame or named Series objects with a database-style join. |
|
Perform an asof merge. |
Resampling
|
Class for resampling timeseries data. |
|
Aggregate using one or more operations over the specified axis. |
Compute count of group, excluding missing values. |
|
Compute first of group values. |
|
Compute last of group values. |
|
Compute max of group values. |
|
Compute mean of groups, excluding missing values. |
|
Compute median of groups, excluding missing values. |
|
Compute min of group values. |
|
Return number of unique elements in the group. |
|
Compute open, high, low and close values of a group, excluding missing values. |
|
Compute prod of group values. |
|
Return value at the given quantile. |
|
Compute standard error of the mean of groups, excluding missing values. |
|
Compute group sizes. |
|
Compute standard deviation of groups, excluding missing values. |
|
Compute sum of group values. |
|
Compute variance of groups, excluding missing values. |
Dask Metadata
|
This method creates meta-data based on the type of the input object. |
Other functions
|
Compute several dask collections at once. |
|
Apply Python function on each DataFrame partition. |
Convert argument to datetime. |
|
|
Convert argument to a numeric type. |