Pandas 13 - window functions (rolling, expanding)

Article directory

1, About window functions
2, Using window functions on DataFrame
   1,.rolling()
      Common usage
   2,.expanding()
   3,.ewm()

Reprinted and adapted from:

https://www.yiibai.com/pandas/python_pandas_window_functions.html

1, About window functions

Window functions are mainly used to reveal the trend in data by smoothing the curve, which is easiest to see when the result is plotted.

Pandas provides several variants: rolling, expanding, and exponentially weighted moving window statistics.

The available statistics include sum, mean, median, variance, covariance, correlation, and others.

Let's see how to apply each of these methods to a DataFrame object.

2, Using window functions on DataFrame

1,.rolling()

Method structure:

DataFrame.rolling(window, min_periods=None, center=False, win_type=None, 
                  on=None, axis=0, closed=None)

Parameter description

Parameter	Description
window	The size of the moving window, in one of two forms: 1) an int, the fixed number of observations in each window, counting backward from the current row; 2) an offset (a time period), which is more complex and used in fewer scenarios, so it is not covered in detail here.
min_periods	The minimum number of observations a window must contain; otherwise the result is NA. Int, default None. For an offset window the default is 1; for an int window it defaults to the window size.
center	Whether to set the window label at the center of the window. Boolean, default False, in which case the label is at the right edge.
win_type	The type of window function used to weight the observations. String, default None, meaning all observations are weighted equally.
on	Optional. For a DataFrame, the name of a datetime-like column to use instead of the index when placing the rolling window.
axis	Default 0, meaning the window moves down the rows, so each column is calculated separately.
closed	Defines which endpoints of the window interval are closed. The default is 'right' (left-open, right-closed); 'left', 'both', and 'neither' can be specified as appropriate. Recent pandas versions also support this for int windows.
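
The offset form of window, together with on and closed, can be sketched as follows. This is a minimal illustration only; the column names "ts" and "value" and the data are made up for the example and are not from the original article.

```python
import pandas as pd

# Hypothetical data: one reading per day, with a gap between Jan 3 and Jan 6.
df_time = pd.DataFrame({
    "ts": pd.to_datetime(["2020-01-01", "2020-01-02",
                          "2020-01-03", "2020-01-06"]),
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Offset window "2D": each window spans the 2 days ending at the row's timestamp.
# on="ts" places the windows using the "ts" column instead of the index.
# With the default closed="right", the left endpoint is excluded.
right_closed = df_time.rolling("2D", on="ts").sum()

# closed="both" also includes an observation lying exactly on the left endpoint.
both_closed = df_time.rolling("2D", on="ts", closed="both").sum()

print(right_closed)
print(both_closed)
```

The only row that differs between the two results is Jan 3, because Jan 1 lies exactly on the left endpoint of its window.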

rolling() can be applied to a Series or to a DataFrame. Specify the window=n parameter and then apply the appropriate statistical function to the result.

When the window size is 3 (window=3), the first two elements are NaN, and the value at position n is the average of elements n, n-1, and n-2. The other functions mentioned above can be applied in the same way.

import pandas as pd

df = pd.DataFrame([[1,3,5,7],[2,4,6,8],
                   [11,18,13,21],[12,34,28,76]],
                  columns = ['A', 'B', 'C', 'D'])

print(df.rolling(window=1).mean())
'''
      A     B     C     D
0   1.0   3.0   5.0   7.0
1   2.0   4.0   6.0   8.0
2  11.0  18.0  13.0  21.0
3  12.0  34.0  28.0  76.0
'''


print (df.rolling(window=2).mean())
'''
      A     B     C     D
0   NaN   NaN   NaN   NaN
1   1.5   3.5   5.5   7.5
2   6.5  11.0   9.5  14.5
3  11.5  26.0  20.5  48.5
'''

print (df.rolling(window=3).mean())
'''
          A          B          C     D
0       NaN        NaN        NaN   NaN
1       NaN        NaN        NaN   NaN
2  4.666667   8.333333   8.000000  12.0
3  8.333333  18.666667  15.666667  35.0
'''
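
The leading NaN values above follow from min_periods defaulting to the window size for an int window. The sketch below reuses the same DataFrame to show how min_periods and center change the result; the variable names partial and centered are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 3, 5, 7], [2, 4, 6, 8],
                   [11, 18, 13, 21], [12, 34, 28, 76]],
                  columns=['A', 'B', 'C', 'D'])

# min_periods=1: a value is produced as soon as one observation is available,
# so the first rows are averages over incomplete windows instead of NaN.
partial = df['A'].rolling(window=3, min_periods=1).mean()

# center=True: the label sits in the middle of the window, so row 1 is the
# mean of rows 0..2, and the first and last rows lack a full window.
centered = df['A'].rolling(window=3, center=True).mean()

print(partial)
print(centered)
```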

Common usage

The rolling() function supports many functions besides mean(), such as:
count() number of non-null observations
sum() sum of values
median() arithmetic median of values
min() minimum
max() maximum
std() Bessel-corrected sample standard deviation
var() unbiased variance
skew() sample skewness (third moment)
kurt() sample kurtosis (fourth moment)
quantile() sample quantile (value at a given percentile)
cov() unbiased covariance (binary)
corr() correlation (binary)
With the help of the agg() function, several aggregation functions can be applied at once, and the resulting columns can be renamed at the same time.

For reference: https://www.jianshu.com/p/b8c795345e93
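
A minimal sketch of agg() on a rolling window, using the same DataFrame as above. The output column names A_sum and A_mean are arbitrary names picked for this example.

```python
import pandas as pd

df = pd.DataFrame([[1, 3, 5, 7], [2, 4, 6, 8],
                   [11, 18, 13, 21], [12, 34, 28, 76]],
                  columns=['A', 'B', 'C', 'D'])

r = df.rolling(window=2)

# A list of function names applies every function to every column;
# the result has two-level columns such as ('A', 'sum') and ('A', 'mean').
multi = r.agg(['sum', 'mean'])

# A dict applies a different function to each selected column.
per_col = r.agg({'A': 'sum', 'B': 'mean'})

# For a single column, the result columns are named after the functions
# and can be renamed afterwards.
renamed = (df['A'].rolling(window=2)
           .agg(['sum', 'mean'])
           .rename(columns={'sum': 'A_sum', 'mean': 'A_mean'}))

print(multi)
print(per_col)
print(renamed)
```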

2,.expanding()

Method structure:

DataFrame.expanding(min_periods=1, center=False, axis=0)

The parameters of expanding() have the same meaning as the corresponding parameters of rolling().

rolling() fixes the window size and slides the window along the data. expanding() only sets the minimum number of observations and does not fix the window size, so the window keeps growing and the calculation is cumulative.

expanding() is similar to cumulative functions such as cumsum(), with the advantage that many more aggregations are available.

In fact, when rolling() is given window=len(df) together with min_periods=1, the effect is the same as expanding(). Without min_periods=1, rolling() requires a full window by default, so every row before the last would be NaN.

df = pd.DataFrame([[1,3,5,7],[2,4,6,8],
                   [11,18,13,21],[12,34,28,76]],
                  columns = ['A', 'B', 'C', 'D'])

print(df)
'''
    A   B   C   D
0   1   3   5   7
1   2   4   6   8
2  11  18  13  21
3  12  34  28  76
'''

print (df.expanding(min_periods=1).mean())
'''
          A          B     C     D
0  1.000000   3.000000   5.0   7.0
1  1.500000   3.500000   5.5   7.5
2  4.666667   8.333333   8.0  12.0
3  6.500000  14.750000  13.0  28.0
'''

print (df.expanding(min_periods=2).mean())
'''
          A          B     C     D
0       NaN        NaN   NaN   NaN
1  1.500000   3.500000   5.5   7.5
2  4.666667   8.333333   8.0  12.0
3  6.500000  14.750000  13.0  28.0
'''

print (df.expanding(min_periods=3).mean())
'''
          A          B     C     D
0       NaN        NaN   NaN   NaN
1       NaN        NaN   NaN   NaN
2  4.666667   8.333333   8.0  12.0
3  6.500000  14.750000  13.0  28.0
'''
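
The relationship between expanding(), rolling(), and cumsum() can be checked with a short sketch. It assumes the same DataFrame as above, and np.allclose is used because the results are floating-point values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 3, 5, 7], [2, 4, 6, 8],
                   [11, 18, 13, 21], [12, 34, 28, 76]],
                  columns=['A', 'B', 'C', 'D'])

expanding_mean = df.expanding(min_periods=1).mean()

# A window as long as the data, allowed to start with one observation,
# covers the same rows as the expanding window at every position.
rolling_mean = df.rolling(window=len(df), min_periods=1).mean()
same_as_rolling = np.allclose(expanding_mean, rolling_mean)

# With the default min_periods (the window size), only the last row is filled.
default_rolling = df.rolling(window=len(df)).mean()

# expanding().sum() matches cumsum(), apart from returning floats.
same_as_cumsum = np.allclose(df.expanding().sum(), df.cumsum())

print(same_as_rolling, same_as_cumsum)
print(default_rolling)
```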

3,.ewm()

ewm() can be applied to a Series or to a DataFrame. It computes exponentially weighted moving statistics and is needed in fewer scenarios than the other two.

Specify one of the com, span, or halflife parameters, and then apply the appropriate statistical function. The weights decay exponentially, so more recent observations count more.
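
The three parameters are different ways of expressing the same smoothing factor alpha. The sketch below uses the conversions given in the pandas documentation (alpha = 1/(1+com), alpha = 2/(span+1), alpha = 1 - exp(-ln 2 / halflife)) to show that equivalent settings produce the same result; the Series s is example data only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 11.0, 12.0])

# All four settings below correspond to alpha = 2/3.
by_com = s.ewm(com=0.5).mean()            # alpha = 1 / (1 + 0.5)
by_span = s.ewm(span=2).mean()            # alpha = 2 / (2 + 1)
by_alpha = s.ewm(alpha=2 / 3).mean()      # alpha given directly

# Solving 1 - exp(-ln(2) / halflife) = 2/3 gives halflife = ln(2) / ln(3).
by_halflife = s.ewm(halflife=np.log(2) / np.log(3)).mean()

print(by_com)
```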

df = pd.DataFrame([[1,3,5,7],[2,4,6,8],
                   [11,18,13,21],[12,34,28,76]],
                  columns = ['A', 'B', 'C', 'D'])

print (df)
'''
    A   B   C   D
0   1   3   5   7
1   2   4   6   8
2  11  18  13  21
3  12  34  28  76
'''

print (df.ewm(com=0.5).mean())
'''
           A          B          C          D
0   1.000000   3.000000   5.000000   7.000000
1   1.750000   3.750000   5.750000   7.750000
2   8.153846  13.615385  10.769231  16.923077
3  10.750000  27.375000  22.400000  56.800000
'''
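
To see where these numbers come from, column A can be recomputed by hand. This is a sketch assuming the default adjust=True behaviour, where the value at position t is a weighted average with weights (1 - alpha)^i for the observation i steps back; manual_ewm_mean is a helper written only for this illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 11, 12], dtype=float)   # column A from the example
com = 0.5
alpha = 1 / (1 + com)                        # 2/3

def manual_ewm_mean(values, alpha):
    """Weighted average at each position with exponentially decaying weights."""
    out = []
    for t in range(len(values)):
        # Exponents t, t-1, ..., 0: the oldest value gets the smallest weight.
        weights = (1 - alpha) ** np.arange(t, -1, -1)
        out.append(np.dot(weights, values[: t + 1]) / weights.sum())
    return np.array(out)

manual = manual_ewm_mean(s.to_numpy(), alpha)
builtin = s.ewm(com=com).mean()

print(manual)
print(builtin.to_numpy())
```

For example, the value 1.75 in row 1 is (2 + 1/3 * 1) / (1 + 1/3).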

devwalks

Posted by ingoruberg on Sun, 26 Jan 2020 23:32:24 -0800
