
Window Functions in Pandas

Pandas window function operation examples

To work with numeric data, Pandas provides several kinds of window operations for computing statistics: rolling windows, expanding windows, and exponentially weighted windows. The statistics available include sum, mean, median, variance, covariance, correlation, and so on.
Below, we apply each of them in turn to DataFrame objects.

.rolling() function

This function can be applied to a Series or a DataFrame. Specify the window=n parameter to set the window size, then apply an appropriate statistical function to the result.

Example

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4),
                  index=pd.date_range('1/1/2000', periods=10),
                  columns=['A', 'B', 'C', 'D'])
print(df.rolling(window=3).mean())

The results are as follows:

                   A         B         C         D
2000-01-01       NaN       NaN       NaN       NaN
2000-01-02       NaN       NaN       NaN       NaN
2000-01-03  0.434553 -0.667940 -1.051718 -0.826452
2000-01-04  0.628267 -0.047040 -0.287467 -0.161110
2000-01-05  0.398233  0.003517  0.099126 -0.405565
2000-01-06  0.641798  0.656184 -0.322728  0.428015
2000-01-07  0.188403  0.010913 -0.708645  0.160932
2000-01-08  0.188043 -0.253039 -0.818125 -0.108485
2000-01-09  0.682819 -0.606846 -0.178411 -0.404127
2000-01-10  0.688583  0.127786  0.513832 -1.067156

Since the window size is 3, the first two rows are NaN. Starting from the third row, each value is the average of elements n, n-1, and n-2. The other statistical functions listed above can be applied in the same way.
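To make the window arithmetic concrete, here is a minimal sketch that checks a rolling mean against a hand-computed average. The small Series `s` and the variable names are illustrative assumptions, not part of the example above; fixed values are used instead of random numbers so the result can be verified by hand.

```python
import numpy as np
import pandas as pd

# Fixed values (an assumption for illustration) so the result is reproducible.
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Rolling mean with a window of 3: the first two entries have too few values.
rolled = s.rolling(window=3).mean()
print(rolled.tolist())  # [nan, nan, 2.0, 3.0, 4.0]

# Manual equivalent: average elements n-2, n-1 and n from the third element on.
manual = [np.nan, np.nan] + [
    float(s.iloc[i - 2:i + 1].mean()) for i in range(2, len(s))
]

# Other statistics use the same window, for example a rolling sum.
rolled_sum = s.rolling(window=3).sum()
print(rolled_sum.tolist())  # [nan, nan, 6.0, 9.0, 12.0]

# min_periods=1 lets partial windows at the start produce a value instead of NaN.
partial = s.rolling(window=3, min_periods=1).mean()
print(partial.tolist())  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

The min_periods argument is optional. By default, a fixed-size window needs a full window of values before it returns a result, which is why the first two rows are NaN.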

.expanding() function

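This function can be applied to a Series or a DataFrame. Unlike a rolling window, an expanding window always starts at the first row and grows to include every row up to the current one. Specify the min_periods=n parameter to set the minimum number of observations required before a result is produced, then apply an appropriate statistical function.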

Example

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4),
                  index=pd.date_range('1/1/2000', periods=10),
                  columns=['A', 'B', 'C', 'D'])
print(df.expanding(min_periods=3).mean())

The results are as follows:

                   A         B         C         D
2000-01-01       NaN       NaN       NaN       NaN
2000-01-02       NaN       NaN       NaN       NaN
2000-01-03  0.434553 -0.667940 -1.051718 -0.826452
2000-01-04  0.743328 -0.198015 -0.852462 -0.262547
2000-01-05  0.614776 -0.205649 -0.583641 -0.303254
2000-01-06  0.538175 -0.005878 -0.687223 -0.199219
2000-01-07  0.505503 -0.108475 -0.790826 -0.081056
2000-01-08  0.454751 -0.223420 -0.671572 -0.230215
2000-01-09  0.586390 -0.206201 -0.517619 -0.267521
2000-01-10  0.560427 -0.037597 -0.399429 -0.376886
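As a rough check on what an expanding window computes, the sketch below compares an expanding mean with a cumulative mean built from cumsum. The Series `s` and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Fixed values (an assumption for illustration) so the result is reproducible.
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Expanding mean: each row averages everything from the first row up to itself.
# min_periods=3 suppresses results until three observations are available.
expanded = s.expanding(min_periods=3).mean()
print(expanded.tolist())  # [nan, nan, 2.0, 2.5, 3.0]

# The same numbers as a running total divided by the number of rows seen so far.
cumulative = s.cumsum() / np.arange(1, len(s) + 1)
print(cumulative.tolist())  # [1.0, 1.5, 2.0, 2.5, 3.0]
```

From the third row onward the two results agree. The only difference is that min_periods masks the first two rows of the expanding result.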

.ewm() function

ewm can be applied to a Series or a DataFrame. Specify one of the com, span, or halflife parameters, then apply an appropriate statistical function to the result. The weights decay exponentially, so recent observations count more than older ones.

Example

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4),
                  index=pd.date_range('1/1/2000', periods=10),
                  columns=['A', 'B', 'C', 'D'])
print(df.ewm(com=0.5).mean())

The results are as follows:

                   A         B         C         D
2000-01-01  1.088512 -0.650942 -2.547450 -0.566858
2000-01-02  0.865131 -0.453626 -1.137961  0.058747
2000-01-03 -0.132245 -0.807671 -0.308308 -1.491002
2000-01-04  1.084036  0.555444 -0.272119  0.480111
2000-01-05  0.425682  0.025511  0.239162 -0.153290
2000-01-06  0.245094  0.671373 -0.725025  0.163310
2000-01-07  0.288030 -0.259337 -1.183515  0.473191
2000-01-08  0.162317 -0.771884 -0.285564 -0.692001
2000-01-09  1.147156 -0.302900  0.380851 -0.607976
2000-01-10  0.600216  0.885614  0.569808 -1.110113
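The sketch below illustrates how the exponential weights work, assuming the default adjust=True behavior of ewm. With com=0.5 the smoothing factor is alpha = 1 / (1 + com) = 2/3, and each result is a weighted average in which an observation i steps in the past gets weight (1 - alpha)**i. The helper `manual_ewm` and the Series `s` are hypothetical names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Fixed values (an assumption for illustration) so the result is reproducible.
s = pd.Series([1.0, 2.0, 3.0, 4.0])

com = 0.5
alpha = 1.0 / (1.0 + com)  # smoothing factor implied by com

ewm_mean = s.ewm(com=com).mean()

def manual_ewm(values, alpha):
    """Weighted average with weights (1 - alpha)**i for the value i steps back."""
    out = []
    for t in range(len(values)):
        # Exponents run t, t-1, ..., 0 so the newest value gets weight 1.
        weights = (1.0 - alpha) ** np.arange(t, -1, -1)
        out.append(float(np.dot(weights, values[: t + 1]) / weights.sum()))
    return out

manual = manual_ewm(s.to_numpy(), alpha)

print(ewm_mean.tolist())
print(manual)

# Passing alpha directly is equivalent to passing the matching com.
same = s.ewm(alpha=alpha).mean()
```

Note that, unlike the rolling and expanding examples, the first row already has a value: with one observation the weighted average is simply that observation.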

Window functions are mainly used to reveal trends in data graphically by smoothing curves. When there are many data points and the daily values fluctuate widely, one approach is to sample the data and plot the samples; another is to apply a window calculation and plot the result. Either way, the curve or trend is smoothed.
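A minimal sketch of the smoothing effect follows, under assumed synthetic data: a linear trend plus random noise, generated with a fixed seed. The window size of 20, the span of 20, and all variable names are illustrative choices rather than recommendations.

```python
import numpy as np
import pandas as pd

# Synthetic noisy daily data (an assumption for illustration): trend plus noise.
rng = np.random.default_rng(0)
index = pd.date_range('1/1/2000', periods=200)
raw = pd.Series(np.linspace(0, 10, 200) + rng.standard_normal(200), index=index)

# Two smoothed versions of the same series.
smooth_roll = raw.rolling(window=20).mean()
smooth_ewm = raw.ewm(span=20).mean()

summary = pd.DataFrame({
    'raw': raw,
    'rolling_20': smooth_roll,
    'ewm_span_20': smooth_ewm,
})

# Day-to-day jitter: standard deviation of the daily change in each column.
jitter = summary.diff().std()
print(jitter)
```

The smoothed columns change far less from day to day than the raw column, which is what makes the underlying trend easier to see. If matplotlib is installed, summary.plot() draws the three curves on one chart.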
