
Function Applications in Pandas

Pandas function application examples

To apply your own function or a function from another library to a Pandas object, you should understand three important methods. The appropriate method depends on whether you want to operate on the entire DataFrame, rows, columns, or elements.

Table function application: pipe()
Row or column function application: apply()
Element-level function application: applymap()
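As a quick orientation, the sketch below contrasts what each kind of method hands to your function. The data and the add_constant helper are illustrative assumptions, not part of the original tutorial, and Series.map() stands in for the element-level case.

```python
import pandas as pd

# Small deterministic frame (illustrative data, not from the tutorial)
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 20, 30]})

def add_constant(frame, value):
    """Hypothetical helper: add a constant to every cell of the frame."""
    return frame + value

# pipe(): the function receives the whole DataFrame
whole_table = df.pipe(add_constant, 2)

# apply(): the function receives one column (a Series) at a time
per_column = df.apply(lambda col: col.max())

# map() on a Series: the function receives one scalar value at a time
per_element = df["col1"].map(lambda x: x * 100)

print(whole_table)
print(per_column)
print(per_element)
```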

Table function application

Custom operations on a DataFrame can be performed by passing a function, together with any additional arguments it needs, to pipe(). The DataFrame itself is passed to the function as its first argument.

Adder function

For example, suppose we want to add the value 2 to every element of a DataFrame. The adder function adds two numeric values and returns the sum.

  def adder(ele1, ele2):
      return ele1 + ele2

We then apply the custom function to the DataFrame with pipe():

 df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
 df.pipe(adder, 2)

Let's take a look at the complete program:

 import pandas as pd
 import numpy as np

 def adder(ele1, ele2):
     return ele1 + ele2

 df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
 # pipe() calls adder(df, 2) and returns a new DataFrame; df itself is unchanged
 print(df.pipe(adder, 2))

Running Results:

        col1      col2      col3
 0  2.176704  2.219691  1.509360
 1  2.222378  2.422167  3.953921
 2  2.241096  1.135424  2.696432
 3  2.355763  0.376672  1.182570
 4  2.308743  2.714767  2.130288
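One reason to prefer pipe() over calling the function directly is that several steps can be chained left to right. The sketch below reuses adder and adds a hypothetical scale function (an assumption for illustration) to show that the chained form gives the same result as the nested call.

```python
import pandas as pd

# Deterministic data so the result is easy to check by hand
df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [3.0, 4.0]})

def adder(ele1, ele2):
    return ele1 + ele2

def scale(frame, factor=1.0):
    """Hypothetical second step: multiply every cell by factor."""
    return frame * factor

# Chained form: reads in the order the steps are executed
result = df.pipe(adder, 2).pipe(scale, factor=10)

# Equivalent nested form: reads inside-out
nested = scale(adder(df, 2), factor=10)

print(result)
```

Positional and keyword arguments given to pipe() after the function are forwarded to that function unchanged.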

Row or column function application

The apply() method applies a function along an axis of a DataFrame. Like the descriptive statistics methods, it takes an optional axis parameter. By default (axis=0), the function is applied to each column, and each column is passed to the function as an array-like Series.

Instance 1

 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
 # Column-wise mean (axis=0 is the default)
 print(df.apply(np.mean))

Running Results:

 col1 -0.288022
 col2 1.044839
 col3 -0.187009
 dtype: float64

By passing axis=1, the function is applied to each row instead.
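Because the examples in this section use random data, the effect of the axis parameter can be hard to verify by eye. Here is a minimal sketch with fixed values (chosen for illustration only) where both results can be checked by hand.

```python
import numpy as np
import pandas as pd

# Fixed values so the means are easy to verify
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [7, 8, 9]})

# Default axis=0: np.mean receives each column, giving one value per column
col_means = df.apply(np.mean)

# axis=1: np.mean receives each row, giving one value per row
row_means = df.apply(np.mean, axis=1)

print(col_means)  # indexed by column name: col1, col2, col3
print(row_means)  # indexed by row label: 0, 1, 2
```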

Instance 2

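 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
 # Row-wise mean: the function receives one row at a time
 print(df.apply(np.mean, axis=1))

Running Results:

The output is a Series of five row means, indexed 0 to 4, with dtype float64. The values differ on every run because the input data is random.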

Instance 3

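 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
 # Range (max minus min) of each column, computed with a lambda
 print(df.apply(lambda x: x.max() - x.min()))

Running Results:

The output is a Series with one non-negative value (the range) for each of col1, col2 and col3, with dtype float64. The values differ on every run because the input data is random.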
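To make the lambda-based range calculation checkable, the following sketch repeats it on fixed values (illustrative data, not from the tutorial) and also shows the row-wise variant.

```python
import pandas as pd

# Fixed values: col1 ranges from 1 to 5, col2 is constant
df = pd.DataFrame({"col1": [1, 5, 3], "col2": [10, 10, 10]})

# Column-wise range: the lambda receives each column as a Series
col_range = df.apply(lambda x: x.max() - x.min())

# Row-wise range: the lambda receives each row as a Series
row_range = df.apply(lambda x: x.max() - x.min(), axis=1)

print(col_range)
print(row_range)
```

A range computed this way can never be negative, which is a handy sanity check on the output.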

Element-wise Function Application

Not all functions can be vectorized, that is, written to take a whole NumPy array and return another array or a single value. For such cases, the applymap() method on DataFrame and the map() method on Series accept any Python function that takes a single value and returns a single value, and call it once per element.
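As a sketch of a function that works on one value at a time, the example below uses a hypothetical label function with a plain Python conditional, which would fail if handed a whole Series. It assumes that DataFrame.map() is available on pandas 2.1 and later (where applymap() is deprecated in its favor) and falls back to applymap() on older versions.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1.5, 2.5], "col2": [3.0, 4.0]})

def label(x):
    """Hypothetical scalar function: takes one value, returns one value."""
    return "high" if x > 2 else "low"

# Series.map() calls label once per element of the column
series_labels = df["col1"].map(label)

# Element-wise over the whole frame: DataFrame.map() on pandas >= 2.1,
# applymap() on older versions
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
frame_labels = elementwise(label)

print(series_labels)
print(frame_labels)
```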

Instance 1

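 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
 # Custom function: multiply each value in col1 by 100
 # map() returns a new Series; df itself is not modified
 print(df['col1'].map(lambda x: x * 100))

Running Results:

The output is a Series of five values indexed 0 to 4, each equal to 100 times the corresponding value in col1. The values differ on every run because the input data is random.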

Instance 2

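 import pandas as pd
 import numpy as np
 df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
 # Custom function: multiply every value in the DataFrame by 100
 # (on pandas 2.1 and later, applymap() is deprecated in favor of df.map())
 print(df.applymap(lambda x: x * 100))

Running Results:

The output is a DataFrame with 5 rows and 3 columns in which every value is 100 times the corresponding value in df. The values differ on every run because the input data is random.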
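Note that map() returns a new object and leaves the original DataFrame untouched, so the result has to be assigned back if you want to keep it. A minimal sketch with fixed values (illustrative data only):

```python
import pandas as pd

df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [3.0, 4.0]})

# The returned Series is discarded here, so df is unchanged
df["col1"].map(lambda x: x * 100)
unchanged = df["col1"].tolist()

# Assigning the result back stores the transformed column
df["col1"] = df["col1"].map(lambda x: x * 100)

print(unchanged)
print(df)
```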