Pandas function application example
To apply your own function or a function from another library to a Pandas object, you should understand three important methods. The appropriate method depends on whether you want to operate on the entire DataFrame, rows, columns, or elements.
Table function application: pipe()
Row or column function application: apply()
Element-level function application: applymap()
Custom operations on a whole DataFrame can be performed by passing a function, together with any additional arguments it needs, to pipe(). pipe() calls the function with the DataFrame as its first argument.
For example, to add the value 2 to every element of the DataFrame, define an adder function that adds two numeric values and returns the sum.
def adder(ele1, ele2):
    return ele1 + ele2
We now use the custom function to operate on the DataFrame.
df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
df.pipe(adder, 2)
Let's take a look at the complete program:
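import pandas as pd
import numpy as np

def adder(ele1, ele2):
    # When ele1 is a DataFrame, ele2 is added to every element
    return ele1 + ele2

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# pipe() calls adder(df, 2) and returns a new DataFrame; df itself is unchanged
print(df.pipe(adder, 2))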
Running Results:
       col1      col2      col3
0  2.176704  2.219691  1.509360
1  2.222378  2.422167  3.953921
2  2.241096  1.135424  2.696432
3  2.355763  0.376672  1.182570
4  2.308743  2.714767  2.130288
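Because pipe() returns whatever the function returns, calls can be chained. The following is a minimal sketch with fixed data instead of random numbers, so the result is easy to check by hand. The scale function and its factor parameter are hypothetical helpers introduced only for this illustration.

```python
import pandas as pd


def adder(ele1, ele2):
    # Adds ele2 to every element when ele1 is a DataFrame
    return ele1 + ele2


def scale(frame, factor=1):
    # Hypothetical second step, used only to show chaining
    return frame * factor


df = pd.DataFrame({"col1": [1, 2, 3], "col2": [10, 20, 30]})

# pipe() passes df as the first argument and forwards the remaining
# positional and keyword arguments to the function
result = df.pipe(adder, 2).pipe(scale, factor=10)

print(result)
print(df)  # the original DataFrame is left untouched
```

The chained call is equivalent to scale(adder(df, 2), factor=10), but reads left to right in the order the steps are applied.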
The apply() method applies any function along an axis of a DataFrame. Like the descriptive statistics methods, it takes an optional axis parameter. By default (axis=0) the function is applied column by column, and each column is passed to the function as an array-like Series.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Default axis=0: np.mean receives each column in turn
print(df.apply(np.mean))
Running Results:
col1   -0.288022
col2    1.044839
col3   -0.187009
dtype: float64
Passing axis=1 applies the function row by row instead.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Row-wise means: one value per row (computed here but not printed)
df.apply(np.mean, axis=1)
# Column-wise means of the same data
print(df.apply(np.mean))
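Running Results (the program prints the column-wise means; the row-wise call would return one value for each of the five rows):
col1    0.034093
col2   -0.152672
col3   -0.229728
dtype: float64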
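To make the effect of the axis parameter visible, here is a small sketch that uses fixed values rather than random data, so both results can be verified by hand. The variable names col_means and row_means are chosen for this illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "col1": [1.0, 2.0, 3.0],
        "col2": [4.0, 5.0, 6.0],
        "col3": [7.0, 8.0, 9.0],
    }
)

# axis=0 (the default): the function receives each column,
# so the result is indexed by column name
col_means = df.apply(np.mean)

# axis=1: the function receives each row,
# so the result is indexed by row label
row_means = df.apply(np.mean, axis=1)

print(col_means)  # col1 2.0, col2 5.0, col3 8.0
print(row_means)  # 0 4.0, 1 5.0, 2 6.0
```

The index of the result tells you which direction was used: column names for axis=0, row labels for axis=1.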
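apply() also accepts a lambda, for example one that computes the range (maximum minus minimum) of each column.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Range of each column; the result is computed but not stored or printed
df.apply(lambda x: x.max() - x.min())
# Prints the column means of the unchanged df
print(df.apply(np.mean))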
Running Results:
col1   -0.167413
col2   -0.370495
col3   -0.707631
dtype: float64
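The program above discards the value returned by the lambda and prints column means. As a sketch of what the range computation itself returns, the example below uses fixed integer data and stores the results in col_range and row_range (names chosen for this illustration).

```python
import pandas as pd

df = pd.DataFrame(
    {
        "col1": [1, 5, 3],
        "col2": [10, 2, 6],
        "col3": [0, 0, 9],
    }
)

# Range (max - min) of each column
col_range = df.apply(lambda x: x.max() - x.min())

# The same lambda applied to each row
row_range = df.apply(lambda x: x.max() - x.min(), axis=1)

print(col_range)  # col1 4, col2 8, col3 9
print(row_range)  # 0 10, 1 5, 2 6
```

A range is never negative, which is a quick way to tell a max-minus-min result apart from a mean of random data.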
Not all functions can be vectorized, that is, written to take a whole NumPy array and return another array or a single value. For such functions, the applymap() method on DataFrame and the map() method on Series accept any Python function that takes a single value and returns a single value. In pandas 2.1 and later, applymap() is deprecated in favor of DataFrame.map().
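import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Custom function applied to each element of one column;
# map() returns a new Series, which is not assigned back here
df['col1'].map(lambda x: x * 100)
# Prints the column means of the unchanged df
print(df.apply(np.mean))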
Running Results:
col1    0.480742
col2    0.454185
col3    0.266563
dtype: float64
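import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Custom function applied to every element of the DataFrame;
# applymap() returns a new DataFrame, which is not assigned back here
df.applymap(lambda x: x * 100)
# Prints the column means of the unchanged df
print(df.apply(np.mean))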
Running Results:
col1    0.395263
col2    0.204418
col3   -0.795188
dtype: float64
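Both programs above throw away the element-wise result, so their output shows only the means of the original data. The sketch below keeps the results and uses fixed integers so they can be checked by hand. The hasattr check is an assumption made so the example runs on both older pandas (applymap()) and pandas 2.1 or later (DataFrame.map()); the "high"/"low" labelling function is a made-up example of logic that works on one value at a time.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})

# Series.map(): apply a single-value function to each element of one column
scaled_col = df["col1"].map(lambda x: x * 100)

# Element-wise application over the whole DataFrame.
# Use DataFrame.map() where it exists, otherwise fall back to applymap().
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap

# A function with a Python conditional takes one value and returns one value
labels = elementwise(lambda x: "high" if x > 3 else "low")

print(scaled_col)
print(labels)
print(df)  # unchanged: both calls return new objects
```

To keep a result, assign it back, for example df["col1"] = scaled_col.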