Reindexing changes the row labels and column labels of a DataFrame. To reindex is to conform the data to a given set of labels along a particular axis.
Reindexing can accomplish several operations, such as:
Reorder the existing data to match a new set of labels.
Insert missing value (NA) markers at label positions where no data exists for the label.
import pandas as pd
import numpy as np

N = 20

df = pd.DataFrame({
    'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
    'x': np.linspace(0, stop=N-1, num=N),
    'y': np.random.rand(N),
    'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
    'D': np.random.normal(100, 10, size=(N)).tolist()
})

# Reindex the DataFrame: keep rows 0, 2 and 5 and columns A, C and B.
# Column 'B' does not exist in df, so it is filled with NaN.
df_reindexed = df.reindex(index=[0, 2, 5], columns=['A', 'C', 'B'])

print(df_reindexed)
Running Results:
           A     C   B
0 2016-01-01   Low NaN
2 2016-01-03  High NaN
5 2016-01-06   Low NaN
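Because the example above uses random data, its output changes on every run. The following is a minimal sketch with fixed values that shows the two effects of reindexing described earlier: reordering existing labels and inserting NaN where a label does not exist. The data and the names df and out are illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Small DataFrame with fixed values so the result is reproducible.
df = pd.DataFrame({'A': [1, 2, 3], 'C': ['Low', 'High', 'Medium']})

# Row labels 2 and 0 exist and are reordered; row label 7 does not exist.
# Column 'C' exists; column 'B' does not.
out = df.reindex(index=[2, 0, 7], columns=['C', 'B'])

print(out)
# Row 7 and column 'B' contain only NaN, because no data exists for them.
```

Note that reindex() returns a new object; df itself is left unchanged.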
You may want to take an object and reindex its axes so that they are labeled the same as another object. The following example illustrates this.
import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(7, 3), columns=['col1', 'col2', 'col3'])

# Give df1 the same row and column labels as df2
df1 = df1.reindex_like(df2)
print(df1)
Running Results:
       col1      col2      col3
0 -2.467652 -1.211687 -0.391761
1 -0.287396  0.522350  0.562512
2 -0.255409 -0.483250  1.866258
3 -1.150467 -0.646493 -0.222462
4  0.152768 -2.056643  1.877233
5 -1.155997  1.528719 -1.343719
6 -1.015606 -1.245936 -0.295275
Here, the df1 DataFrame is altered and reindexed like df2, so only the seven rows labeled 0 to 6 remain. The column names must match; otherwise NaN is added for the entire column.
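To make the column-matching rule concrete, here is a small sketch with fixed values. The frames left and template and the column name other are hypothetical, chosen only for illustration.

```python
import pandas as pd

left = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})

# The template shares 'col1' with left but has a column, 'other', that left lacks.
template = pd.DataFrame({'col1': [0, 0], 'other': [0, 0]})

aligned = left.reindex_like(template)
print(aligned)
# 'col1' keeps left's values for rows 0 and 1, 'other' is entirely NaN,
# and 'col2' is dropped because the template has no such column.
```

Only the labels of the template are used; its values play no role in the result.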
reindex() takes an optional parameter, method, which specifies a filling method with the following values:
pad/ffill − Fill values forward (propagate the last valid value)
bfill/backfill − Fill values backward (use the next valid value)
nearest − Fill from the nearest index value
import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(2, 3), columns=['col1', 'col2', 'col3'])

# Without a fill method, the missing rows are padded with NaN
print(df2.reindex_like(df1))

# Now fill the NaN rows with the preceding values
print("DataFrame with forward fill:")
print(df2.reindex_like(df1, method='ffill'))
Running Results:
       col1      col2      col3
0  1.311620 -0.707176  0.599863
1 -0.423455 -0.700265  1.133371
2       NaN       NaN       NaN
3       NaN       NaN       NaN
4       NaN       NaN       NaN
5       NaN       NaN       NaN
DataFrame with forward fill:
       col1      col2      col3
0  1.311620 -0.707176  0.599863
1 -0.423455 -0.700265  1.133371
2 -0.423455 -0.700265  1.133371
3 -0.423455 -0.700265  1.133371
4 -0.423455 -0.700265  1.133371
5 -0.423455 -0.700265  1.133371
Note that the last four rows (2 through 5) are filled with the values from row 1.
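The three fill methods are easier to compare side by side on fixed data. The sketch below passes method directly to reindex() on a Series; the values, the index labels, and the variable names are assumptions made for illustration. The original index must be monotonically increasing or decreasing for method to work.

```python
import pandas as pd

# Known values at index labels 0, 5 and 10.
s = pd.Series([1.0, 2.0, 3.0], index=[0, 5, 10])
new_index = [0, 2, 4, 6, 10, 12]

# ffill: take the value of the last original label at or before each new label.
ffilled = s.reindex(new_index, method='ffill')    # 1, 1, 1, 2, 3, 3

# bfill: take the value of the next original label at or after each new label.
bfilled = s.reindex(new_index, method='bfill')    # 1, 2, 2, 3, 3, NaN

# nearest: take the value of the closest original label.
nearest = s.reindex(new_index, method='nearest')  # 1, 1, 2, 2, 3, 3

print(pd.DataFrame({'ffill': ffilled, 'bfill': bfilled, 'nearest': nearest}))
```

Label 12 stays NaN under bfill because no original label follows it.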
The limit parameter provides additional control over filling while reindexing. It specifies the maximum number of consecutive elements to fill. Consider the following example:
import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
df2 = pd.DataFrame(np.random.randn(2, 3), columns=['col1', 'col2', 'col3'])

# Without a fill method, the missing rows are padded with NaN
print(df2.reindex_like(df1))

# Fill the NaN rows with the preceding values, at most 1 consecutive row
print("DataFrame with forward fill limited to 1:")
print(df2.reindex_like(df1, method='ffill', limit=1))
Running Results:
       col1      col2      col3
0  0.247784  2.128727  0.702576
1 -0.055713 -0.021732 -0.174577
2       NaN       NaN       NaN
3       NaN       NaN       NaN
4       NaN       NaN       NaN
5       NaN       NaN       NaN
DataFrame with forward fill limited to 1:
       col1      col2      col3
0  0.247784  2.128727  0.702576
1 -0.055713 -0.021732 -0.174577
2 -0.055713 -0.021732 -0.174577
3       NaN       NaN       NaN
4       NaN       NaN       NaN
5       NaN       NaN       NaN
Note that only row 2 is filled, from the preceding row 1. The remaining rows (3 through 5) are left as NaN.
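The effect of limit can also be checked on fixed data. This is a minimal sketch that passes method and limit directly to reindex() on a Series; the values and the names unlimited and limited are illustrative assumptions.

```python
import pandas as pd

s = pd.Series([1.0, 2.0], index=[0, 1])
target = list(range(6))

# No limit: every missing label after 1 is filled from label 1.
unlimited = s.reindex(target, method='ffill')

# limit=2: at most two consecutive missing labels are filled.
limited = s.reindex(target, method='ffill', limit=2)

print(unlimited.tolist())  # [1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
print(limited.tolist())    # [1.0, 2.0, 2.0, 2.0, nan, nan]
```

Labels 2 and 3 are filled and labels 4 and 5 stay NaN, because the run of missing labels is longer than the limit.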
The rename() method lets you relabel an axis based on a mapping (a dictionary or a Series) or an arbitrary function.
Consider the following example:
import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(6, 3), columns=['col1', 'col2', 'col3'])
print(df1)

# Labels that are missing from a mapping (col3, rows 3 to 5) are left unchanged
print("After renaming rows and columns:")
print(df1.rename(columns={'col1': 'c1', 'col2': 'c2'},
                 index={0: 'apple', 1: 'banana', 2: 'durian'}))
Running Results:
       col1      col2      col3
0  0.486791  0.105759  1.540122
1 -0.990237  1.007885 -0.217896
2 -0.483855 -1.645027 -1.194113
3 -0.122316  0.566277 -0.366028
4 -0.231524 -0.721172 -0.112007
5  0.438810  0.000225  0.435479
After renaming rows and columns:
              c1        c2      col3
apple   0.486791  0.105759  1.540122
banana -0.990237  1.007885 -0.217896
durian -0.483855 -1.645027 -1.194113
3      -0.122316  0.566277 -0.366028
4      -0.231524 -0.721172 -0.112007
5       0.438810  0.000225  0.435479
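The example above relabels with dictionaries. Since rename() also accepts a function, the following sketch shows that variant; the data, the choice of str.upper, and the lambda that truncates row labels are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]}, index=['apple', 'banana'])

# The function is called once per label: column names are upper-cased,
# and row labels are cut down to their first three characters.
renamed = df.rename(columns=str.upper, index=lambda label: label[:3])

print(renamed)
print(list(df.columns))  # the original DataFrame keeps its labels
```

A function is convenient when every label follows the same rule, while a dictionary is clearer when only a few labels change.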