There are two ways to sort in pandas: by label and by actual value.
Let's look at the following example.
    import pandas as pd
    import numpy as np

    # 10 rows of random values with a deliberately shuffled index and column order
    unsorted_df = pd.DataFrame(np.random.randn(10, 2),
                               index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                               columns=['col2', 'col1'])
    print(unsorted_df)
Running Result:
           col2      col1
    1 -2.063177  0.537527
    4  0.142932 -0.684884
    6  0.012667 -0.389340
    2 -0.548797  1.848743
    3 -1.044160  0.837381
    5  0.385605  1.300185
    9  1.031425 -1.002967
    8 -0.407374 -0.435142
    0  2.237453 -1.067139
    7 -1.445831 -1.701035
In unsorted_df, the labels and values are not sorted. Let's see how to sort them.
Using the sort_index() method, you can sort a DataFrame by passing the axis parameter and the sorting order. By default, the row labels are sorted in ascending order.
    import pandas as pd
    import numpy as np

    unsorted_df = pd.DataFrame(np.random.randn(10, 2),
                               index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                               columns=['col2', 'col1'])

    # Sort by row labels; ascending order is the default
    sorted_df = unsorted_df.sort_index()
    print(sorted_df)
Running Result:
           col2      col1
    0  0.668444 -0.061294
    1  0.584890 -0.291048
    2 -1.222295  0.166609
    3 -0.157915 -0.388659
    4  1.302561  0.851385
    5 -1.289321 -1.551250
    6 -0.202951  0.954300
    7 -0.581378  0.622958
    8 -1.699509  0.510373
    9  0.825697  0.374463
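Because the examples here draw random values, the printed numbers change on every run. The short sketch below uses a small hand-written DataFrame (the values and the name df are made up for illustration) to show that sort_index() returns a new, sorted DataFrame and leaves the original untouched.

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible
df = pd.DataFrame({'col2': [10, 20, 30, 40], 'col1': [1, 2, 3, 4]},
                  index=[2, 0, 3, 1])

sorted_df = df.sort_index()  # ascending row labels by default

print(list(sorted_df.index))  # [0, 1, 2, 3]
print(list(df.index))         # [2, 0, 3, 1] (the original is unchanged)
print(sorted_df.index.is_monotonic_increasing)  # True
```

Each row keeps its values; only the order of the rows changes.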
By passing a Boolean value to the ascending parameter, you can control the sort order. The following example sorts the row labels in descending order.
    import pandas as pd
    import numpy as np

    unsorted_df = pd.DataFrame(np.random.randn(10, 2),
                               index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                               columns=['col2', 'col1'])

    # ascending=False sorts the row labels from largest to smallest
    sorted_df = unsorted_df.sort_index(ascending=False)
    print(sorted_df)
Running Result:
           col2      col1
    9  0.825697  0.374463
    8 -1.699509  0.510373
    7 -0.581378  0.622958
    6 -0.202951  0.954300
    5 -1.289321 -1.551250
    4  1.302561  0.851385
    3 -0.157915 -0.388659
    2 -1.222295  0.166609
    1  0.584890 -0.291048
    0  0.668444 -0.061294
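To compare the two sort orders on the same data, it can help to seed the random generator. The sketch below assumes NumPy's default_rng with an arbitrarily chosen seed of 0; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(rng.standard_normal((10, 2)),
                  index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                  columns=['col2', 'col1'])

ascending_df = df.sort_index()
descending_df = df.sort_index(ascending=False)

print(list(descending_df.index))  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# With unique labels, the descending result is the ascending one reversed
print(descending_df.equals(ascending_df.iloc[::-1]))  # True
```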
The axis parameter selects which labels are sorted. By default, axis=0 sorts by row labels; passing axis=1 sorts by column labels instead, as in the following example.
    import pandas as pd
    import numpy as np

    unsorted_df = pd.DataFrame(np.random.randn(10, 2),
                               index=[1, 4, 6, 2, 3, 5, 9, 8, 0, 7],
                               columns=['col2', 'col1'])

    # axis=1 sorts the column labels; the row order is left as it is
    sorted_df = unsorted_df.sort_index(axis=1)
    print(sorted_df)
Running Result:
           col1      col2
    1 -0.291048  0.584890
    4  0.851385  1.302561
    6  0.954300 -0.202951
    2  0.166609 -1.222295
    3 -0.388659 -0.157915
    5 -1.551250 -1.289321
    9  0.374463  0.825697
    8  0.510373 -1.699509
    0 -0.061294  0.668444
    7  0.622958 -0.581378
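The axis and ascending parameters can be combined. The following sketch uses a small hand-written DataFrame (hypothetical values) to show column labels sorted in both directions while the row order stays as it was.

```python
import pandas as pd

# Hypothetical fixed data; columns start out in the order col2, col1
df = pd.DataFrame({'col2': [10, 20, 30], 'col1': [1, 2, 3]}, index=[2, 0, 1])

by_columns = df.sort_index(axis=1)                        # col1, col2
by_columns_desc = df.sort_index(axis=1, ascending=False)  # col2, col1

print(list(by_columns.columns))       # ['col1', 'col2']
print(list(by_columns_desc.columns))  # ['col2', 'col1']
print(list(by_columns.index))         # [2, 0, 1] (row order is untouched)
```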
Similar to index sorting, sort_values() sorts by value. It accepts a by parameter that names the DataFrame column whose values determine the sort order.
    import pandas as pd

    unsorted_df = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})

    # Sort the rows by the values in col1
    sorted_df = unsorted_df.sort_values(by='col1')
    print(sorted_df)
Running Result:
       col1  col2
    1     1     3
    2     1     2
    3     1     4
    0     2     1
Note that the col1 values are sorted, and the corresponding col2 values and row labels move together with col1. As a result, col2 and the row labels appear unsorted.
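The point that whole rows move together can be checked directly. The sketch below reuses the same small DataFrame; the variable names are illustrative, and reset_index(drop=True) is shown as one optional way to renumber the rows after sorting.

```python
import pandas as pd

df = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})
sorted_df = df.sort_values(by='col1')

# Every (col1, col2) pair from the original still exists after sorting
original_pairs = sorted(zip(df['col1'], df['col2']))
sorted_pairs = sorted(zip(sorted_df['col1'], sorted_df['col2']))
print(original_pairs == sorted_pairs)  # True

# Optional: discard the old row labels and renumber the rows from 0
renumbered = sorted_df.reset_index(drop=True)
print(list(renumbered.index))    # [0, 1, 2, 3]
print(list(renumbered['col1']))  # [1, 1, 1, 2]
```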
The by parameter also accepts a list of column names. Rows are sorted by the first column, and ties are broken by the next column in the list.
    import pandas as pd

    unsorted_df = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})

    # Sort by col1 first, then by col2 within equal col1 values
    sorted_df = unsorted_df.sort_values(by=['col1', 'col2'])
    print(sorted_df)
Running Result:
       col1  col2
    2     1     2
    1     1     3
    3     1     4
    0     2     1
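When sorting by several columns, the ascending parameter can also be given as a list with one Boolean per column. A minimal sketch on the same data, sorting col1 in ascending order and col2 in descending order within each col1 value:

```python
import pandas as pd

df = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})

# col1 ascending, then col2 descending within equal col1 values
sorted_df = df.sort_values(by=['col1', 'col2'], ascending=[True, False])

print(list(sorted_df.index))    # [3, 1, 2, 0]
print(list(sorted_df['col2']))  # [4, 3, 2, 1]
```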
sort_values() accepts a kind parameter that selects the sorting algorithm: mergesort, heapsort, or quicksort. Mergesort is the only stable algorithm of the three.
    import pandas as pd

    unsorted_df = pd.DataFrame({'col1': [2, 1, 1, 1], 'col2': [1, 3, 2, 4]})

    # kind selects the sorting algorithm
    sorted_df = unsorted_df.sort_values(by='col1', kind='mergesort')
    print(sorted_df)
Running Result:
       col1  col2
    1     1     3
    2     1     2
    3     1     4
    0     2     1
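The practical difference between the algorithms is stability: a stable sort such as mergesort keeps rows with equal keys in their original relative order, while quicksort and heapsort make no such promise. The sketch below illustrates this with made-up column names (priority and task are hypothetical).

```python
import pandas as pd

# Hypothetical task list; several rows share the same priority
df = pd.DataFrame({'priority': [2, 1, 2, 1, 1],
                   'task': ['a', 'b', 'c', 'd', 'e']})

# A stable sort keeps tied rows in the order they appeared originally
stable = df.sort_values(by='priority', kind='mergesort')

print(list(stable['task']))  # ['b', 'd', 'e', 'a', 'c']
print(list(stable.index))    # [1, 3, 4, 0, 2]
```

According to the pandas documentation, kind is only applied when sorting on a single column or label.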