
Basic Methods of Pandas

Pandas Basic Methods Examples

So far, we have covered the three Pandas data structures and how to create them. Because of its importance in real-time data processing, we will focus mainly on DataFrame objects and also discuss the other data structures.

Attribute/Method    Description
axes                Returns a list of the row axis labels.
dtype               Returns the dtype of the object.
empty               Returns True if the Series is empty.
ndim                Returns the number of dimensions of the underlying data, which is 1 by definition.
size                Returns the number of elements in the underlying data.
values              Returns the Series as an ndarray.
head()              Returns the first n rows.
tail()              Returns the last n rows.

Next, we create a Series and look at each of the attributes and methods listed above.

 import pandas as pd
 import numpy as np
 # Create a Series from 4 random numbers
 s = pd.Series(np.random.randn(4))
 print(s)

Running Result:

0    0.967853
1   -0.148368
2   -1.395906
3   -1.758394
dtype: float64

axes

Return the list of Series labels

 import pandas as pd
 import numpy as np
 # Create a Series from 4 random numbers
 s = pd.Series(np.random.randn(4))
 print("The axes are:")
 print(s.axes)

Running Result:

 The axes are:
 [RangeIndex(start=0, stop=4, step=1)]

The result above is a compact representation of the labels from 0 to 3 (i.e., [0, 1, 2, 3]).
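As a minimal sketch of how axes relates to the index, the example below uses a hypothetical Series with the string labels a, b and c (these labels are an assumption for illustration, not part of the examples above).

```python
import pandas as pd

# Hypothetical Series with explicit string labels (assumed for illustration)
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# For a Series, axes is a one-element list that holds the index
axes = s.axes
print(len(axes))                 # number of axes
print(list(axes[0]))             # the labels themselves
print(axes[0].equals(s.index))   # the single axis is the Series index
```

Because a Series has only one axis, s.axes[0] and s.index refer to the same labels.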

empty

Return a boolean value indicating whether the object is empty. True means the object is empty.

 import pandas as pd
 import numpy as np
 # Create a Series from 4 random numbers
 s = pd.Series(np.random.randn(4))
 print("Is the Object empty?")
 print(s.empty)

Running Result:

Is the Object empty?
False
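To make the meaning of "empty" concrete, here is a small sketch comparing a Series with no elements against one that holds a single NaN. The variable names are hypothetical.

```python
import pandas as pd
import numpy as np

# A Series with no elements at all
empty_series = pd.Series([], dtype="float64")

# A Series with one element, which happens to be NaN
nan_series = pd.Series([np.nan])

print(empty_series.empty)  # True: there are no elements
print(nan_series.empty)    # False: a NaN still counts as an element
```

In other words, empty checks the length of the object, not whether its values are missing.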

ndim

Return the number of dimensions of the object. By definition, a Series is a 1D data structure, so it returns 1.

 import pandas as pd
 import numpy as np
 # Create a Series from 4 random numbers
 s = pd.Series(np.random.randn(4))
 print(s)
 print("The dimensions of the object:")
 print(s.ndim)

Running Result:

0    0.175898
1    0.166197
2   -0.609712
3   -1.377000
dtype: float64
The dimensions of the object:
1

size

Return the size (length) of the Series.

 import pandas as pd
 import numpy as np
 # Create a Series from 2 random numbers
 s = pd.Series(np.random.randn(2))
 print(s)
 print("The size of the object:")
 print(s.size)

Running Result:

0   3.078058
1  -1.207803
dtype: float64
The size of the object:
2
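The following sketch, using made-up values, suggests how size differs from count(): size includes missing values, while count() skips them.

```python
import pandas as pd
import numpy as np

# Hypothetical Series that contains one missing value
s = pd.Series([1.0, np.nan, 3.0])

print(s.size)     # 3: every element is counted, including NaN
print(len(s))     # 3: for a Series, len() agrees with size
print(s.count())  # 2: count() ignores NaN
```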

values

Return Series data in array form

 import pandas as pd
 import numpy as np
 # Create a Series from 4 random numbers
 s = pd.Series(np.random.randn(4))
 print(s)
 print("The actual data series is:")
 print(s.values)

Running Result:

0   1.787373
1  -0.605159
2   0.180477
3  -0.140922
dtype: float64
The actual data series is:
[ 1.78737302 -0.60515881 0.18047664 -0.1409218 ]
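As a brief sketch with made-up values, the result of values can be inspected like any other NumPy array. The to_numpy() method shown alongside it is an alternative available in recent pandas versions and is not used elsewhere in this tutorial.

```python
import pandas as pd
import numpy as np

# Hypothetical Series of floats
s = pd.Series([1.5, 2.5, 3.5])

arr = s.values          # underlying data as a NumPy array
print(type(arr))
print(arr.dtype)

arr2 = s.to_numpy()     # alternative accessor in recent pandas versions
print(np.array_equal(arr, arr2))
```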

Head and Tail

To view a small sample from the start or end of a Series or DataFrame object, use the head() and tail() methods.

head() returns the first n rows (observe the index values). By default it returns 5 elements, but you can pass a custom number.

 import pandas as pd
 import numpy as np
 # Create a Series from 4 random numbers
 s = pd.Series(np.random.randn(4))
 print("The original series is:")
 print(s)
 print("The first two rows of the data series:")
 print(s.head(2))

Running Result:

The original series is:
0    0.720876
1   -0.765898
2    0.479221
3   -0.139547
dtype: float64
The first two rows of the data series:
0    0.720876
1   -0.765898
dtype: float64

tail() returns the last n rows (observe the index values). By default it returns 5 elements, but you can pass a custom number.

 import pandas as pd
 import numpy as np
 # Create a Series from 4 random numbers
 s = pd.Series(np.random.randn(4))
 print("The original series is:")
 print(s)
 print("The last two rows of the data series are:")
 print(s.tail(2))

Running Result:

The original series is:
0 -0.655091
1 -0.881407
2 -0.608592
3 -2.341413
dtype: float64
The last two rows of the data series are:
2 -0.608592
3 -2.341413
dtype: float64
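The sketch below, on a hypothetical Series holding the integers 0 to 9, illustrates the default of 5 rows and what happens with other values of n.

```python
import pandas as pd

# Hypothetical Series with the values 0..9
s = pd.Series(range(10))

print(len(s.head()))      # 5: the default number of rows
print(list(s.tail(3)))    # [7, 8, 9]
print(list(s.head(-2)))   # a negative n returns all rows except the last 2
print(len(s.head(50)))    # 10: asking for more rows than exist returns them all
```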

DataFrame Basic Functions

Now let's look at the basic functionality of DataFrame. The table below lists the important attributes and methods that provide it.

Attribute/Method    Description
T                   Transposes rows and columns.
axes                Returns a list containing the row axis labels and the column axis labels.
dtypes              Returns the dtype of each column in this object.
empty               True if the DataFrame is entirely empty (no items), that is, if any axis has length 0; otherwise False.
ndim                Number of axes (array dimensions).
shape               Returns a tuple representing the dimensionality of the DataFrame.
size                Number of elements in the DataFrame.
values              NumPy representation of the DataFrame.
head()              Returns the first n rows.
tail()              Returns the last n rows.

Next, let's create a DataFrame and see how each of the attributes and methods above works.

Example

 import pandas as pd
 import numpy as np
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Our data series is:")
 print(df)

Running Result:

Our data series is:
    Age   Name    Rating
0   25    Tom     4.23
1   26    James   3.24
2   25    Ricky   3.98
3   23    Vin     2.56
4   30    Steve   3.20
5   29    Smith   4.60
6   23    Jack    3.80

T (Transpose)

Returns the transpose of the DataFrame. Rows and columns will be swapped.

 import pandas as pd
 import numpy as np
  
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("The transpose of the data series is:")
 print(df.T)

Running Result:

The transpose of the data series is:
        0      1      2      3      4      5      6
Age     25     26     25     23     30     29     23
Name    Tom    James  Ricky  Vin    Steve  Smith  Jack
Rating  4.23   3.24   3.98   2.56   3.2    4.6    3.8
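As a small sketch of what the transpose does to the labels, the example below uses a reduced, hypothetical version of the data frame with three rows and two columns.

```python
import pandas as pd

# Reduced, hypothetical version of the tutorial data: 3 rows, 2 columns
df = pd.DataFrame({"Name": ["Tom", "James", "Ricky"], "Age": [25, 26, 25]})

t = df.T
print(df.shape, t.shape)   # the two dimensions are swapped
print(list(t.index))       # the old column labels become row labels
print(list(t.columns))     # the old row labels become column labels
```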

axes

Returns a list of row labels and column labels.

 import pandas as pd
 import numpy as np
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("The row labels and column labels are:")
 print(df.axes)

Running Result:

  The row labels and column labels are:
 [RangeIndex(start=0, stop=7, step=1), Index([u'Age', u'Name', u'Rating'],
 dtype='object')]

dtypes

Returns the data type of each column.

 import pandas as pd
 import numpy as np
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("The data types of each column are as follows:")
 print(df.dtypes)

Running Result:

The data types of each column are as follows:
Age int64
Name object
Rating float64
dtype: object
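The result of dtypes is itself a Series indexed by column name, so individual column types can be looked up by label. The sketch below assumes a hypothetical two-column numeric frame and uses the type-checking helpers in pd.api.types.

```python
import pandas as pd

# Hypothetical numeric frame (a subset of the tutorial columns)
df = pd.DataFrame({"Age": [25, 26], "Rating": [4.23, 3.24]})

col_types = df.dtypes            # a Series: column name -> dtype
print(list(col_types.index))     # the column names

# Look up one column's dtype by label and test what kind it is
print(pd.api.types.is_integer_dtype(col_types["Age"]))
print(pd.api.types.is_float_dtype(col_types["Rating"]))
```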

empty

Returns a boolean value indicating whether the object is empty; True means the object is empty.

 import pandas as pd
 import numpy as np
  
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
  
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Is the object empty?")
 print(df.empty)

Running Result:

 Is the object empty?
 False
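For contrast with the non-empty frame above, here is a minimal sketch of two edge cases, using hypothetical frames: one with column labels but no rows, and one whose only value is NaN.

```python
import pandas as pd
import numpy as np

# Columns are defined, but there are no rows
no_rows = pd.DataFrame(columns=["Name", "Age"])

# One row whose only value is NaN
only_nan = pd.DataFrame({"Age": [np.nan]})

print(no_rows.empty)    # True: one axis has length 0
print(only_nan.empty)   # False: a NaN still counts as an item
```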

ndim

Returns the number of dimensions of the object. By definition, a DataFrame is a 2D object.

 import pandas as pd
 import numpy as np
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Our object is:")
 print(df)
 print("The dimension of the object is:")
 print(df.ndim)

Running Result:

Our object is:
   Age   Name    Rating
0  25    Tom     4.23
1  26    James   3.24
2  25    Ricky   3.98
3  23    Vin     2.56
4  30    Steve   3.20
5  29    Smith   4.60
6  23    Jack    3.80
The dimension of the object is:
2

shape

Returns a tuple representing the dimensions of the DataFrame. The tuple is (a, b), where a is the number of rows and b is the number of columns.

 import pandas as pd
 import numpy as np
  
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
  
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Our object is:")
 print(df)
 print("The shape of the object is:")
 print(df.shape)

Running Result:

Our object is:
   Age   Name    Rating
0  25    Tom     4.23
1  26    James   3.24
2  25    Ricky   3.98
3  23    Vin     2.56
4  30    Steve   3.20
5  29    Smith   4.60
6  23    Jack    3.80
The shape of the object is:
(7, 3)

size

Returns the number of elements in the DataFrame.

 import pandas as pd
 import numpy as np
  
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
  
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Our object is:")
 print(df)
 print("The total number of elements in our object is:")
 print(df.size)

Running Result:

Our object is:
    Age   Name    Rating
0   25    Tom     4.23
1   26    James   3.24
2   25    Ricky   3.98
3   23    Vin     2.56
4   30    Steve   3.20
5   29    Smith   4.60
6   23    Jack    3.80
The total number of elements in our object is:
21
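The relationship between ndim, shape and size can be sketched as follows, on a reduced, hypothetical frame with three rows and two columns.

```python
import pandas as pd

# Reduced, hypothetical frame: 3 rows, 2 columns
df = pd.DataFrame({"Name": ["Tom", "James", "Ricky"], "Age": [25, 26, 25]})

rows, cols = df.shape
print(rows, cols)                   # 3 2
print(df.size == rows * cols)       # size is the product of the shape entries
print(df.ndim == len(df.shape))     # ndim is the length of the shape tuple
```

For the seven-row, three-column frame used in this section, the same arithmetic gives 7 * 3 = 21.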

values

Returns the actual data in the DataFrame as an ndarray.

 import pandas as pd
 import numpy as np
  
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
  
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Our object is:")
 print(df)
 print("The actual data in our data frame is:")
 print(df.values)

Running Result:

Our object is:
    Age   Name    Rating
0   25    Tom     4.23
1   26    James   3.24
2   25    Ricky   3.98
3   23    Vin     2.56
4   30    Steve   3.20
5   29    Smith   4.60
6   23    Jack    3.80
The actual data in our data frame is:
[[25 'Tom' 4.23]
 [26 'James' 3.24]
 [25 'Ricky' 3.98]
 [23 'Vin' 2.56]
 [30 'Steve' 3.2]
 [29 'Smith' 4.6]
 [23 'Jack' 3.8]]
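Because a NumPy array has a single dtype, the array returned by values has to use a type that can hold every column. The sketch below illustrates this with two hypothetical frames, one with mixed column types and one that is purely numeric.

```python
import pandas as pd
import numpy as np

# Hypothetical frames: one with mixed column types, one purely numeric
mixed = pd.DataFrame({"Name": ["Tom", "James"], "Age": [25, 26]})
numeric = pd.DataFrame({"Age": [25, 26], "Rating": [4.23, 3.24]})

# Strings and integers have no common numeric type, so the array is generic
print(mixed.values.dtype)

# Integers and floats are upcast to a common float type
print(numeric.values.dtype)
print(numeric.values.shape)
```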

Head & Tail

To view a small sample from the start or end of a DataFrame object, use the head() and tail() methods. head() returns the first n rows (observe the index values). By default it returns 5 rows, but you can pass a custom number.

 import pandas as pd
 import numpy as np
  
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]),
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Our data frame is:")
 print(df)
 print("The first two rows of the data frame are:")
 print(df.head(2))

Running Result:

Our data frame is:
    Age   Name    Rating
0   25    Tom     4.23
1   26    James   3.24
2   25    Ricky   3.98
3   23    Vin     2.56
4   30    Steve   3.20
5   29    Smith   4.60
6   23    Jack    3.80
The first two rows of the data frame are:
   Age   Name   Rating
0  25    Tom    4.23
1  26    James  3.24

tail() returns the last n rows (observe the index values). By default it returns 5 rows, but you can pass a custom number.

 import pandas as pd
 import numpy as np
 # Create Series dictionary
 d = {'Name':pd.Series(['Tom','James','Ricky','Vin','Steve','Smith','Jack']),
    'Age':pd.Series([25,26,25,23,30,29,23]), 
    'Rating':pd.Series([4.23,3.24,3.98,2.56,3.20,4.6,3.8])}
  
 # Create a DataFrame
 df = pd.DataFrame(d)
 print("Our data frame is:")
 print(df)
 print("The last two rows of the data frame are:")
 print(df.tail(2))

Running Result:

Our data frame is:
    Age   Name    Rating
0   25    Tom     4.23
1   26    James   3.24
2   25    Ricky   3.98
3   23    Vin     2.56
4   30    Steve   3.20
5   29    Smith   4.60
6   23    Jack    3.80
The last two rows of the data frame are:
    Age   Name    Rating
5   29    Smith    4.6
6   23    Jack     3.8
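As a final sketch, the example below uses a hypothetical one-column frame with the same seven ages to show the default row count and that the returned rows keep their original index labels.

```python
import pandas as pd

# Hypothetical single-column frame with the seven ages used above
df = pd.DataFrame({"Age": [25, 26, 25, 23, 30, 29, 23]})

print(len(df.head()))            # 5: the default number of rows
print(list(df.tail(2).index))    # [5, 6]: original index labels are preserved
print(list(df.head(2)["Age"]))   # [25, 26]
```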