
Iteration in Pandas

The behavior of basic iteration (a for loop) over a pandas object depends on its type. Iterating over a Series behaves like iterating over an array and yields its values. Other data structures, such as DataFrame and Panel, follow a dict-like convention and iterate over the keys of the object. (Panel has been removed from recent pandas releases.)

In short, basic iteration (for i in object) produces −

Series − values
DataFrame − column labels
Panel − item labels
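
As a minimal sketch of the difference described above, the snippet below loops over a small Series and a small DataFrame. The sample data and the names s, df, series_items and frame_items are made up for illustration.

```python
import pandas as pd

# Illustrative data, not taken from the tutorial
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Looping over a Series yields its values
series_items = [value for value in s]

# Looping over a DataFrame yields its column labels
frame_items = [label for label in df]

print(series_items)
print(frame_items)
```

Note that the index labels of the Series ('a', 'b', 'c') do not appear when looping over it directly.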

DataFrame Iteration

Iterating over a DataFrame gives column names. Let's see the following example.

 import pandas as pd
 import numpy as np

 N = 20
 df = pd.DataFrame({
     'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
     'x': np.linspace(0, stop=N-1, num=N),
     'y': np.random.rand(N),
     'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
     'D': np.random.normal(100, 10, size=(N)).tolist()
     })

 # Iterating over a DataFrame yields its column labels
 for col in df:
     print(col)

The output is as follows (recent pandas versions keep the columns in insertion order; older versions sorted them alphabetically)

 A
 x
 y
 C
 D

To iterate over the columns or rows of a DataFrame, we can use the following functions −

iteritems() − iterate over (key, value) pairs
iterrows() − iterate over rows as (index, Series) pairs
itertuples() − iterate over rows as namedtuples

iteritems()

Iterates over the columns, yielding each column label as the key and the column's values as a Series object. In pandas 2.0 and later, iteritems() has been removed and items() is the equivalent method.

 import pandas as pd
 import numpy as np

 df = pd.DataFrame(np.random.randn(4,3), columns=['col1','col2','col3'])

 # items() replaces iteritems(), which was removed in pandas 2.0
 for key, value in df.items():
     print(key, value)

Running Result:

col1 0    0.802390
1    0.324060
2    0.256811
3    0.839186
Name: col1, dtype: float64
col2 0    1.624313
1   -1.033582
2    1.796663
3    1.856277
Name: col2, dtype: float64
col3 0   -0.022142
1   -0.230820
2    1.160691
3   -0.830279
Name: col3, dtype: float64

As the output shows, each column is yielded as a (label, Series) pair.
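
Because each pair carries the whole column as a Series, per-column work can be done inside the loop. The following is a small sketch under assumed sample data; the name col_max is hypothetical, and items() is used because it is the method available in current pandas.

```python
import pandas as pd

# Illustrative data, not taken from the tutorial
df = pd.DataFrame({'col1': [1, 5, 3], 'col2': [9, 2, 4]})

# Build a dict that maps each column label to that column's maximum
col_max = {label: int(values.max()) for label, values in df.items()}

print(col_max)
```

For a simple aggregate like this, calling df.max() directly is shorter; the loop form is mainly useful when each column needs different handling.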

iterrows()

iterrows() returns an iterator that yields each index value together with a Series containing that row's data.

 import pandas as pd
 import numpy as np

 df = pd.DataFrame(np.random.randn(4,3), columns = ['col1','col2','col3'])

 # Each iteration yields the row's index label and the row as a Series
 for row_index, row in df.iterrows():
     print(row_index, row)

Running Result:

0  col1    1.529759
   col2    0.762811
   col3   -0.634691
Name: 0, dtype: float64
1  col1   -0.944087
   col2    1.420919
   col3   -0.507895
Name: 1, dtype: float64
2  col1   -0.077287
   col2   -0.858556
   col3   -0.663385
Name: 2, dtype: float64
3  col1   -1.638578
   col2    0.059866
   col3    0.493482
Name: 3, dtype: float64

Because iterrows() returns each row as a Series, the data types of the individual columns are not preserved across the row. In the output, 0, 1, 2, 3 are the row index values and col1, col2, col3 are the column labels.
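
The loss of data types is easiest to see with columns of mixed types. In the sketch below, the column names n and ratio and the sample values are assumptions chosen for illustration: an integer column comes back as a float when read through iterrows(), while itertuples() leaves it as an integer.

```python
import pandas as pd

# Illustrative frame with one integer column and one float column
df = pd.DataFrame({'n': [1, 2], 'ratio': [0.5, 1.5]})
print(df.dtypes)

# iterrows() packs each row into one Series, so the integer is upcast to float
first_index, first_row = next(df.iterrows())
print(first_row.dtype)
print(type(first_row['n']))

# itertuples() keeps each value in its column's own type
first_tuple = next(df.itertuples())
print(type(first_tuple.n))
```

When the original types matter, itertuples() is usually the safer choice of the two.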

itertuples()

The itertuples() method returns an iterator that generates a named tuple for each row in the DataFrame. The first element of the tuple will be the corresponding index value of the row, and the rest will be the row values.

 import pandas as pd
 import numpy as np

 df = pd.DataFrame(np.random.randn(4,3), columns = ['col1','col2','col3'])

 # Each row is yielded as a namedtuple whose first field is the index
 for row in df.itertuples():
     print(row)

Running Result:

Pandas(Index=0, col1=1.5297586201375899, col2=0.76281127433814944, col3=-0.6346908238310438)
Pandas(Index=1, col1=-0.94408735763808649, col2=1.4209186418359423, col3=-0.50789517967096232)
Pandas(Index=2, col1=-0.07728664756791935, col2=-0.85855574139699076, col3=-0.6633852507207626)
Pandas(Index=3, col1=0.65734942534106289, col2=-0.95057710432604969, col3=0.80344487462316527)
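
The fields of each named tuple can be read by attribute, and itertuples() accepts index and name parameters that change what is yielded. The following is a brief sketch with made-up data; the names sums and plain are hypothetical.

```python
import pandas as pd

# Illustrative data, not taken from the tutorial
df = pd.DataFrame({'col1': [1.0, 2.0], 'col2': [3.0, 4.0]})

# Read fields of each named tuple by attribute
sums = [row.col1 + row.col2 for row in df.itertuples()]

# index=False drops the index field; name=None yields plain tuples
plain = list(df.itertuples(index=False, name=None))

print(sums)
print(plain)
```

Attribute access only works for column names that are valid Python identifiers; other columns have to be read by position.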

Note: Do not attempt to modify any object while iterating over it. Iteration is meant for reading. Depending on the data types, the iterator returns a copy of the original object rather than a view, so changes made to it will not be reflected in the original object.

 import pandas as pd
 import numpy as np

 df = pd.DataFrame(np.random.randn(4,3), columns = ['col1','col2','col3'])

 # row is a copy, so this assignment does not reach df
 for index, row in df.iterrows():
     row['a'] = 10
 print(df)

Running Result:

    col1       col2       col3
0  -1.739815   0.735595  -0.295589
1   0.635485   0.106803   1.527922
2  -0.939064   0.547095   0.038585
3  -1.016509  -0.116580  -0.523158

As the output shows, no changes are reflected in the original DataFrame.
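
If the goal is to actually change the data, one common approach is to assign through the DataFrame itself instead of through the objects an iterator hands out. This is a sketch under assumed sample data, not the only way to do it; the threshold 1.5 and the column values are arbitrary.

```python
import pandas as pd

# Illustrative data, not taken from the tutorial
df = pd.DataFrame({'col1': [1.0, 2.0, 3.0], 'col2': [4.0, 5.0, 6.0]})

# Add a new column 'a' holding 10 in every row, with no loop at all
df['a'] = 10

# Update selected rows of an existing column by label with .loc
df.loc[df['col1'] > 1.5, 'col2'] = 0.0

print(df)
```

Both assignments operate on df directly, so the changes persist, and they avoid modifying an object while it is being iterated.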