Iteration in Pandas

The behavior of basic iteration (a for loop) over a pandas object depends on its type. Iterating over a Series works like iterating over an array: it yields the values. Other data structures, such as DataFrame and Panel, follow a dict-like convention and iterate over the keys of the object. Note that Panel was removed in pandas 0.25 and applies only to older versions.

In short, basic iteration (for i in object) produces −

Series − values
DataFrame − column labels
Panel − item labels
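The mapping above can be checked with a short sketch. The sample data and the variable names (`s`, `df`, `series_items`, `frame_items`) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate basic iteration
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
df = pd.DataFrame({'col1': [1, 2], 'col2': [3.0, 4.0]})

# Iterating over a Series yields its values, like an array
series_items = [value for value in s]

# Iterating over a DataFrame yields its column labels, like dict keys
frame_items = [label for label in df]

print(series_items)
print(frame_items)
```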

DataFrame Iteration

Iterating over a DataFrame gives column names. Let's see the following example.

Example

    import pandas as pd
    import numpy as np

    N = 20
    df = pd.DataFrame({
        'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
        'x': np.linspace(0, stop=N-1, num=N),
        'y': np.random.rand(N),
        'C': np.random.choice(['Low', 'Medium', 'High'], N).tolist(),
        'D': np.random.normal(100, 10, size=N).tolist()
    })

    # Iterating over a DataFrame yields its column labels
    for col in df:
        print(col)

The output is as follows (current pandas keeps the columns in insertion order; older versions sorted them alphabetically)

    A
    x
    y
    C
    D

To iterate over the columns or rows of a DataFrame, we can use the following methods −

items() − iterate over columns as (label, Series) pairs; this method was called iteritems() before it was removed in pandas 2.0
iterrows() − iterate over rows as (index, Series) pairs
itertuples() − iterate over rows as namedtuples

items() (formerly iteritems())

Iterates over the columns, yielding one (key, value) pair per column: the key is the column label and the value is the column's data as a Series object.

Example

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

    # items() replaces iteritems(), which was removed in pandas 2.0
    for key, value in df.items():
        print(key, value)

Running Result:

    col1 0    0.802390
    1    0.324060
    2    0.256811
    3    0.839186
    Name: col1, dtype: float64
    col2 0    1.624313
    1   -1.033582
    2    1.796663
    3    1.856277
    Name: col2, dtype: float64
    col3 0   -0.022142
    1   -0.230820
    2    1.160691
    3   -0.830279
    Name: col3, dtype: float64

Each column is yielded separately as a pair of its label and a Series holding its values.

iterrows()

iterrows() returns an iterator that yields, for each row, the index value together with a Series containing the row's data.

Example

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

    # Each iteration yields the row's index label and the row as a Series
    for row_index, row in df.iterrows():
        print(row_index, row)

Running Result:

    0 col1    1.529759
    col2    0.762811
    col3   -0.634691
    Name: 0, dtype: float64
    1 col1   -0.944087
    col2    1.420919
    col3   -0.507895
    Name: 1, dtype: float64
    2 col1   -0.077287
    col2   -0.858556
    col3   -0.663385
    Name: 2, dtype: float64
    3 col1   -1.638578
    col2    0.059866
    col3    0.493482
    Name: 3, dtype: float64

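Because iterrows() returns each row as a Series, data types are not preserved across the row: values from columns of different types are converted to a common dtype. In the output above, 0, 1, 2 and 3 are the row index labels, and col1, col2 and col3 are the column labels.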
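The loss of data types is easiest to see with columns of different types. The following is a minimal sketch; the column names `qty` and `price` and their values are hypothetical.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column
df = pd.DataFrame({'qty': [1, 2], 'price': [0.5, 1.5]})

# iterrows() packs each row into a single Series, so the integer
# value is upcast to float to share one dtype with the float value
index, row = next(df.iterrows())
print(row.dtype)          # float64
print(type(row['qty']))

# itertuples() keeps each value as it is stored in its own column
first = next(df.itertuples())
print(type(first.qty))
```

When the original types matter, itertuples() is usually the safer choice for row-wise reading.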

itertuples()

The itertuples() method returns an iterator that generates a named tuple for each row in the DataFrame. The first element of the tuple will be the corresponding index value of the row, and the rest will be the row values.
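As a small sketch of how these namedtuples can be used, fields are reachable both by attribute name and by position. The frame below and its index labels are made up for illustration; `index` and `name` are parameters of `itertuples()` that control whether the index is included and whether plain tuples are returned.

```python
import pandas as pd

# Hypothetical frame with string index labels
df = pd.DataFrame({'col1': [1.0, 2.0], 'col2': [3.0, 4.0]},
                  index=['r0', 'r1'])

rows = list(df.itertuples())
for row in rows:
    # Index is the first field; columns follow in order
    print(row.Index, row.col1, row[2])

# index=False drops the index field, name=None yields plain tuples
plain = list(df.itertuples(index=False, name=None))
print(plain)
```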

Example

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

    # Each row is yielded as a namedtuple whose first field is the index
    for row in df.itertuples():
        print(row)

Running Result:

    Pandas(Index=0, col1=1.5297586201375899, col2=0.76281127433814944, col3=-0.6346908238310438)
    Pandas(Index=1, col1=-0.94408735763808649, col2=1.4209186418359423, col3=-0.50789517967096232)
    Pandas(Index=2, col1=-0.07728664756791935, col2=-0.85855574139699076, col3=-0.6633852507207626)
    Pandas(Index=3, col1=0.65734942534106289, col2=-0.95057710432604969, col3=0.80344487462316527)

Note: Do not modify an object while iterating over it. Iteration is meant for reading. Depending on the data types, the iterator returns a copy of the original data rather than a view, so changes made to it are not reflected in the original object.

Example

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

    # Assigning to the row changes only the temporary Series, not df
    for index, row in df.iterrows():
        row['a'] = 10
    print(df)

Running Result:

           col1      col2      col3
    0 -1.739815  0.735595 -0.295589
    1  0.635485  0.106803  1.527922
    2 -0.939064  0.547095  0.038585
    3 -1.016509 -0.116580 -0.523158

Note that the DataFrame is unchanged: no column 'a' was added.
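If the goal is to change the DataFrame, one common approach is to assign to the DataFrame itself instead of to the objects yielded during iteration. The following is a minimal sketch under that assumption; the column name `double` and the condition on `col1` are made up for illustration.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4, 3), columns=['col1', 'col2', 'col3'])

# Assign directly on the DataFrame: adds column 'a' to every row
df['a'] = 10

# Derive a new column from an existing one without an explicit loop
df['double'] = df['col1'] * 2

# Update selected rows in place through .loc (hypothetical condition)
df.loc[df['col1'] < 0, 'a'] = 0

print(df)
```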
