Pandas Notes and Traps
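When you use a pandas object in a boolean context, for example in an if or while statement or with the operators and, or, and not, Python tries to convert the object to a single bool. It is not clear what the result should be: True because the object is not empty, or False because it contains False values? Because the answer is ambiguous, pandas raises a ValueError exception.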
import pandas as pd

# Using a Series directly as a condition raises ValueError
if pd.Series([False, True, False]):
    print('I am True')
The results of the execution are as follows:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any(), or a.all().
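The if condition does not say what should be tested. As the error message suggests, make the test explicit, either by checking whether the object is None or by calling one of the listed methods, for example .any().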
import pandas as pd

# .any() reduces the Series to a single boolean value
if pd.Series([False, True, False]).any():
    print("I am any")
The results of the execution are as follows:
I am any
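The methods named in the error message each reduce the Series to one unambiguous value. The following is a minimal sketch of the difference between them; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([False, True, False])

# Using the Series itself as a condition is ambiguous and raises ValueError.
try:
    bool(s)
    raised = False
except ValueError:
    raised = True

has_any_true = bool(s.any())  # True if at least one element is True
all_true = bool(s.all())      # True only if every element is True
is_empty = s.empty            # True only if the Series has no elements

print(raised, has_any_true, all_true, is_empty)
```

Which method is right depends on what the condition is meant to express: "is there any data", "is any value true", or "are all values true".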
To evaluate a single-element pandas object in a boolean context, use the .bool() method:
import pandas as pd

# .bool() only works on an object holding exactly one boolean element
print(pd.Series([True]).bool())
The results of the execution are as follows:
True
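In recent pandas releases .bool() is marked as deprecated, so the following sketch shows .item() as one possible alternative for a single-element Series. It assumes the Series holds exactly one element; the names value and too_many are illustrative.

```python
import pandas as pd

# .item() returns the only element of a one-element Series as a Python scalar.
value = pd.Series([True]).item()
print(value)

# With more than one element there is no single value to return,
# so .item() raises ValueError.
try:
    pd.Series([True, False]).item()
    too_many = False
except ValueError:
    too_many = True
print(too_many)
```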
Comparison operators such as == and != work element by element and return a boolean Series, which is almost always what you want.
import pandas as pd

s = pd.Series(range(5))
# Element-wise comparison: one boolean per element
print(s == 4)
The results of the execution are as follows:
0    False
1    False
2    False
3    False
4     True
dtype: bool
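Since and, or, and not trigger the ambiguity error on a Series, element-wise conditions are normally combined with the operators &, |, and ~ instead. The following is a small sketch of that pattern; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series(range(5))

# Parentheses are required because & and | bind tighter than comparisons.
both = (s > 0) & (s < 4)      # element-wise AND
either = (s == 0) | (s == 4)  # element-wise OR
negated = ~(s == 4)           # element-wise NOT

# A boolean Series can be used as a mask to select matching elements.
print(s[both].tolist())
print(s[either].tolist())
print(s[negated].tolist())
```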
The isin() method returns a boolean Series showing whether each element of the Series is contained in the passed sequence of values.
import pandas as pd

s = pd.Series(list('abc'))
# True where the element appears in the given list
s = s.isin(['a', 'c', 'e'])
print(s)
The results of the execution are as follows:
0     True
1    False
2     True
dtype: bool
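The result of isin() is typically used as a mask to filter the original data. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

s = pd.Series(list('abc'))
mask = s.isin(['a', 'c', 'e'])

kept = s[mask].tolist()      # elements found in the list
dropped = s[~mask].tolist()  # elements not found in the list

print(kept, dropped)
```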
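Many users find themselves using the ix indexer as a concise way to select data from a pandas object. Note that ix was deprecated in pandas 0.20 and removed in pandas 1.0, so the ix examples here require an older version.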
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(6, 4),
                  columns=['one', 'two', 'three', 'four'],
                  index=list('abcdef'))

print(df)
# Select rows by label (.ix requires pandas < 1.0)
print(df.ix[['b', 'c', 'e']])
The results of the execution are as follows:
        one       two     three      four
a -1.582025  1.335773  0.961417 -1.272084
b  1.461512  0.111372 -0.072225  0.553058
c -1.240671  0.762185  1.511936 -0.630920
d -2.380648 -0.029981  0.196489  0.531714
e  1.846746  0.148149  0.275398 -0.244559
f -1.842662 -0.933195  2.303949  0.677641

        one       two     three      four
b  1.461512  0.111372 -0.072225  0.553058
c -1.240671  0.762185  1.511936 -0.630920
e  1.846746  0.148149  0.275398 -0.244559
Of course, in this case, this is completely equivalent to using the reindex method:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(6, 4),
                  columns=['one', 'two', 'three', 'four'],
                  index=list('abcdef'))

print(df)
# reindex selects rows by label
print(df.reindex(['b', 'c', 'e']))
The results of the execution are as follows:
        one       two     three      four
a  1.639081  1.369838  0.261287 -1.662003
b -0.173359  0.242447 -0.494384  0.346882
c -0.106411  0.623568  0.282401 -0.916361
d -1.078791 -0.612607 -0.897289 -1.146893
e  0.465215  1.552873 -1.841959  0.329404
f  0.966022 -0.190077  1.324247  0.678064

        one       two     three      four
b -0.173359  0.242447 -0.494384  0.346882
c -0.106411  0.623568  0.282401 -0.916361
e  0.465215  1.552873 -1.841959  0.329404
Based on this, someone might conclude that ix and reindex are 100% equivalent. That is true except in the case of integer indexing. For example, the operation above could also have been written with integer positions, but then the two calls give different results:
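import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(6, 4),
                  columns=['one', 'two', 'three', 'four'],
                  index=list('abcdef'))

print(df)
# With a non-integer index, ix treats integers as positions (pandas < 1.0)
print(df.ix[[1, 2, 4]])
# reindex treats integers as labels, which do not exist in this index
print(df.reindex([1, 2, 4]))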
The results of the execution are as follows:
        one       two     three      four
a -1.015695 -0.553847  1.106235 -0.784460
b -0.527398 -0.518198 -0.710546 -0.512036
c -0.842803 -1.050374  0.787146  0.205147
d -1.238016 -0.749554 -0.547470 -0.029045
e -0.056788  1.063999 -0.767220  0.212476
f  1.139714  0.036159  0.201912  0.710119

        one       two     three      four
b -0.527398 -0.518198 -0.710546 -0.512036
c -0.842803 -1.050374  0.787146  0.205147
e -0.056788  1.063999 -0.767220  0.212476

   one  two  three  four
1  NaN  NaN    NaN   NaN
2  NaN  NaN    NaN   NaN
4  NaN  NaN    NaN   NaN
It is important to remember that reindex is strictly label-based indexing. Labels that are not in the index produce rows of NaN rather than an error, and in pathological cases where an index contains both integers and strings this can lead to surprising results.
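On pandas versions where ix is no longer available, the same distinction can be written explicitly with loc (labels) and iloc (positions). The following sketch is not part of the original example: it uses np.arange instead of random data so the results are reproducible, and the mixed Series is a hypothetical case chosen to illustrate the integer-and-string pitfall.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24).reshape(6, 4),
                  columns=['one', 'two', 'three', 'four'],
                  index=list('abcdef'))

by_label = df.loc[['b', 'c', 'e']]  # label-based selection
by_position = df.iloc[[1, 2, 4]]    # position-based selection
reindexed = df.reindex([1, 2, 4])   # labels 1, 2, 4 do not exist: all NaN

print(by_label.equals(by_position))
print(reindexed.isna().all().all())

# Hypothetical index mixing strings and integers: the integers are labels,
# so reindex([0, 1]) does not return the first two elements.
mixed = pd.Series([10, 20, 30], index=['a', 1, 0])
print(mixed.reindex([0, 1]).tolist())  # by label
print(mixed.iloc[[0, 1]].tolist())     # by position
```

Choosing loc or iloc explicitly avoids having to guess whether an integer will be read as a label or as a position.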