Comparison Operators as Ufuncs
# In[1]
import numpy as np
x=np.array([1,2,3,4,5])
print('x < 3 :',x<3)
print('x > 3 :',x>3)
print('x <= 3 :',x<=3)
print('x >= 3 :',x>=3)
print('x != 3 :',x!=3)
print('x == 3 :',x==3)
# Out[1]
x < 3 : [ True True False False False]
x > 3 : [False False False True True]
x <= 3 : [ True True True False False]
x >= 3 : [False False True True True]
x != 3 : [ True True False True True]
x == 3 : [False False True False False]
- It is also possible to do an element-wise comparison of two arrays
# In[2]
(2**x)==(x**2)
# Out[2]
array([False,  True, False,  True, False])
- As in the case of arithmetic operators, the comparison operators are implemented as ufuncs in NumPy.
Operator | Equivalent ufunc |
---|---|
== | np.equal |
< | np.less |
> | np.greater |
!= | np.not_equal |
<= | np.less_equal |
>= | np.greater_equal |
- These will work on arrays of any size and shape.
# In[3]
rng=np.random.default_rng(seed=1701)
x=rng.integers(10,size=(3,4))
x
# Out[3]
array([[9, 4, 0, 3],
[8, 6, 3, 1],
[3, 7, 4, 0]])
# In[4]
x<6
# Out[4]
array([[False, True, True, True],
[False, False, True, True],
[ True, False, True, True]])
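The operator-to-ufunc correspondence in the table above can be checked directly. The following is a minimal sketch; it writes the 3x4 array out explicitly rather than relying on the random generator, and the variable names (`same_less` and so on) are illustrative only.

```python
import numpy as np

# Same values as the 3x4 example array, written out explicitly
x = np.array([[9, 4, 0, 3],
              [8, 6, 3, 1],
              [3, 7, 4, 0]])

# Each comparison operator gives the same result as its ufunc
same_less = np.array_equal(x < 6, np.less(x, 6))
same_equal = np.array_equal(x == 3, np.equal(x, 3))
same_ge = np.array_equal(x >= 4, np.greater_equal(x, 4))
print(same_less, same_equal, same_ge)  # True True True

# The Boolean result has the same shape as the input array
result_shape = (x < 6).shape
print(result_shape)  # (3, 4)
```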
Working with Boolean Arrays
Counting entries
- To count the number of `True` entries in a Boolean array, `np.count_nonzero` is useful
# In[5]
np.count_nonzero(x<6)
# Out[5]
8
- Another way to get at this information is to use `np.sum`; in this case, `False` is interpreted as 0, and `True` is interpreted as 1
# In[6]
np.sum(x<6)
# Out[6]
8
- This summation can also be done along rows or columns using the `axis` argument; with `axis=1`, it counts the matching entries in each row
# In[7]
np.sum(x<6,axis=1)
# Out[7]
array([3, 2, 3])
- If we are interested in quickly checking whether any or all of the values are `True`, we can use `np.any` or `np.all`
# In[8]
np.any(x>8) # are there any values greater than 8?
# Out[8]
True
# In[9]
np.all(x<10) # are all values less than 10?
# Out[9]
True
- `np.all` and `np.any` can also be used along particular axes.
# In[10]
np.all(x<8,axis=1)
# Out[10]
array([False, False,  True])
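The counting and checking functions above differ mainly in which axis they reduce over. As a small sketch on the same values (array written out explicitly; the names `per_row`, `per_col` and so on are illustrative), `axis=1` reduces within each row and `axis=0` within each column:

```python
import numpy as np

x = np.array([[9, 4, 0, 3],
              [8, 6, 3, 1],
              [3, 7, 4, 0]])
mask = x < 6

total = np.count_nonzero(mask)        # number of True entries overall
per_row = np.sum(mask, axis=1)        # one count per row
per_col = np.sum(mask, axis=0)        # one count per column
any_per_col = np.any(x > 8, axis=0)   # does each column contain a value > 8?
all_per_row = np.all(x < 8, axis=1)   # is every value in each row < 8?

print(total)        # 8
print(per_row)      # [3 2 3]
print(per_col)      # [1 1 3 3]
print(any_per_col)  # [ True False False False]
print(all_per_row)  # [False False  True]
```

The row and column counts both add up to the overall total, which is a quick sanity check when choosing the axis.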
Boolean Operators
Operator | Equivalent ufunc |
---|---|
& | np.bitwise_and |
^ | np.bitwise_xor |
\| (vertical bar) | np.bitwise_or |
~ | np.bitwise_not |
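The operators in this table are typically used to combine several comparisons into one Boolean array. The sketch below applies them to the example values (written out explicitly; the variable names are illustrative). Each comparison is wrapped in parentheses because `&`, `|` and `^` bind more tightly than the comparison operators.

```python
import numpy as np

x = np.array([[9, 4, 0, 3],
              [8, 6, 3, 1],
              [3, 7, 4, 0]])

both = (x > 2) & (x < 7)          # True where 2 < value < 7
either = (x < 2) | (x > 7)        # True where value < 2 or value > 7
neither = ~either                 # element-wise negation
exactly_one = (x > 2) ^ (x < 7)   # True where only one condition holds

print(np.count_nonzero(both))     # 6
print(np.count_nonzero(either))   # 5
print(np.count_nonzero(neither))  # 7

# The operators give the same results as the ufuncs in the table
same_and = np.array_equal(both, np.bitwise_and(x > 2, x < 7))
same_not = np.array_equal(neither, np.bitwise_not(either))
print(same_and, same_not)         # True True
```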
Boolean Arrays as Masks
# In[11]
x
# Out[11]
array([[9, 4, 0, 3],
[8, 6, 3, 1],
[3, 7, 4, 0]])
# In[12]
x<5
# Out[12]
array([[False, True, True, True],
[False, False, True, True],
[ True, False, True, True]])
# In[13]
x[x<5]
# Out[13]
array([4, 0, 3, 3, 1, 3, 4, 0])
- To select these values from the array, we can simply index on this Boolean array; this is known as a masking operation. The result is a one-dimensional array containing the values at the positions where the mask is `True`.
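A mask can be stored in a variable and built from more than one condition. The following is a minimal sketch on the same values (array written out explicitly; `mask`, `selected` and `middle` are illustrative names):

```python
import numpy as np

x = np.array([[9, 4, 0, 3],
              [8, 6, 3, 1],
              [3, 7, 4, 0]])

mask = x < 5
selected = x[mask]        # 1-D array of the values where mask is True
print(selected)           # [4 0 3 3 1 3 4 0]

# The number of selected values equals the number of True entries in the mask
print(selected.size == np.count_nonzero(mask))  # True

# A compound condition works the same way
middle = x[(x > 2) & (x < 7)]
print(middle)             # [4 3 6 3 3 4]
```

The selected values come back in row-by-row order, so the original two-dimensional layout is not preserved.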
Using the keywords and/or Versus the Operators &/|
- `and` and `or` operate on the object as a whole, while `&` and `|` operate on the elements within the object.
- When you use `and` and `or`, it is equivalent to asking Python to treat the object as a single Boolean entity.
- In Python, all nonzero integers will evaluate as `True`
# In[14]
bool(42), bool(0)
# Out[14]
(True, False)
# In[15]
print(bool(42 and 0))
print(bool(42 or 0))
# Out[15]
False
True
- When you use `&` and `|` on integers, the expression operates on the bitwise representation of the element, applying the `and` and the `or` to the individual bits making up the number.
# In[16]
bin(42)
# Out[16]
'0b101010'
# In[17]
bin(59)
# Out[17]
'0b111011'
# In[18]
print(bin(42 & 59))
print(bin(42 | 59))
# Out[18]
0b101010
0b111011
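The difference between the keywords and the operators on plain integers can be seen side by side. This is a small sketch using the same two numbers; the names `a` and `b` are illustrative. Note that `and` and `or` return one of their operands rather than a strict `True` or `False`, which is why the earlier example wraps them in `bool()`.

```python
a, b = 42, 59

# Keywords judge each operand as a whole and return one of the operands
print(a and b)  # 59 (a is truthy, so the result is b)
print(a or b)   # 42 (a is truthy, so the result is a)

# Operators combine the two numbers bit by bit
for label, value in [("a", a), ("b", b), ("a & b", a & b), ("a | b", a | b)]:
    print(f"{label:>5}: {value:06b}")
#     a: 101010
#     b: 111011
# a & b: 101010
# a | b: 111011
```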
- When you have an array of Boolean values in NumPy, this can be thought of as a string of bits where 1=`True` and 0=`False`, and `&` and `|` operate much as they do in the preceding examples.
# In[19]
A=np.array([1,0,1,0,1,0],dtype=bool)
B=np.array([1,1,1,0,1,1],dtype=bool)
A|B
# Out[19]
array([ True, True, True, False, True, True])
- But if you use `or` on these arrays, Python will try to evaluate the truth or falsehood of the entire array object, which is not a well-defined value, so NumPy raises a ValueError
- Similarly, when evaluating a Boolean expression on a given array, you should use `|` or `&` rather than `or` or `and`
# In[20]
x=np.arange(10)
(x>4)&(x<8)
# Out[20]
array([False, False, False, False, False, True, True, True, False, False])
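The failure described above is easy to reproduce. The sketch below uses the same `A` and `B` arrays and catches the error so that the script keeps running; the variable names `message` and `elementwise` are illustrative.

```python
import numpy as np

A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)

message = None
try:
    A or B  # asks for the truth value of the whole array A
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

# The element-wise operator works as intended
elementwise = A | B
print(elementwise)  # [ True  True  True False  True  True]
```

The same error appears when `and` is used between two comparisons, such as `(x>4) and (x<8)`, which is why the element-wise form `(x>4)&(x<8)` is used instead.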
- To summarize, `and` and `or` perform a single Boolean evaluation on an entire object, while `&` and `|` perform multiple Boolean evaluations on the content (the individual bits or bytes) of an object. For Boolean NumPy arrays, the latter is nearly always the desired operation.