# In[1]
import numpy as np
import pandas as pd
rng = np.random.default_rng(42)
ser = pd.Series(rng.integers(0, 10, 4))
ser
# Out[1]
0 0
1 7
2 6
3 4
dtype: int64
# In[2]
df = pd.DataFrame(rng.integers(0, 10, (3, 4)), columns=['A', 'B', 'C', 'D'])
df
# Out[2]
A B C D
0 4 8 0 6
1 2 0 5 9
2 7 7 7 7
# In[3]
np.exp(ser)
# Out[3]
0 1.000000
1 1096.633158
2 403.428793
3 54.598150
dtype: float64
# In[4]
np.sin(df * np.pi / 4)
# Out[4]
A B C D
0 1.224647e-16 -2.449294e-16 0.000000 -1.000000
1 1.000000e+00 0.000000e+00 -0.707107 0.707107
2 -7.071068e-01 -7.071068e-01 -0.707107 -0.707107
# In[5]
area = pd.Series({'Alaska': 1723337, 'Texas': 695662, 'California': 423967}, name='area')
population = pd.Series({'California': 39538223, 'Texas': 29145505, 'Florida': 21538187}, name='population')
population / area
# Out[5]
Alaska NaN
California 93.257784
Florida NaN
Texas 41.896072
dtype: float64
# In[6]
area.index.union(population.index)
# Out[6]
Index(['Alaska', 'California', 'Florida', 'Texas'], dtype='object')
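Any item for which one or the other does not have an entry is marked with NaN, or "Not a Number," which is how Pandas marks missing data. This index matching applies to any of Python's built-in arithmetic expressions, and missing values are filled in with NaN by default.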
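The alignment step can also be made explicit. The following is a minimal sketch using two small hypothetical Series (x and y are illustrative names and values, not data from this section). It assumes the default outer join of Series.align, which reindexes both inputs to the union of their labels before any arithmetic happens.

```python
import pandas as pd

# Hypothetical toy data: labels 'b' and 'c' are shared, 'a' and 'd' are not.
x = pd.Series({'a': 1, 'b': 2, 'c': 3})
y = pd.Series({'b': 10, 'c': 20, 'd': 30})

# align() returns both Series reindexed to the union of their labels
# (join='outer' is the default); labels missing on one side become NaN.
x_aligned, y_aligned = x.align(y)

# Dividing the aligned Series gives the same result as dividing the originals,
# because the operator performs this alignment implicitly.
implicit = y / x
explicit = y_aligned / x_aligned
same_result = implicit.equals(explicit)
```

Here same_result should be True, and both results carry NaN at the labels that appear in only one input.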
# In[7]
A = pd.Series([2, 4, 6], index=[0, 1, 2])
B = pd.Series([1, 3, 5], index=[1, 2, 3])
A + B
# Out[7]
0 NaN
1 5.0
2 9.0
3 NaN
dtype: float64
If using NaN values is not the desired behavior, the fill value can be modified using appropriate object methods in place of the operators.
# In[8]
A.add(B, fill_value=0)
# Out[8]
0 2.0
1 5.0
2 9.0
3 5.0
dtype: float64
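One detail worth checking is what fill_value does when a value is missing on both sides. The sketch below uses two hypothetical Series (left and right are illustrative names, not objects from this section). It relies on the documented behavior of Series.add: the fill value replaces a missing entry only when the other operand has data at that label.

```python
import numpy as np
import pandas as pd

# Hypothetical data: label 1 is NaN in both Series,
# label 2 exists only in `left`, label 3 only in `right`.
left = pd.Series([2.0, np.nan, 6.0], index=[0, 1, 2])
right = pd.Series([1.0, np.nan, 5.0], index=[0, 1, 3])

# fill_value=0 stands in for a missing entry on ONE side only;
# where both sides are missing, the result stays NaN.
result = left.add(right, fill_value=0)
```

In this sketch, labels 2 and 3 get real values (the missing side is treated as 0), while label 1 remains NaN because neither Series has data there.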
# In[9]
A = pd.DataFrame(rng.integers(0, 20, (2, 2)), columns=['a', 'b'])
A
# Out[9]
a b
0 10 2
1 16 9
# In[10]
B = pd.DataFrame(rng.integers(0, 10, (3, 3)), columns=['b', 'a', 'c'])
B
# Out[10]
b a c
0 5 3 1
1 9 7 6
2 4 8 5
# In[11]
A + B
# Out[11]
a b c
0 13.0 7.0 NaN
1 23.0 18.0 NaN
2 NaN NaN NaN
As with Series, we can use the object's arithmetic methods and pass any desired fill_value to be used in place of missing entries.
# In[12]
A.add(B, fill_value=A.values.mean())  # A.values.mean() = 9.25
# Out[12]
a b c
0 13.00 7.00 10.25
1 23.00 18.00 15.25
2 17.25 13.25 14.25
Mapping between Python operators and Pandas methods

| Python operator | Pandas methods |
|---|---|
| `+` | `add` |
| `-` | `sub`, `subtract` |
| `*` | `mul`, `multiply` |
| `/` | `truediv`, `div`, `divide` |
| `//` | `floordiv` |
| `%` | `mod` |
| `**` | `pow` |
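The mapping in this table can be checked directly. The sketch below builds a small hypothetical DataFrame (its values are illustrative only) and compares each operator with one of the methods listed for it.

```python
import pandas as pd

# Hypothetical data, used only to compare operators with methods.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Each entry pairs the operator form with the equivalent method call.
pairs = {
    '+': (df + 2, df.add(2)),
    '-': (df - 2, df.sub(2)),
    '*': (df * 2, df.mul(2)),
    '/': (df / 2, df.truediv(2)),
    '//': (df // 2, df.floordiv(2)),
    '%': (df % 2, df.mod(2)),
    '**': (df ** 2, df.pow(2)),
}

# DataFrame.equals compares values, labels, and dtypes.
all_match = all(op_result.equals(method_result)
                for op_result, method_result in pairs.values())
```

The method forms become useful when extra keywords such as fill_value or axis are needed, since the operators cannot accept them.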
# In[13]
A = rng.integers(10, size=(3, 4))
A
# Out[13]
array([[5, 4, 4, 2],
[0, 5, 8, 0],
[8, 8, 2, 6]])
# In[14]
A - A[0]
# Out[14]
array([[ 0, 0, 0, 0],
[-5, 1, 4, -2],
[ 3, 4, -2, 4]])
# In[15]
df = pd.DataFrame(A, columns=['Q', 'R', 'S', 'T'])
df - df.iloc[0]
# Out[15]
Q R S T
0 0 0 0 0
1 -5 1 4 -2
2 3 4 -2 4
To operate column-wise instead, use the object methods mentioned earlier and specify the axis keyword:
# In[16]
df.subtract(df['R'], axis=0)
# Out[16]
Q R S T
0 1 0 0 -2
1 -5 0 3 -5
2 0 0 -6 -2
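For comparison, the column-wise subtraction can be written in plain NumPy as well. This sketch rebuilds the array shown in Out[13] by hand (the names values, by_column, and numpy_equivalent are illustrative) and assumes that axis=0 and axis='index' are interchangeable spellings.

```python
import numpy as np
import pandas as pd

# The array from Out[13], typed in by hand so the example is self-contained.
values = np.array([[5, 4, 4, 2],
                   [0, 5, 8, 0],
                   [8, 8, 2, 6]])
df = pd.DataFrame(values, columns=['Q', 'R', 'S', 'T'])

# axis=0 matches the Series index against the DataFrame's row index,
# so column R is subtracted from every column.
by_column = df.subtract(df['R'], axis=0)
by_column_named = df.subtract(df['R'], axis='index')

# NumPy equivalent: keep column 1 two-dimensional (shape (3, 1))
# so that it broadcasts across the columns.
numpy_equivalent = values - values[:, [1]]
```

Without the axis keyword, df - df['R'] would try to match the Series labels 0, 1, 2 against the column names Q, R, S, T, which is rarely what is wanted.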
# In[17]
halfrow = df.iloc[0, ::2]
halfrow
# Out[17]
Q 5
S 4
Name: 0, dtype: int64
# In[18]
df - halfrow
# Out[18]
Q R S T
0 0.0 NaN 0.0 NaN
1 -5.0 NaN 4.0 NaN
2 3.0 NaN -2.0 NaN
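If the NaN columns are unwanted, one possible workaround is to pad the partial row before subtracting. The sketch below is an assumption about intent rather than part of the original walkthrough: it treats the columns missing from halfrow as 0, using Series.reindex with fill_value. The name padded is illustrative.

```python
import numpy as np
import pandas as pd

# The same data as in Out[13], typed in by hand.
df = pd.DataFrame(np.array([[5, 4, 4, 2],
                            [0, 5, 8, 0],
                            [8, 8, 2, 6]]),
                  columns=['Q', 'R', 'S', 'T'])
halfrow = df.iloc[0, ::2]  # only columns Q and S

# Assumption: columns absent from halfrow should count as 0.
# reindex() adds the missing labels R and T with that fill value.
padded = halfrow.reindex(df.columns, fill_value=0)

# Every column now has a matching label, so no NaN values appear.
result = df - padded
```

Columns R and T are left unchanged by the subtraction, and Q and S match the non-missing columns of the aligned result above.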