Among the rows where the Arrival Delay in Minutes column is missing, which Class has more 'satisfied' than 'neutral or dissatisfied' passengers?
# solution
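# Keep only rows with a missing arrival delay, then count rows per (Class, satisfaction)
df_isnull = df[df['Arrival Delay in Minutes'].isnull()].groupby(['Class', 'satisfaction']).size().reset_index(name='cnt')
# Reshape so that each satisfaction level becomes its own column
df_isnull = df_isnull.pivot_table(index='Class', columns='satisfaction', values='cnt')
# Keep the classes where 'satisfied' outnumbers 'neutral or dissatisfied'
df_isnull[df_isnull['satisfied'] > df_isnull['neutral or dissatisfied']]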
# result
satisfaction  neutral or dissatisfied  satisfied
Class
Business                           36         76
# alternative solution: size() with as_index=False yields a 'size' column, which pivot() spreads by satisfaction
answer = df.loc[df['Arrival Delay in Minutes'].isnull()].groupby(['Class', 'satisfaction'], as_index=False).size().pivot(index='Class', columns='satisfaction')
result = answer[answer['size']['neutral or dissatisfied'] < answer['size']['satisfied']]
result
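
The same count can also be sketched with pd.crosstab, which fills (Class, satisfaction) combinations that never occur with 0 instead of NaN. This is only a minimal sketch on made-up data: the frame `toy` and the names `missing`, `counts`, and `winners` are hypothetical, and the values are not taken from the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real df (values are invented)
toy = pd.DataFrame({
    'Class': ['Business', 'Business', 'Business', 'Eco', 'Eco', 'Eco',
              'Eco Plus', 'Business'],
    'satisfaction': ['satisfied', 'satisfied', 'neutral or dissatisfied',
                     'neutral or dissatisfied', 'neutral or dissatisfied',
                     'satisfied', 'neutral or dissatisfied', 'satisfied'],
    'Arrival Delay in Minutes': [np.nan, np.nan, np.nan, np.nan, np.nan,
                                 np.nan, np.nan, 5.0],
})

# Keep only the rows whose arrival delay is missing
missing = toy[toy['Arrival Delay in Minutes'].isnull()]

# Cross-tabulate Class against satisfaction; absent combinations become 0
counts = pd.crosstab(missing['Class'], missing['satisfaction'])

# Classes where 'satisfied' outnumbers 'neutral or dissatisfied'
winners = counts[counts['satisfied'] > counts['neutral or dissatisfied']]
print(winners)
```

Because crosstab returns integer counts with zeros for absent pairs, the comparison also behaves sensibly for a class that has no 'satisfied' rows at all, whereas a NaN from pivot would simply compare as False.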