Data Mining Errors 2

lakebear · April 5, 2023

Problem: Compute summary statistics (mean, standard deviation, minimum, maximum, median) for the 'Calories' variable.

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn import preprocessing
import matplotlib.pylab as plt
import dmba

error
no display found. Using non-interactive Agg backend

Fix: pip3 install --upgrade impedance

pd.DataFrame({'mean':cereals_df.calories.mean(),
    'sd':cereals_df.calories.std(),
    'min':cereals_df.calories.min(),
    'max':cereals_df.calories.max(),
    'median':cereals_df.calories.median(),          
})

error
ValueError: If using all scalar values, you must pass an index

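Fix:
Values passed to DataFrame are normally expected to be list-like, as in df = pd.DataFrame({'col_1':[1,2,3,4], 'col_2':[1,2,3,4]}). When every value is a scalar, pandas cannot tell how many rows to create.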
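A minimal sketch that reproduces the error and contrasts it with the list-valued form. The column names col_1 and col_2 are placeholders, not the cereals data.

```python
import pandas as pd

# All values are scalars, so pandas has no way to infer the number of rows.
try:
    pd.DataFrame({'col_1': 1, 'col_2': 2})
except ValueError as exc:
    message = str(exc)
    print(message)

# List-like values carry their own length, so no index is needed.
df = pd.DataFrame({'col_1': [1, 2, 3, 4], 'col_2': [1, 2, 3, 4]})
print(df.shape)
```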

So there are four ways to fix it.
1. Add an index
2. Wrap the values in lists
3. Use pd.DataFrame.from_records()
4. Use pd.DataFrame.from_dict([])

1. Add an index

df = pd.DataFrame({'col_1':1, 'col_2':2}, index=[0])

2. Wrap the values in lists

df = pd.DataFrame({'col_1':[1], 'col_2': [2]})

3. Use pd.DataFrame.from_records()

df = pd.DataFrame.from_records([{'col_1': 1, 'col_2': 2}])

4. Use pd.DataFrame.from_dict([])

df = pd.DataFrame.from_dict([{'col_1': 1, 'col_2': 2}])

-> Revised (attribute access replaced with bracket indexing)
pd.DataFrame({'mean':cereals_df['calories'].mean(),
'sd':cereals_df['calories'].std(),
'min':cereals_df['calories'].min(),
'max':cereals_df['calories'].max(),
'median':cereals_df['calories'].median(),
})

error
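ValueError: If using all scalar values, you must pass an index

Switching to bracket indexing does not help, because every value is still a scalar. Passing an index fixes it: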

pd.DataFrame({'mean':cereals_df['calories'].mean(),
    'sd':cereals_df['calories'].std(),
    'min':cereals_df['calories'].min(),
    'max':cereals_df['calories'].max(),
    'median':cereals_df['calories'].median(),          
}, index = ['calories'])
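As an alternative, the same one-row summary can be built with Series.agg, which avoids the scalar dictionary altogether. This is only a sketch: the small cereals_df below is a made-up stand-in, since the real data is loaded elsewhere.

```python
import pandas as pd

# Hypothetical stand-in for cereals_df; the values are invented for illustration.
cereals_df = pd.DataFrame({'calories': [70, 120, 70, 50, 110]})

summary = (
    cereals_df['calories']
    .agg(['mean', 'std', 'min', 'max', 'median'])  # Series indexed by statistic name
    .rename({'std': 'sd'})                         # match the 'sd' label used above
    .to_frame()                                    # one column named 'calories'
    .T                                             # transpose into one row labeled 'calories'
)
print(summary)
```

The row label comes from the Series name, so index=['calories'] does not have to be passed by hand.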

UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.

import matplotlib
matplotlib.use('TkAgg')
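TkAgg needs a Tk installation and a display. Where neither is available (a remote server, for example), one option is to stay on Agg and write the figure to a file instead of calling plt.show(). A minimal sketch, with a made-up file name and made-up data:

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-GUI backend: renders to files, never opens a window
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(['a', 'b'], [1, 2])  # placeholder data

# Hypothetical output location; any writable path works.
out_path = os.path.join(tempfile.gettempdir(), 'calories_plot.png')
fig.savefig(out_path)
plt.close(fig)
```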

/var/folders/vh/1j3h_hj126v1k89m5yqs5r6c0000gn/T/ipykernel_21019/301647849.py:2: UserWarning: FixedFormatter should only be used together with FixedLocator
ax.set_yticklabels(['{:,.0%}'.format(x) for x in ax.get_yticks()])
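This warning appears when set_yticklabels is called without fixing the tick positions first. Two common ways around it are sketched below, using placeholder data and the Agg backend so the snippet runs without a display: either pin the ticks with set_yticks before labeling, or let a PercentFormatter produce the labels.

```python
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.plot([0, 1, 2], [0.0, 0.5, 1.0])  # placeholder data
ax2.plot([0, 1, 2], [0.0, 0.5, 1.0])

# Option 1: fix the tick positions, then label exactly those ticks.
ticks = ax1.get_yticks()
ax1.set_yticks(ticks)
labels = ['{:,.0%}'.format(x) for x in ticks]
ax1.set_yticklabels(labels)

# Option 2: attach a formatter so labels follow the ticks automatically.
# xmax=1.0 means a data value of 1.0 is shown as 100%.
formatter = PercentFormatter(xmax=1.0, decimals=0)
ax2.yaxis.set_major_formatter(formatter)
```

Option 2 keeps working if the axis limits change later, since the labels are computed from the ticks rather than stored as fixed strings.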

Blog moved to https://lakedata.tistory.com
