Data I/O with Python

concept · June 21, 2022

Data handling

I/O functions

pd.read_csv(path)

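Use this to load tabular data.
sep : delimiter (default : ',')
header : row position of the header; with None, the columns are numbered 0, 1, 2, ... automatically (default : 'infer')
index_col : position of the column to use as the index (default : None)
usecols : names or positions of the columns to load (useful for large data)
nrows : number of rows to load (useful for large data)
Setting the working directory removes the need to include the full path every time a file is loaded.
- os.getcwd() : returns the current working directory
- os.chdir(path) : sets the current working directory to path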
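The options above can be combined in one call. The following is a minimal sketch, assuming a small semicolon-separated file; the file name scores.csv, its contents, and the temporary directory are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, written to a temporary directory for the demo.
csv_text = "id;name;score\n1;Kim;90\n2;Lee;85\n3;Park;70\n"

original_dir = os.getcwd()  # remember the current working directory
with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, "scores.csv"), "w", encoding="utf-8") as f:
        f.write(csv_text)

    os.chdir(tmp_dir)  # from here on, bare file names resolve inside tmp_dir
    try:
        df = pd.read_csv(
            "scores.csv",             # no directory prefix needed after chdir
            sep=";",                  # delimiter
            header=0,                 # the first line holds the column names
            index_col="id",           # use the "id" column as the index
            usecols=["id", "score"],  # load only these columns
            nrows=2,                  # load only the first two data rows
        )
    finally:
        os.chdir(original_dir)  # restore the working directory

print(df)
```

Restoring the original directory in a finally block keeps the rest of a script unaffected by the temporary chdir.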

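df.to_csv(path, sep, index)

Saves tabular data. to_csv is a DataFrame method, so it is called on the DataFrame itself (df.to_csv), not on pd.
sep : delimiter (default : ',')
index : whether to write the index to the file (default : True)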
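As a small illustration of sep and index, the sketch below writes a made-up DataFrame to an in-memory buffer instead of a real path, so nothing is left on disk; io.StringIO here is only a stand-in for a file.

```python
import io

import pandas as pd

# Hypothetical data for the demo.
df = pd.DataFrame({"name": ["Kim", "Lee"], "score": [90, 85]})

# Tab-separated output without the index column.
buffer = io.StringIO()
df.to_csv(buffer, sep="\t", index=False)
text = buffer.getvalue()
print(text)

# With no path, to_csv returns the CSV as a string.
# The default index=True adds the index as an unnamed first column.
with_index = df.to_csv()
print(with_index)
```

Passing index=False is usually what you want when the index is just the default 0, 1, 2, ... row numbering.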

open(path, mode)

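The Python built-in function open() creates a file object.
It is mainly used to load data that is not in a clean, tabular form.
mode
- r (read)
- w (write; overwrites the file if it already exists)
- a (append; adds new content to the end of an existing file)
A file object must be closed with close() after use. Opening the file in a with statement does this automatically.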

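with open('file.txt') as file_data:
    # Print the first line; end="" avoids doubling the newline that readline() keeps.
    print(file_data.readline(), end="")
# Leaving the with block calls file_data.close() automatically,
# so no explicit close() is needed.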
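The three modes can be compared side by side. This is a minimal sketch using a temporary file; the file name notes.txt and the written strings are made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")

    # "w" creates the file (or truncates it if it already exists).
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\n")

    # A second "w" discards the earlier content.
    with open(path, "w", encoding="utf-8") as f:
        f.write("second\n")

    # "a" keeps the existing content and adds to the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("third\n")

    # "r" reads the file back.
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()

    # The with statement has already closed the file here.
    is_closed = f.closed

print(content)
print(is_closed)
```

Only "second" and "third" survive, because the second "w" call overwrote the first write.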

read(), readline()

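f.read() : reads the entire remaining content of file f
f.readline() : reads one line from file f (up to and including the \n)
- f must be opened in a readable mode such as r or rb.
- In text mode (r), read() and readline() both return strings, so it helps to be familiar with the string methods. In binary mode (rb), they return bytes instead.

str.split(sep) : returns a list of the parts of str, split on sep
map(func, L) : applies the function func to every element of the iterable L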
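Put together, readline(), split() and map() are enough to parse a simple delimited line. The sketch below uses io.StringIO as a stand-in for a file opened in text mode; the sample content is made up.

```python
import io

# io.StringIO behaves like a file opened in text mode ("r").
f = io.StringIO("1,2,3\n4,5,6\n")

first_line = f.readline()  # one line, including the trailing "\n"
rest = f.read()            # everything that has not been read yet

# strip() removes the trailing newline, split(",") yields a list of strings,
# and map(int, ...) converts each piece to an integer.
numbers = list(map(int, first_line.strip().split(",")))

print(repr(first_line))
print(repr(rest))
print(numbers)
```

Note that read() continues from the current position, so it returns only the second line here.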

write()

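f.write(str) : writes the string str to file f
- f must be opened in a writable mode such as w or a.
- The join() method, which turns a list of strings into a single string, makes writing files more convenient.

# Join the string elements of a list with sep and return a single string
sep.join(list)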
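For example, rows of mixed values can be written as comma-separated lines. This is a minimal sketch with made-up rows; io.StringIO stands in for a file opened in w mode.

```python
import io

# Hypothetical rows; numbers must be converted to str before join().
rows = [["name", "score"], ["Kim", 90], ["Lee", 85]]

f = io.StringIO()  # stand-in for open(path, "w")
for row in rows:
    # join() only accepts strings, so map(str, row) converts each element first.
    f.write(",".join(map(str, row)) + "\n")

written = f.getvalue()
print(written)
```

write() does not add a newline on its own, which is why "\n" is appended to each line explicitly.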

pd.read_excel(path, sheet_name, header, index_col, usecols, parse_dates, nrows[, skiprows])

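Use this to load data in .xlsx format.
sheet_name : name or position of the sheet to load (default : 0, the first sheet)
header : row position of the header; with None, the columns are numbered 0, 1, 2, ... automatically (default : 0)
index_col : position of the column to use as the index (default : None)
usecols : names or positions of the columns to load (useful for large data)
nrows : number of rows to load (useful for large data)
skiprows : positions of the rows to skip (list)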
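A minimal sketch of these parameters follows. It assumes an Excel engine such as openpyxl is installed, and it first writes a small made-up workbook (scores.xlsx, sheet "scores") to a temporary directory so that there is something to read.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data used to create a workbook for the demo.
source = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "name": ["Kim", "Lee", "Park", "Choi"],
        "score": [90, 85, 70, 60],
    }
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "scores.xlsx")
    source.to_excel(path, sheet_name="scores", index=False)

    df = pd.read_excel(
        path,
        sheet_name="scores",      # sheet to load, by name
        header=0,                 # the first row holds the column names
        index_col=0,              # first of the selected columns ("id") becomes the index
        usecols=["id", "score"],  # load only these columns
        skiprows=[1],             # skip file row 1, the first data row (row 0 is the header)
        nrows=2,                  # then load two data rows
    )

print(df)
```

Because skiprows counts rows from the top of the sheet, the header row is position 0 and the first data row is position 1.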

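df.to_excel(path, index, sheet_name)

Saves tabular data. to_excel is a DataFrame method, so it is called on the DataFrame itself (df.to_excel), not on pd. It has no mode parameter; mode is an option of pd.ExcelWriter.
index : whether to write the index to the file (default : True)
sheet_name : name of the sheet to write to (default : 'Sheet1')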

pd.ExcelWriter(path)

Use this to write several sheets into a single Excel file.

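# The with statement saves and closes the workbook when the block ends;
# without it, the file is not written until writer.close() is called.
with pd.ExcelWriter(path) as writer:
    df1.to_excel(writer, sheet_name='Sheet1')
    df2.to_excel(writer, sheet_name='Sheet2')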
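A runnable round trip might look like the sketch below. It assumes an Excel engine such as openpyxl is installed; the DataFrames, the file name report.xlsx and the temporary directory are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical DataFrames, one per sheet.
df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"b": [3, 4]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "report.xlsx")

    # The workbook is saved when the with block ends.
    with pd.ExcelWriter(path) as writer:
        df1.to_excel(writer, sheet_name="Sheet1", index=False)
        df2.to_excel(writer, sheet_name="Sheet2", index=False)

    # sheet_name=None loads every sheet into a dict keyed by sheet name.
    sheets = pd.read_excel(path, sheet_name=None)

print(list(sheets))
print(sheets["Sheet2"])
```

Reading the file back with sheet_name=None is a quick way to check that both sheets were written.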

[source]
https://openclassrooms.com/en/courses/6902811-learn-python-basics/7091381-load-data-with-python
