๐Ÿ SETTING ๊ฐœ๋ฐœํ™˜๊ฒฝ ์„ค์ •โ‘ข (vscode: ๋Œ€ํ™”ํ˜•/๋กœ์ปฌ bigquery)

Lightmanยท2022๋…„ 8์›” 11์ผ
3

CS: PYTHON๐Ÿ & โ“‡

๋ชฉ๋ก ๋ณด๊ธฐ
4/14

vs code์˜ conda ๊ฐ€์ƒํ™˜๊ฒฝ์— ๋Œ€ํ™”ํ˜• ํ™˜๊ฒฝ์ธ jupyter notebook์„ ์„ค์ •ํ•˜๊ณ  bigquery๋ฅผ ์—ฐ๋™ํ•˜์—ฌ ์ด์šฉํ•ด๋ณด์ž

โ‘ข ๋กœ์ปฌ vscode์™€ GCP ์—ฐ๋™

CLIENT AUTHORIZATION

CASE1: ๋Œ€ํ™”ํ˜• ํ™˜๊ฒฝjupyter notebook์˜ ๊ฒฝ์šฐ(๋กœ์ปฌ๊ณผ colab)

  • ์ค€๋น„๋ฌผ: GCP ํ”„๋กœ์ ํŠธ, ๋น…์ฟผ๋ฆฌ์™€ ์—ฐ๋™๋œ ๊ตฌ๊ธ€ ์•„์ด๋””

  • โžผ read_gbq๋ฅผ ์ด์šฉํ•˜๊ธฐ ์œ„ํ•ด ํ•„์š”ํ•œ ๋ชจ๋“ˆpandas-gbq์„ ์„ค์น˜ํ•œ๋‹ค

    pip3 install --upgrade --force-reinstall pandas-gbq
  • โžผ python ๋Œ€ํ™”ํ˜• ํ™˜๊ฒฝ์—์„œ read_gbq ํ•จ์ˆ˜๋ฅผ ์‹คํ–‰ํ•˜๊ณ  ์ถœ๋ ฅ๋˜๋Š” ๋งํฌ์— ์ ‘์†ํ•˜์—ฌ ๊ตฌ๊ธ€ ์•„์ด๋””๋ฅผ ์ธ์ฆ ๋ฐ ์ฟผ๋ฆฌ๋ฅผ ๋ถˆ๋Ÿฌ์˜จ๋‹ค

    import pandas as pd
    df = pd.read_gbq( query = query , project_id = 'project_id', dialect = 'standard' )
  • ( ์•„๋ž˜ ๋ฐฉ๋ฒ•์œผ๋กœ๋„ ๊ฐ€๋Šฅํ•˜๋‹ค )

    from google.cloud import bigquery
    #from google.oauth2 import service_account
    
    #GCP ํ”„๋กœ์ ํŠธ ์ž…๋ ฅ
    project_id = 'PROJECT_ID'
    
    #authorization
    client = bigquery.Client(project = project_id)

CASE2: ๋กœ์ปฌ Python์˜ ๊ฒฝ์šฐ

  • ์ค€๋น„๋ฌผ

  • โžผ google-cloud-bigquery, google-cloud-bigquery-storage package ์„ค์น˜

    pip3 install --upgrade --force-reinstall google-cloud-bigquery google-cloud-bigquery-storage
    #conda install -c conda-forge google-cloud-bigquery
  • โžผ JSON์— ์ €์žฅ๋œ KEY๋ฅผ ๋ถˆ๋Ÿฌ์™€์„œ client๋กœ ์—ฐ๊ฒฐํ•œ๋‹ค

    from google.cloud import bigquery
    #from google.oauth2 import service_account
    
    #Local์— ์ €์žฅ๋œ SERVICE_ACCOUNT_KEY JSON ํŒŒ์ผ ๊ฒฝ๋กœ ์ž…๋ ฅ
    credentials = service_account.Credentials.from_service_account_file(r'SERVICE_ACCOUNT_KEY.JSON')
    
    #GCP ํ”„๋กœ์ ํŠธ ์ž…๋ ฅ
    project_id = 'PROJECT_ID'
    
    #authorization
    client = bigquery.Client(credentials = credentials, project = project_id)

CLIENT ์—ฐ๊ฒฐ ํ›„

QUERY

#QUERY
query_job = client.query("""BGQUERY""")
df = query_job.to_dataframe()

LOAD

#LOAD
PROJECT_ID = "project_id"
DATASET = "dataset"
TABLE_NAME = "table_name" 
TABLE_ID = f"{PROJECT_ID}.{DATASET}.{TABLE_NAME}"

job_config = bigquery.LoadJobConfig(
	schema=[
    	bigquery.SchemaField("f1", bigquery.enums.SqlTypeNames.STRING),
        bigquery.SchemaField("f2", bigquery.enums.SqlTypeNames.DATE),
        bigquery.SchemaField("f3", bigquery.enums.SqlTypeNames.FLOAT)],
	write_disposition="WRITE_APPEND" | "WRITE_TRUNCATE"
    )
# Make an API request
job = client.load_table_from_dataframe(
	dataframe, table_id, job_config = job_config)
# Activate Job
job.result()

SUMMARY: ๋Œ€ํ™”ํ˜• ํ™˜๊ฒฝ์˜ ๊ฒฝ์šฐ

from google.cloud import bigquery
#from google.oauth2 import service_account

#CASE1: ๋Œ€ํ™”ํ˜• ํ™˜๊ฒฝ
#GCP ํ”„๋กœ์ ํŠธ ์ž…๋ ฅ
project_id = 'PROJECT_ID'
#authorization
client = bigquery.Client(project = project_id)

##CASE2: ๋กœ์ปฌ ํ™˜๊ฒฝ
##Local์— ์ €์žฅ๋œ SERVICE_ACCOUNT_KEY JSON ํŒŒ์ผ ๊ฒฝ๋กœ ์ž…๋ ฅ
#credentials = service_account.Credentials.from_service_account_file(r'SERVICE_ACCOUNT_KEY.JSON')
##GCP ํ”„๋กœ์ ํŠธ ์ž…๋ ฅ
#project_id = 'PROJECT_ID'
##authorization
#client = bigquery.Client(credentials = credentials, project = project_id)
  
#QUERY
query_job = client.query("""BGQUERY""")
df = query_job.to_dataframe()

#LOAD
PROJECT_ID = "project_id"
DATASET = "dataset"
TABLE_NAME = "table_name" 
TABLE_ID = f"{PROJECT_ID}.{DATASET}.{TABLE_NAME}"

job_config = bigquery.LoadJobConfig(
	schema=[
    	bigquery.SchemaField("f1", bigquery.enums.SqlTypeNames.STRING),
        bigquery.SchemaField("f2", bigquery.enums.SqlTypeNames.DATE),
        bigquery.SchemaField("f3", bigquery.enums.SqlTypeNames.FLOAT)],
	write_disposition="WRITE_APPEND" | "WRITE_TRUNCATE"
    )
# Make an API request
job = client.load_table_from_dataframe(
	dataframe, table_id, job_config = job_config)
# Activate Job
job.result()

LESSON LEARNED

  • ํŒจํ‚ค์ง€ ๊ฒ€์ƒ‰์€ pip3์™€ pypi
  • ํŒจํ‚ค์ง€ ์„ค์น˜๊ฐ€ ์•ˆ๋˜๋ฉด ํŒŒ์ด์ฌ, ์•„๋‚˜์ฝ˜๋‹ค, ๋ชจ๋“ˆ์˜ ๋ฒ„์ „์„ ์‚ดํŽด๋ณด์ž
python --version
conda --version
conda update

๋.

profile
ํ˜„์ง ๋ฐ์ดํ„ฐ ๋ถ„์„๊ฐ€ / ๋ฐ์ดํ„ฐ ๊ณผํ•™์˜ ์ •๋„๋ฅผ ๋”ฐ๋ผ ๐Ÿšฒ / About DEV DA ML

0๊ฐœ์˜ ๋Œ“๊ธ€