cProfile, timeit 간략 사용법 - 실전에제와 함께

s3ich4n·2023년 9월 12일

python tips_and_tricks

tips-and-tricks

목록 보기

2/2

tips and tricks #2 ― cProfile 과 timeit를 사용해본 간단한 예시.

‼️Spolier alert ahead‼️

이 게시글은 leetcode의 1647번 문제에 대한 풀이를 일부 포함하고 있습니다!

Prerequisites

파이썬 실행환경 (>=3.8을 권장합니다. 3.7은 EOL이 지났더군요!)
푼 문제: 1647. Minimum Deletions to Make Character Frequencies Unique
main 브랜치의 ps/daily/20230912 디렉토리

개요

오늘의 leetcode 문제를 풀 때, cProfile 과 timeit 을 사용하여 수행속도를 0.2초로 줄인 과정을 작성하려 합니다.

TL;DR

호출하려는 함수에 대해 timeit.timeit(함수, **kwargs) 로 수행시간을 얻을 수 있습니다.
호출하려는 함수에 대해 cProfile.run(함수) 를 수행하면, 함수 호출 내부의 프로파일링 결과를 얻을 수 있습니다.

상황

입력값인 s 값에 대해 처음엔 이렇게 처리했습니다:

def minDeletions(self, s: str) -> int:
    answer = 0
    s_count = {i: s.count(i) for i in s}
    
    (생략)

입력값이 작은 케이스에 대해서는 문제가 없었습니다만, 입력값이 98,691 바이트일 때는 Time Limit Exceeded 에러가 떴습니다.

이를 잡기 위해, 로직 혹은 다른 곳에 문제가 있는지 찾아보려 하였습니다.

`timeit`이란?

timeit는 파이썬에서 제공하는 작은 코드에 대한 속도 측정 라이브러리 입니다. 내부적으로는 Timer 인스턴스를 생성하고, 구동 결과를 리턴합니다(내부 코드도 의외로 충분히 읽을 만 하니, 어떤 식으로 실행되는지 읽어보시는 것을 권장드립니다! 코드 링크)

파이썬 공식문서에서 제공하는 예시를 파이썬 쉘로 구동해봅시다.

timeit.timeit(lambda: "-".join(map(str, range(100))), number=10000)
0.19665591977536678

그러면, timeit.timeit 메소드가 사용할 수 있는 키워드 아규먼트들을 살펴봅시다.

🍅 tips

사실 이 값은Timer 객체에 전달하는 값이고, 이를 읽어보시면 어떤 값인지 이해하기 좋습니다. 코드 링크

키워드명	설명
`stmt`, `setup`	수행할 구문 및 프로그램 수행에 필요한 값(`Callable` 도 됩니다!)
`timer`	동일 객체의 `default_timer()` 호출 결과입니다. 기본값은 time.perf_counter()
`number`	해당 구문을 몇번 호출할 지에 대한 값
`globals`	현재 실행하는 모듈의 네임스페이스를 지정해줄 수 있는 값

`cProfile`이란?

cProfile은 파이썬에서 제공하는 메모리 프로파일러입니다. 프로그램의 어떤 부분이 얼마나 자주, 오래 실행되는지 보고해줍니다(내부적으로는 pstats의 Stats 객체를 만들고 거기에 측정결과를 추가합니다. 필요하신 경우 살펴보세요! 코드 링크).

마찬가지로, 예시를 파이썬 쉘로 구동해봅시다.

import cProfile
import re
cProfile.run('re.compile("foo|bar")', 'restats')

... 입력해보면 아래와 같은 값이 나옵니다.

      214 function calls (207 primitive calls) in 0.002 seconds

Ordered by: cumulative time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
     1    0.000    0.000    0.001    0.001 <string>:1(<module>)
     1    0.000    0.000    0.001    0.001 __init__.py:250(compile)
     1    0.000    0.000    0.001    0.001 __init__.py:289(_compile)
     1    0.000    0.000    0.000    0.000 _compiler.py:759(compile)
     1    0.000    0.000    0.000    0.000 _parser.py:937(parse)
     1    0.000    0.000    0.000    0.000 _compiler.py:598(_code)
     1    0.000    0.000    0.000    0.000 _parser.py:435(_parse_sub)
     (생략)

첫 줄부터 볼까요.

function calls 는 결과를 리턴하기 위해 해당 함수 아래의 함수들을 214 번 호출하였음을 의미합니다. 이 중 207건은 primitive calls 로, 재귀를 통해 함수가 호출되지 않았음을 의미합니다.

각 필드의 의미는 아래와 같습니다:

컬럼명	설명
`ncalls`	함수를 몇 번 호출했는지
`tottime`	주어진 함수 수행의 전체 시간(하위 함수 호출에 소요된 시간 제외)
`percall`	`tottime`을 `ncalls`로 나눈 몫
`cumtime`	이 함수와 모든 하위 함수에 소요된 누적 시간(호출부터 종료까지). 재귀함수에도 됩니다!
`percall`	`cumtime`을 primitive calls 로 나눈 몫
`filename:lineno(function)`	어느 파일의 몇번 라인에 있는 함수를 수행했음을 의미

해결방법

정확히 얼마나 느린지 파악하기 위해 cProfile, timeit를 사용해보기로 했습니다.
속도가 현저히 느린 것을 파악했고, 왜 느린지 혹시 몰라 확인해보기로 했습니다.
문자열의 count() 함수가 원인임을 파악하였습니다.
이를 수정하여 속도 단축을 확인하였습니다.

사용방안 - `timeit`

이런 식으로 내용을 감싸서,

with open(pathlib.Path().cwd() / "testcase60", "r") as f:
    # 1. timeit으로 수행시간 테스트. 아래와 같이 실행함:
    execution_time = timeit.timeit(
        "s.minDeletions(f.readline())",
        globals=globals(),
        number=1,
    )
    print(f"quicksort() execution time: {execution_time}")

...구동을 수행하면 아래와 같이 결과가 나옵니다. 수행시간이 무려 8초 군요!

quicksort() execution time: 8.41342270001769

사용방안 - `cProfile`

이런 식으로 내용을 감싸서,

with open(pathlib.Path().cwd() / "testcase60", "r") as f:
    # 2. cProfile로 어떤 구문이 얼마만큼 실행되었는지 프로파일링
    #    stdout으로 실행내용이 나옴. 아래와 같이 실행함:
    cProfile.run("s.minDeletions(f.readline())")

...구동을 수행하면 아래와 같이 결과가 나옵니다. 수행시간 뿐 아니라, str 객체의 count 메소드가 95942번 호출되었고, 그 때문에 cumtime (cumulative time) 이 8.6초 가량 잡혔군요!

PS C:\udemy_study_compilation\ps\daily\20230912> python .\01-alt01.py
         95950 function calls in 8.662 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    8.661    8.661 01-alt01.py:11(minDeletions)
        1    0.020    0.020    8.661    8.661 01-alt01.py:13(<dictcomp>)
        1    0.000    0.000    8.662    8.662 <string>:1(<module>)
        1    0.000    0.000    8.662    8.662 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
    95942    8.641    0.000    8.641    0.000 {method 'count' of 'str' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'readline' of '_io.TextIOWrapper' objects}
        1    0.000    0.000    0.000    0.000 {method 'values' of 'dict' objects}

이 로직을 고쳐야겠습니다!

로직 고치기

매번 str.count() 를 수행하느니, 차라리 문자열을 읽어서 collections.defaultdict 에 하나씩 넣고 문제에서 요하는 검증로직을 태워봅시다. 코드를 고치고 다시 구동해봅시다.

import collections


class Solution:
    def minDeletions(self, s: str) -> int:
        answer = 0
        # 이렇게 하면 일일이 count를 세니까 느려진다! cProfile로 이걸 잡을 수 있었다.
        # s_count = {i: s.count(i) for i in s}

        s_count = collections.defaultdict(int)
        
        for i in s:
            s_count[i] += 1

...Voilà! 확 줄었네요!

PS C:udemy_study_compilation\ps\daily\20230912> python .\01-alt01.py
         7 function calls in 0.006 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.006    0.006    0.006    0.006 01-alt01.py:13(minDeletions)
        1    0.000    0.000    0.006    0.006 <string>:1(<module>)
        1    0.000    0.000    0.006    0.006 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'readline' of '_io.TextIOWrapper' objects}
        1    0.000    0.000    0.000    0.000 {method 'values' of 'dict' objects}

결과

배운 점은 아래와 같습니다.

str.count()를 통해 딕셔너리 컴프리헨션을 괜히 썼습니다. GPT에게 딕셔너리 컴프리헨션으로 딱 떨어지는 코드를 요구했더니, 매번 count() 를 호출하는 코드를 주더군요.
차라리 매 문자열을 읽고 collections.defaultdict 에 넣는 편이 나았습니다. $O(n)$ 만에 수행하니까요.
그래도 이런 케이스를 통해 cProfile, timeit 를 사용해봤습니다. 이 라이브러리들은 좀더 복잡하고 세밀한 컨트롤도 되니, 필요하면 그 때 다시 알아보도록 합시다.

s3ich4n

백엔드 프로그래머

이전 포스트

cProfile, timeit 간략 사용법 - 실전에제와 함께

tips-and-tricks

Prerequisites

개요

TL;DR

상황

`timeit`이란?

`cProfile`이란?

해결방법

사용방안 - `timeit`

사용방안 - `cProfile`

로직 고치기

결과

How to install poetry, pyenv on Windows 11?

0개의 댓글

cProfile, timeit 간략 사용법 - 실전에제와 함께

tips-and-tricks

Prerequisites

개요

TL;DR

상황

timeit이란?

cProfile이란?

해결방법

사용방안 - timeit

사용방안 - cProfile

로직 고치기

결과

How to install poetry, pyenv on Windows 11?

0개의 댓글

`timeit`이란?

`cProfile`이란?

사용방안 - `timeit`

사용방안 - `cProfile`