constants

yeoni · June 27, 2023

Tensorflow


1. Tensor

A deep learning framework is, at its core, a tool for manipulating tensors.
- Tensor
The most important thing to keep track of when working with a tensor:

$\rightarrow$ SHAPE!

2. Creating Tensors

Always check

shape
dtype (operands must have the same data type before they can be combined in an operation)

Constant

tf.constant()
- list -> Tensor
- tuple -> Tensor
- Array -> Tensor

import tensorflow as tf
import numpy as np

li_ten = tf.constant([1, 2, 3])
# <tf.Tensor: shape=(3,), dtype=int32, numpy=array([1, 2, 3], dtype=int32)>

li_ten_f = tf.constant([1., 2., 3.])
# <tf.Tensor: shape=(3,), dtype=float32, numpy=array([1., 2., 3.], dtype=float32)>

tu_ten = tf.constant(((1, 2, 3), (1, 2, 3)), name="sample")
'''
<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [1, 2, 3]], dtype=int32)>
'''

arr = np.array([1., 2., 3.])
arr_ten = tf.constant(arr)
# <tf.Tensor: shape=(3,), dtype=float64, numpy=array([1., 2., 3.])>

Extracting a NumPy array

arr_ten.numpy(), type(arr_ten.numpy())
# (array([1., 2., 3.]), numpy.ndarray)

li_ten.numpy(), type(li_ten.numpy())
# (array([1, 2, 3], dtype=int32), numpy.ndarray)

not_a_matrix = [[1,2,3], [4, 5], [6, 7, 8]]
# A ragged (non-rectangular) list: tf.constant(not_a_matrix) raises a ValueError
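
As a minimal sketch of what that failure looks like in practice, the snippet below wraps the call in try/except. It assumes eager TensorFlow 2.x, where a non-rectangular nested list is rejected with a ValueError. `tf.ragged.constant` is shown only as one possible way to hold such data; it is not part of the original notes.

```python
import tensorflow as tf

not_a_matrix = [[1, 2, 3], [4, 5], [6, 7, 8]]

# tf.constant needs a rectangular input, so the ragged list is rejected.
try:
    tf.constant(not_a_matrix)
    conversion_failed = False
except ValueError as err:
    conversion_failed = True
    print("tf.constant failed:", err)

# Assumption for illustration: a RaggedTensor can hold rows of different lengths.
ragged = tf.ragged.constant(not_a_matrix)
print(ragged.shape)
```

If the rows are supposed to be the same length, padding or fixing the data before calling tf.constant is usually the simpler route.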

shape, dtype

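Operations that fail
- Case 1: tf.matmul(li_ten, tu_ten) → a shape (3,) operand times a shape (2, 3) operand; the dimensions do not line up for a matrix product.
- Case 2: tf.matmul(tu_ten, li_ten) → the ranks do not match (rank 2 vs. rank 1).
- Case 3: arr_ten * li_ten → the dtypes differ (float64 vs. int32), so the multiplication is rejected.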

li_ten.shape, tu_ten.shape
# (TensorShape([3]), TensorShape([2, 3]))

arr_ten.dtype, li_ten.dtype
# (tf.float64, tf.int32)

print(li_ten.ndim) # rank (number of dimensions): 1
print(tu_ten.ndim) # 2
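
The three failing cases listed above can be reproduced and then repaired as in the sketch below. The tensors are rebuilt so the snippet runs on its own. The exception is caught broadly because the exact exception class may differ between TensorFlow versions, and the reshape and cast shown are just one possible fix each.

```python
import numpy as np
import tensorflow as tf

li_ten = tf.constant([1, 2, 3])                  # shape (3,),   int32
tu_ten = tf.constant(((1, 2, 3), (1, 2, 3)))     # shape (2, 3), int32
arr_ten = tf.constant(np.array([1., 2., 3.]))    # shape (3,),   float64

failing_cases = {
    "case 1": lambda: tf.matmul(li_ten, tu_ten),
    "case 2": lambda: tf.matmul(tu_ten, li_ten),
    "case 3": lambda: arr_ten * li_ten,
}

failed = []
for name, op in failing_cases.items():
    try:
        op()
    except Exception as err:  # exact exception class may vary by TF version
        failed.append(name)
        print(f"{name} failed with {type(err).__name__}")

# One possible fix for the matmul cases: give li_ten an explicit column shape.
column = tf.reshape(li_ten, (3, 1))
product = tf.matmul(tu_ten, column)  # (2, 3) x (3, 1) -> (2, 1)

# One possible fix for case 3: cast so both operands share a dtype.
scaled = arr_ten * tf.cast(li_ten, tf.float64)
print(product.shape, scaled.dtype)
```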

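Controlling the data type

Specify it up front
tf.cast (in many cases, though, the dtype can be settled in advance)

# Specify the dtype up front
tensor = tf.constant([1, 2, 3], dtype=tf.float32)

# Convert with tf.cast
tf.cast(tensor, dtype=tf.int16)

# Multiplies without error once the dtypes match
arr_ten * tf.cast(li_ten, tf.float64)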
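
Because tf.cast converts the values rather than just relabeling them, it can silently lose information. The small sketch below assumes the default float32 dtype for Python floats and shows that casting to an integer type drops the fractional part.

```python
import tensorflow as tf

floats = tf.constant([1.7, -1.7, 2.0])        # float32 by default
as_int = tf.cast(floats, tf.int32)            # fractional part is dropped
back_to_float = tf.cast(as_int, tf.float32)   # the lost fraction does not come back

print(as_int.numpy(), back_to_float.numpy())
```

This is one reason to settle the dtype when the tensor is created instead of casting back and forth later.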

Creating tensors filled with specific values

tf.ones
tf.zeros
tf.range

tf.ones(1)
# <tf.Tensor: shape=(1,), dtype=float32, numpy=array([1.], dtype=float32)>

tf.zeros((2, 5), dtype="int32")
'''
<tf.Tensor: shape=(2, 5), dtype=int32, numpy=
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]], dtype=int32)>
'''

tf.zeros((2, 5, 3), dtype="int32")
'''
<tf.Tensor: shape=(2, 5, 3), dtype=int32, numpy=
array([[[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]], dtype=int32)>
'''

tf.range(1, 11)
# <tf.Tensor: shape=(10,), dtype=int32, numpy=array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], dtype=int32)>
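
As a small supplementary sketch, tf.range also accepts a step (`delta`) and float arguments, in the same spirit as Python's range. The variable names here are illustrative only.

```python
import tensorflow as tf

evens = tf.range(0, 10, 2)         # start, limit (exclusive), delta
halves = tf.range(0.0, 2.0, 0.5)   # float arguments give a float tensor
ones_matrix = tf.ones((2, 3), dtype=tf.int32)

print(evens.numpy(), halves.numpy(), ones_matrix.shape)
```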

Exercise: write a function that takes n and returns the geometric sequence with first term 1 and common ratio 2.
(The result must be a tf.Tensor with dtype tf.int32.)

For $n = 10$: $(1, 2, 4, 8, 16, 32, 64, 128, 256, 512)$

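def geometric_sequence(n):
    # Exponents 0, 1, ..., n-1
    r = tf.range(n, dtype=tf.int32)
    # dtype must be passed by keyword: the second positional argument of tf.range is `limit`
    # A length-n tensor filled with the common ratio 2
    s = tf.ones(n, dtype=tf.int32) * 2
    # Element-wise power: 2**0, 2**1, ..., 2**(n-1)
    return s**r

print(geometric_sequence(10))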
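
One possible variation on the exercise, sketched under the assumption that a general first term and ratio might be wanted: the function name `geometric_sequence_pow` and its `first` and `ratio` parameters are hypothetical and not from the lecture material. It uses tf.pow with a scalar base instead of building a tensor of twos.

```python
import tensorflow as tf

def geometric_sequence_pow(n, first=1, ratio=2):
    """Hypothetical variant: first * ratio**k for k = 0..n-1, as an int32 tensor."""
    exponents = tf.range(n, dtype=tf.int32)
    base = tf.constant(ratio, dtype=tf.int32)
    # The scalar base broadcasts against the vector of exponents.
    return tf.constant(first, dtype=tf.int32) * tf.pow(base, exponents)

seq = geometric_sequence_pow(10)
print(seq)
```

With int32, 2**31 no longer fits, so for ratio 2 this sketch is only meaningful up to n = 31.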

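3. Random values

Needed whenever random values have to be generated.
Commonly used to simulate noise or to run tests.
The values are returned as a constant tensor (tf.Tensor).
The generators are implemented in the tf.random module, which provides several distributions:
- tf.random.normal
  - Gaussian (normal) distribution
- tf.random.uniform
  - Uniform distribution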
shape = (3, 3)

tf.random.normal(shape)
# tf.random.normal(shape, mean=100, stddev=10)

tf.random.uniform(shape)
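
To make the parameters of the two generators concrete, here is a minimal sketch that draws a larger sample and inspects it. The sample size and parameter values are arbitrary choices for illustration.

```python
import tensorflow as tf

shape = (1000, 3)

# Normal distribution centered at 100 with standard deviation 10.
normal = tf.random.normal(shape, mean=100.0, stddev=10.0)

# Uniform distribution on the half-open interval [-1, 1).
uniform = tf.random.uniform(shape, minval=-1.0, maxval=1.0)

print(normal.shape, normal.dtype)
print(float(tf.reduce_mean(normal)), float(tf.math.reduce_std(normal)))
print(float(tf.reduce_min(uniform)), float(tf.reduce_max(uniform)))
```

The printed mean and standard deviation should land near the requested values, though the exact numbers change from run to run unless a seed is fixed.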

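Managing the random seed

Random values are typically used to initialize weights.
They also appear in many other places during training.
If the seed is not managed, earlier work cannot be restored or reproduced exactly.
tf.random.set_seed({seed_number})
Always fix the random seed during development. (Note, however, that not every random number used in a project is necessarily generated by TensorFlow.)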

seed = 7777
tf.random.set_seed(seed)
a = tf.random.uniform([1])
b = tf.random.uniform([1])
print(a, b, sep="\n")
# tf.Tensor([0.959749], shape=(1,), dtype=float32)
# tf.Tensor([0.8677443], shape=(1,), dtype=float32)

a = tf.random.uniform([1])
b = tf.random.uniform([1])
print(a, b, sep="\n")
# tf.Tensor([0.22878075], shape=(1,), dtype=float32)
# tf.Tensor([0.87772965], shape=(1,), dtype=float32)

# Reset the seed to replay the same sequence
tf.random.set_seed(seed)
a = tf.random.uniform([1])
b = tf.random.uniform([1])
print(a, b, sep="\n")
# tf.Tensor([0.959749], shape=(1,), dtype=float32)
# tf.Tensor([0.8677443], shape=(1,), dtype=float32)
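
Since some random numbers in a project may come from Python or NumPy rather than TensorFlow, one common approach is to seed all three together. The helper names `set_all_seeds` and `draw` below are hypothetical, and the sketch assumes eager execution.

```python
import random

import numpy as np
import tensorflow as tf

def set_all_seeds(seed):
    """Hypothetical helper: seed Python, NumPy, and TensorFlow together."""
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

def draw():
    """Draw one value from each of the three random sources."""
    return (
        random.random(),
        float(np.random.rand()),
        float(tf.random.uniform([])),
    )

set_all_seeds(7777)
first = draw()
set_all_seeds(7777)
second = draw()
print(first == second)
```

Recent TensorFlow releases also provide tf.keras.utils.set_random_seed, which is meant to cover the same three sources in one call; whether it is available depends on the installed version.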

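Precision (how many bits a number is cut to) → lighter, but rounding error appears

This matters in deep learning. In production, 16-bit values are often used to gain speed.
double precision: 64 bits
single precision: 32 bits
half precision: 16 bits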
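
The trade-off can be seen with a minimal sketch that stores the same value at each precision and measures how far the stored number drifts from the original. The value 0.1 is an arbitrary example chosen because it has no exact binary representation.

```python
import tensorflow as tf

value = 0.1  # not exactly representable in binary floating point

errors = {}
for dtype in (tf.float64, tf.float32, tf.float16):
    stored = tf.constant(value, dtype=dtype)
    # Compare what was actually stored against the Python (64-bit) float.
    errors[dtype.name] = abs(float(stored.numpy()) - value)
    print(f"{dtype.name}: {dtype.size * 8} bits, stored error = {errors[dtype.name]:.3e}")
```

The error grows as the bit width shrinks, which is the cost paid for the smaller memory footprint and faster arithmetic.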

Reference
1) Zerobase Data School lecture materials
