When to use .to(device) in PyTorch

개발하는 G0 · July 27, 2023

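When moving the model (neural network) to the GPU:
- The model's parameters and buffers, e.g. the weights of nn.Linear(). Calling .to(device) on the nn.Module moves all of them.
- The loss function, e.g. nn.CrossEntropyLoss(). This only matters when the loss holds tensors of its own, such as class weights; otherwise the call is harmless but has no effect.
When moving data to the GPU:
- Before passing input data to the model, the input data (and the targets used by the loss) must also be moved to the GPU.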
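The list above can be sketched as a single forward pass and loss computation. This is only a minimal illustration: the toy nn.Linear model, the random batch, and the class weights are made-up values, and the device line assumes falling back to the CPU when no GPU is present.

```python
import torch
import torch.nn as nn

# Assumption: use the GPU if available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy model, for illustration only.
model = nn.Linear(4, 3).to(device)  # moves the weight and bias

# A loss built with a class-weight tensor carries state, so it is moved too.
class_weights = torch.tensor([1.0, 2.0, 0.5])
criterion = nn.CrossEntropyLoss(weight=class_weights).to(device)

# Inputs and targets must live on the same device as the model.
inputs = torch.randn(8, 4).to(device)
targets = torch.randint(0, 3, (8,)).to(device)

logits = model(inputs)
loss = criterion(logits, targets)
print(logits.device, loss.device)
```

Without the weight argument the loss has no tensors to move, so calling .to(device) on it would be optional.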

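import torch
import torch.nn as nn
import torch.nn.functional as F

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class Teacher(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 1200)
        self.bn1 = nn.BatchNorm1d(1200)
        self.fc2 = nn.Linear(1200, 1200)
        self.bn2 = nn.BatchNorm1d(1200)
        self.fc3 = nn.Linear(1200, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # flatten each image to a 784-dim vector
        x = F.relu(self.bn1(self.fc1(x)))
        # F.dropout defaults to training=True, so pass the module's mode explicitly.
        x = F.dropout(x, p=0.8, training=self.training)
        x = F.relu(self.bn2(self.fc2(x)))
        x = F.dropout(x, p=0.8, training=self.training)
        x = self.fc3(x)
        return x

# check
x = torch.randn(16, 1, 28, 28).to(device)
teacher = Teacher().to(device)
output = teacher(x)
print(output.shape)  # torch.Size([16, 10])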

Here, does calling .to(device) on Teacher() mean the model itself is moved to the device?

# check
x = torch.randn(16,1,28,28).to(device)
teacher = Teacher()
output = teacher(x).to(device)
print(output.shape)

Can't I do it this way instead?

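-> Yes, that's right. In the code above, calling .to(device) on the Teacher instance is the correct way to move the whole model to the specified device. It moves every parameter and buffer of the model, so the model's computation runs on that device as long as the input is there too.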

So the correct usage is:

x = torch.randn(16, 1, 28, 28).to(device)
teacher = Teacher().to(device)
output = teacher(x)
print(output.shape)

This code moves both the input tensor x and the teacher model to the specified device, then runs the model's forward pass on the input.
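One detail worth checking here is that .to(device) behaves differently on modules and on tensors: nn.Module.to moves the module in place and returns the module itself, while Tensor.to does not modify the tensor it is called on, so its result has to be assigned. A small sketch, using a made-up two-layer model rather than Teacher:

```python
import torch
import torch.nn as nn

# Assumption: use the GPU if available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical small model with both parameters and buffers.
model = nn.Sequential(nn.Linear(4, 2), nn.BatchNorm1d(2))

# Module.to moves parameters and buffers in place and returns the module itself.
returned = model.to(device)
print(returned is model)

# Every parameter and buffer (e.g. BatchNorm running statistics) is now on device.
params_ok = all(p.device.type == device.type for p in model.parameters())
buffers_ok = all(b.device.type == device.type for b in model.buffers())
print(params_ok, buffers_ok)

# Tensor.to does not change x in place, so keep the returned tensor.
x = torch.randn(3, 4)
x = x.to(device)
print(x.device)
```

This is why `teacher = Teacher().to(device)` and calling `teacher.to(device)` on a separate line both work, whereas a bare `x.to(device)` without assignment leaves x where it was.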

The second check, on the other hand, does this:

x = torch.randn(16, 1, 28, 28).to(device)
teacher = Teacher()
output = teacher(x).to(device)
print(output.shape)

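Here only the output tensor is moved to the device, after the forward pass has already run. The model itself is never moved, so its parameters stay on the CPU. If device is a GPU, x is on the GPU while the parameters are on the CPU, and the forward pass teacher(x) raises a RuntimeError before the trailing .to(device) is ever reached. The code only runs when device is the CPU, in which case the .to(device) calls change nothing. To keep all computation on one device, move the model to that device before running the forward pass.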
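The failure described above can be reproduced with a smaller model. The sketch below is an illustration only: same_device is a hypothetical helper, not a PyTorch function, and the mismatch can only occur on a machine where CUDA is available. On a CPU-only machine both tensors are on the CPU and the first forward pass simply succeeds.

```python
import torch
import torch.nn as nn

# Assumption: use the GPU if available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)           # left on the CPU on purpose
x = torch.randn(3, 4).to(device)

def same_device(module, tensor):
    """Hypothetical helper: compare the first parameter's device with the tensor's."""
    return next(module.parameters()).device == tensor.device

print("devices match before moving the model:", same_device(model, x))

try:
    # Same pattern as the second check: only the output is moved.
    output = model(x).to(device)
    print("forward pass succeeded on", output.device)
except RuntimeError as err:
    # Raised when x is on the GPU and the weights are on the CPU.
    print("forward pass failed:", err)

# Fix: move the model first, then run the forward pass.
model.to(device)
output = model(x)
print(output.shape)
```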
