Analyzing export.py in YOLOv7

게으른 개미개발자 · September 27, 2022

Onnx TensorRT inference torch.jit.script torch.jit.trace torch2onnx torch2trt yolov7

Yesterday I tried to generate a .pt file with torch.jit.script, but it did not work properly.

So I will analyze export.py in the YOLOv7 source code to see how a PyTorch model is turned into a .pt file that contains both the architecture and the parameters.

python export.py --weights yolov7-tiny.pt --grid --end2end --simplify \
        --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --max-wh 640

The export is run with the command above.

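weights : the weights file
- Downloaded from the repository's releases and used as-is.
- It is a .pt file, the same kind of file you get from a plain torch.save, so the repository's source code is needed to load it.
- It does contain node (layer) information, but the nodes are represented as the classes defined in models/common.
grid : exports the grid of the Detect() layer, so the grid decoding is part of the exported model
end2end : exports an end-to-end ONNX model that includes NMS
simplify : simplifies the exported ONNX file
topk-all : maximum number of detections (top-k) kept for every image
iou-thres : IoU threshold used by NMS
conf-thres : confidence threshold used by NMS
img-size : input image size (640, 640)
max-wh : an int value selects the onnxruntime NMS path; leaving it unset (None) selects the TensorRT NMS path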
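To make the option list concrete, here is a minimal sketch of how such flags could be parsed with argparse. It is a simplified, hypothetical stand-in rather than the parser in export.py: the function name `build_parser` and the default values are assumptions, and only the flags used in the command above are included.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical, simplified stand-in for the argument parser in export.py."""
    p = argparse.ArgumentParser()
    p.add_argument("--weights", type=str, default="yolov7-tiny.pt")
    p.add_argument("--img-size", nargs="+", type=int, default=[640, 640])
    p.add_argument("--grid", action="store_true")
    p.add_argument("--end2end", action="store_true")
    p.add_argument("--simplify", action="store_true")
    p.add_argument("--topk-all", type=int, default=100)       # assumed default
    p.add_argument("--iou-thres", type=float, default=0.45)   # assumed default
    p.add_argument("--conf-thres", type=float, default=0.25)  # assumed default
    p.add_argument("--max-wh", type=int, default=None)        # None -> TensorRT-style NMS
    return p


# The same flags as the command above, parsed from a string instead of sys.argv.
cmd = ("--weights yolov7-tiny.pt --grid --end2end --simplify "
       "--topk-all 100 --iou-thres 0.65 --conf-thres 0.35 "
       "--img-size 640 640 --max-wh 640")
opt = build_parser().parse_args(cmd.split())
print(opt.img_size, opt.max_wh, opt.end2end)
```

Note that argparse turns `--img-size` into `opt.img_size` and collects the two values into a list, which is why the script can later unpack it with `*opt.img_size`.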

1. Loading the .pt file

model = attempt_load(opt.weights, map_location=device)  # load FP32 model

Inside attempt_load()

model = Ensemble()

https://pytorch.org/docs/stable/generated/torch.nn.ModuleList.html

Ensemble is built on nn.ModuleList; this line initializes an empty container that will hold the loaded model(s).

ckpt = torch.load(w, map_location=map_location)

Contents of ckpt:
- model : <class 'models.yolo.Model'>
- optimizer : None
- training_results : None
- epoch : -1

for w in weights if isinstance(weights, list) else [weights]:
        #attempt_download(w)
        ckpt = torch.load(w, map_location=map_location)  # load
        model.append(ckpt['ema' if ckpt.get('ema') else 'model'].float().fuse().eval())  # FP32

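When the checkpoint has no usable 'ema' entry, ckpt['model'] is used. It is cast to FP32, its layers are fused, it is switched to eval mode, and it is appended to the model (Ensemble) instance.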
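The key-selection logic in that loop is easy to misread, so here is a minimal sketch of it using plain dictionaries. The helper names `as_list` and `pick_weights_key` are hypothetical, and the strings stand in for real model objects.

```python
def as_list(weights):
    """Accept a single path or a list of paths, as the loop in attempt_load does."""
    return weights if isinstance(weights, list) else [weights]


def pick_weights_key(ckpt: dict) -> str:
    """Prefer the EMA model when present and not None, else fall back to 'model'."""
    return "ema" if ckpt.get("ema") else "model"


# Placeholder checkpoints; the strings stand in for nn.Module objects.
stripped = {"model": "<Model>", "optimizer": None, "ema": None}
with_ema = {"model": "<Model>", "ema": "<EMA Model>"}
no_ema_key = {"model": "<Model>"}

for ckpt in (stripped, with_ema, no_ema_key):
    print(pick_weights_key(ckpt))
```

Using `ckpt.get('ema')` rather than `ckpt['ema']` covers both a missing key and a key whose value is None.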

for m in model.modules():
        if type(m) in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU]:
            m.inplace = True  # pytorch 1.7.0 compatibility
        elif type(m) is nn.Upsample:
            m.recompute_scale_factor = None  # torch 1.11.0 compatibility
        elif type(m) is Conv:
            m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility

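After the model (a torch.nn module) is loaded, every submodule is patched according to its type. Judging by the comments, these are compatibility fixes for different PyTorch versions.

activation func : inplace is set to True
Upsample : recompute_scale_factor is reset to None (torch 1.11.0 compatibility) https://pytorch.org/docs/stable/generated/torch.nn.Upsample.html
Conv : _non_persistent_buffers_set is reset to an empty set (PyTorch 1.6.0 compatibility)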

labels = model.names

labels holds the names of the classes the detector predicts.

gs = int(max(model.stride))  # grid size (max stride)

When I tried to save the model by scripting, computing the stride inside the Model class in models/yolo.py ran into various problems with data types, sizes and so on. When the instance is loaded this way, the stride values are populated correctly.
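As a small illustration of why the maximum stride matters, the sketch below rounds an image size up to a multiple of gs. The stride values and the helper `make_divisible` are assumptions for illustration; the snippet above only shows that gs is the largest stride.

```python
import math


def make_divisible(x: int, divisor: int) -> int:
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


# Assumed strides for a three-scale detection head (not read from a real model).
strides = [8.0, 16.0, 32.0]
gs = int(max(strides))  # grid size (max stride)

for size in (640, 650, 192):
    print(size, "->", make_divisible(size, gs))
```

An input whose height and width are multiples of gs divides evenly at every detection scale, which is why 640 is a convenient size here.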

# Input
    img = torch.zeros(opt.batch_size, 3, *opt.img_size).to(device)  # image size(1,3,320,192) iDetection

This sets the input size.

I am not sure yet, but I suspect this is where my problem was…

I had been using a dummy input tensor of shape (32,3,256,256), whereas here it is set to (1,3,640,640).

After that, the model is updated once more.

# Update model
    for k, m in model.named_modules():
        m._non_persistent_buffers_set = set()  # pytorch 1.6.0 compatibility
        if isinstance(m, models.common.Conv):  # assign export-friendly activations
            if isinstance(m.act, nn.Hardswish):
                m.act = Hardswish() 
            elif isinstance(m.act, nn.SiLU):
                m.act = SiLU()
        # elif isinstance(m, models.yolo.Detect):
        #     m.forward = m.forward_export  # assign forward (optional)
    model.model[-1].export = not opt.grid  # set Detect() layer grid export

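This step appears to replace the built-in nn.Hardswish and nn.SiLU activations inside each Conv block with the repository's own export-friendly Hardswish and SiLU implementations.

Then, for the last layer, Detect, the export flag is set to not opt.grid, which decides whether the Detect layer's grid is exported. With --grid, the flag is False.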
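The activation swap can be sketched as follows. This is a minimal sketch, not the repository's code: `ExportHardswish`, `ExportSiLU`, `TinyConv` and `swap_activations` are hypothetical names, and the formulas are the standard definitions written with basic ops that exporters generally handle well.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExportHardswish(nn.Module):
    """Hardswish written with hardtanh: x * clamp(x + 3, 0, 6) / 6."""
    def forward(self, x):
        return x * F.hardtanh(x + 3.0, 0.0, 6.0) / 6.0


class ExportSiLU(nn.Module):
    """SiLU written explicitly as x * sigmoid(x)."""
    def forward(self, x):
        return x * torch.sigmoid(x)


class TinyConv(nn.Module):
    """Hypothetical stand-in for models.common.Conv (convolution + activation)."""
    def __init__(self, c1, c2, act):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, 3, padding=1)
        self.act = act

    def forward(self, x):
        return self.act(self.conv(x))


def swap_activations(model: nn.Module) -> nn.Module:
    """Replace built-in activations with the explicit versions, in place."""
    for m in list(model.modules()):
        if isinstance(m, TinyConv):
            if isinstance(m.act, nn.Hardswish):
                m.act = ExportHardswish()
            elif isinstance(m.act, nn.SiLU):
                m.act = ExportSiLU()
    return model


torch.manual_seed(0)
net = nn.Sequential(TinyConv(3, 4, nn.Hardswish()), TinyConv(4, 4, nn.SiLU())).eval()
x = torch.randn(1, 3, 8, 8)
with torch.no_grad():
    before = net(x)
    after = swap_activations(net)(x)
print(torch.allclose(before, after, atol=1e-5))
```

The point of the comparison at the end is that the swap should change how the activation is expressed, not what it computes.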

2. torch.jit.trace()

f = opt.weights.replace('.pt', '.torchscript.pt')  # filename
ts = torch.jit.trace(model, img, strict=False)
ts.save(f)

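Here f is just the output filename: the weights path with .pt replaced by .torchscript.pt. The model prepared above is traced with torch.jit.trace using the dummy input img, the traced module is stored in ts, and ts.save(f) writes it to disk.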
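The same three steps can be tried on a toy model. The sketch below assumes a hypothetical `TinyNet` in place of the loaded YOLOv7 model and writes to a temporary directory; only `torch.jit.trace`, `save` and `torch.jit.load` are taken from the flow described above.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the loaded YOLOv7 model."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.conv(x))


torch.manual_seed(0)
model = TinyNet().eval()
img = torch.rand(1, 3, 64, 64)  # dummy input; smaller than 640x640 to keep it quick

weights = os.path.join(tempfile.mkdtemp(), "tiny.pt")  # pretend weights path
f = weights.replace(".pt", ".torchscript.pt")          # output filename

ts = torch.jit.trace(model, img, strict=False)  # record the ops run on img
ts.save(f)

# The saved file can be loaded without the TinyNet class definition.
loaded = torch.jit.load(f)
with torch.no_grad():
    same = torch.allclose(loaded(img), model(img))
print(f, same)
```

Because tracing records only the operations executed for this one input, data-dependent control flow in forward would not be captured, which is one reason the dummy input shape matters.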

# ONNX export
    try:
        import onnx

        print('\nStarting ONNX export with onnx %s...' % onnx.__version__)
        f = opt.weights.replace('.pt', '.onnx')  # filename
        model.eval()

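f is changed to the .onnx filename derived from the original weights path, and model.eval() switches the model to evaluation mode, so that layers such as Dropout and BatchNormalization use their inference behavior and their training-only behavior does not end up in the ONNX file.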
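What model.eval() changes can be checked on a toy network. This is a minimal sketch with an arbitrary small model, not the YOLOv7 model; it only shows that eval() flips the training flag on every submodule and makes the forward pass deterministic.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8), nn.Dropout(p=0.5))
x = torch.randn(4, 8)

model.train()
train_a, train_b = model(x), model(x)  # Dropout draws a new random mask per call

model.eval()  # recursively sets training=False on every submodule
with torch.no_grad():
    eval_a, eval_b = model(x), model(x)

all_eval = all(not m.training for m in model.modules())
deterministic = torch.equal(eval_a, eval_b)
print(all_eval, deterministic)
```

In eval mode Dropout passes its input through unchanged and BatchNorm normalizes with its stored running statistics instead of the current batch.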

NonMaxSuppression(NMS)

class End2End(nn.Module):
    '''export onnx or tensorrt model with NMS operation.'''
    def __init__(self, model, max_obj=100, iou_thres=0.45, score_thres=0.25, max_wh=None, device=None, n_classes=80):
        super().__init__()
        device = device if device else torch.device('cpu')
        assert isinstance(max_wh,(int)) or max_wh is None
        self.model = model.to(device)
        self.model.model[-1].end2end = True
        self.patch_model = ONNX_TRT if max_wh is None else ONNX_ORT
        self.end2end = self.patch_model(max_obj, iou_thres, score_thres, max_wh, device, n_classes)
        self.end2end.eval()

    def forward(self, x):
        x = self.model(x)
        x = self.end2end(x)
        return x

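NMS does not convert as-is, so this wrapper seems to be the option for handling it with a custom implementation: ONNX_TRT is used when max_wh is None, and ONNX_ORT otherwise.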
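For readers unfamiliar with what the NMS stage does with these thresholds, here is a plain-Python sketch of greedy NMS. It is not the ONNX_TRT or ONNX_ORT implementation; the function names and sample boxes are made up, and only the parameter names mirror the End2End constructor.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_thres: float = 0.45,
        score_thres: float = 0.25, max_obj: int = 100) -> List[int]:
    """Greedy NMS: returns indices of kept boxes, highest score first."""
    order = sorted((i for i, s in enumerate(scores) if s >= score_thres),
                   key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
            if len(keep) == max_obj:
                break
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = nms(boxes, scores, iou_thres=0.65, score_thres=0.35, max_obj=100)
print(kept)
```

With the thresholds from the export command, the second box is suppressed by the first (their IoU is about 0.68) and the last box is dropped for its low score.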

torch.onnx.export(model, img, f, verbose=False, opset_version=12, input_names=['images'],
                          output_names=output_names,
                          dynamic_axes=dynamic_axes)

In the end, somewhat anticlimactically, the ONNX export itself is done the same way as usual.

So it all comes down to tracing after all…
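The keyword arguments of that call can be assembled as in the sketch below. The helper `onnx_export_kwargs` and its `dynamic_batch` flag are hypothetical, and the non-end2end output name is an assumption; the end-to-end output names and the remaining arguments follow the snippets in this post.

```python
from typing import Dict, Optional


def onnx_export_kwargs(end2end: bool, dynamic_batch: bool) -> dict:
    """Build keyword arguments for torch.onnx.export (hypothetical helper)."""
    if end2end:
        output_names = ["num_dets", "det_boxes", "det_scores", "det_classes"]
    else:
        output_names = ["output"]  # assumed name for the plain export

    # dynamic_axes maps tensor name -> {axis index: symbolic axis name}.
    dynamic_axes: Optional[Dict[str, Dict[int, str]]] = None
    if dynamic_batch:
        dynamic_axes = {"images": {0: "batch"}}
        for name in output_names:
            dynamic_axes[name] = {0: "batch"}

    return dict(verbose=False, opset_version=12, input_names=["images"],
                output_names=output_names, dynamic_axes=dynamic_axes)


kwargs = onnx_export_kwargs(end2end=True, dynamic_batch=True)
print(kwargs["output_names"])
print(kwargs["dynamic_axes"])
```

These would then be passed as `torch.onnx.export(model, img, f, **kwargs)`; leaving dynamic_axes as None fixes every dimension to the shape of the dummy input.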

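To summarize once more:

Assuming the PyTorch code exists, the model, including its forward function, is saved with torch.save.
If a weights (parameter) file exists, the model is updated with those weights.
The model is then patched.
1. Convolution : reset _non_persistent_buffers_set
2. Activation func : set inplace=True and swap in export-friendly versions
3. Image-related func : reset recompute_scale_factor on torch.nn.Upsample
The model is exported by tracing with torch.jit.trace.
For YOLO, the bounding-box decoding and post-processing are included in the ONNX model (unlike an ordinary classification model).
1. The last layer, the Detect() class
2. The output names have to be changed accordingly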
```
output_names = ['num_dets', 'det_boxes', 'det_scores', 'det_classes']
```
The model is exported to ONNX with torch.onnx.export.
1. In addition, YOLOv7 implements and registers its own NMS function, since NMS is not supported in TRT as-is.
