[Image Processing] Cropping 1024x1024 images into multiple 256x256 patches with OpenCV and discarding the unusable parts

Daehyeon Choi · July 25, 2023

Intro

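I have been working with a lot of satellite imagery at work lately. Compared with ordinary photos, satellite images tend to have lower resolution while their dimensions in pixels are larger.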

For example, images in 'LoveDA', a remote sensing semantic segmentation dataset, are 1024x1024. Models built for general tasks, however, mostly take 256x256 images as input. Some of them let you change the input size through an argument, but many still do not support images as large as 1024x1024.

LoveDA github

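The same applies to me: the Semantic Change Detection model I am building targets 256x256 inputs for now, so I cannot use LoveDA as it is. To practice using the opencv library and to get a dataset I can use right away, I wrote and ran code that crops 1024x1024 images into 256x256 patches.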

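This is not a matter of cutting a single 256x256 region out of an image; it splits each large image into 16 subpatches. Also, because LoveDA consists of satellite photos, some images contain regions that are completely black, i.e. regions that were simply 'not captured'.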

The image above is 1024x1024 like the others, but you can see that nearly the top half of it is unusable.

Do regions like this actually hurt model performance? I do not know enough to answer that, but I think it is fair to say they do not help training.
So I also wrote code that discards these unusable parts during cropping. The code is as follows.

Code & Algorithm

import os
import cv2

PATCH_SIZE = 256  # side length of each cropped patch
GRID = 4          # 4 x 4 patches cover a 1024x1024 image
CHECK_SIZE = 64   # side length of the corner region tested for black pixels

def crop_images(img_folder, mask_folder, output_img_folder, output_mask_folder):
    # Create the output folders if needed; cv2.imwrite does not create them.
    os.makedirs(output_img_folder, exist_ok=True)
    os.makedirs(output_mask_folder, exist_ok=True)

    # Iterate over the files in the input folder and crop each image.
    for file_name in sorted(os.listdir(img_folder)):
        # The mask is expected to have the same file name as the image.
        img = cv2.imread(os.path.join(img_folder, file_name))
        # IMREAD_UNCHANGED keeps the mask's original channels and label values.
        mask = cv2.imread(os.path.join(mask_folder, file_name), cv2.IMREAD_UNCHANGED)

        # Skip the file if the image or its mask could not be read.
        if img is None or mask is None:
            print(f"Skipping {file_name} as the image or mask could not be read.")
            continue

        # Skip images that are too small to yield a single 256x256 patch.
        height, width = img.shape[:2]
        if height < PATCH_SIZE or width < PATCH_SIZE:
            print(f"Skipping {file_name} as it is too small to crop (size: {width}x{height}).")
            continue

        # File name without its extension, used to build the output names.
        stem = os.path.splitext(file_name)[0]

        # Patches are numbered 0..15 row by row, starting at the top-left.
        for index in range(GRID * GRID):
            row, col = divmod(index, GRID)
            crop_y, crop_x = row * PATCH_SIZE, col * PATCH_SIZE
            cropped_img = img[crop_y:crop_y + PATCH_SIZE, crop_x:crop_x + PATCH_SIZE]
            cropped_mask = mask[crop_y:crop_y + PATCH_SIZE, crop_x:crop_x + PATCH_SIZE]

            # Discard the patch (and its mask) if the top-left 64x64 corner is entirely black.
            check_invalid = cropped_img[:CHECK_SIZE, :CHECK_SIZE]
            if check_invalid.sum() == 0:
                continue

            # Save the cropped image and the matching mask under the same name.
            cv2.imwrite(os.path.join(output_img_folder, f"cropped_{stem}_{index}.png"), cropped_img)
            cv2.imwrite(os.path.join(output_mask_folder, f"cropped_{stem}_{index}.png"), cropped_mask)

# Set the input and output folders.
img_folder = "/home/dh.choi/ddpmcd/guided-diffusion/datasets/Train/images_png"  # path to the input image folder
mask_folder = "/home/dh.choi/ddpmcd/guided-diffusion/datasets/Train/masks_png"  # path to the input mask folder
output_img_folder = "/home/dh.choi/ddpmcd/guided-diffusion/datasets/Train_crop/images_png"  # path to the output image folder
output_mask_folder = "/home/dh.choi/ddpmcd/guided-diffusion/datasets/Train_crop/masks_png"  # path to the output mask folder

crop_images(img_folder, mask_folder, output_img_folder, output_mask_folder)

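The algorithm for discarding patches is simple: for each patch, if the sum of the pixel values in its top-left 64x64 region is 0, the patch is not saved to the output path. Raising the threshold on that sum above 0 would keep only the more useful patches. Note that the current check looks only at that 64x64 corner, so a patch is discarded exactly when its corner is completely black, whatever the rest of the patch contains.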
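The threshold idea can be sketched roughly as follows. This is not the code used in this post: instead of summing a 64x64 corner, it measures the fraction of pure black pixels over the whole patch and flags the patch when that fraction exceeds a limit. The function name `is_mostly_black` and the default `max_black_ratio=0.5` are assumptions chosen for illustration.

```python
import numpy as np

def is_mostly_black(patch, max_black_ratio=0.5):
    """Return True when more than max_black_ratio of the pixels are pure black.

    patch: H x W (grayscale) or H x W x C (color) uint8 array,
    such as a slice of an image loaded with cv2.imread.
    """
    if patch.ndim == 3:
        # A pixel counts as black only if every channel is 0.
        black = np.all(patch == 0, axis=2)
    else:
        black = patch == 0
    return bool(black.mean() > max_black_ratio)

# Synthetic 256x256 3-channel patch whose top 192 rows are black (75% black).
patch = np.full((256, 256, 3), 120, dtype=np.uint8)
patch[:192] = 0
print(is_mostly_black(patch))                       # True
print(is_mostly_black(patch, max_black_ratio=0.8))  # False
```

A check like this could stand in for the `check_invalid.sum()` test, with the ratio tuned to how much missing area is acceptable for the task.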

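(Note that LoveDA, the dataset cropped in this example, is a segmentation dataset, so the segmentation map (mask) is cropped at the same time, and the mask at the same position as a discarded patch is discarded as well. If you only need to crop ordinary images, you can remove the mask handling and adapt the rest.)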
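For the mask-free case, a minimal sketch might look like the following. It works on any NumPy image array (which is what `cv2.imread` returns) and yields only full-size tiles, so it is not limited to 1024x1024 inputs. `iter_tiles` is a hypothetical helper name, not an OpenCV function.

```python
import numpy as np

def iter_tiles(img, patch=256):
    """Yield (index, tile) for each full patch x patch tile, row by row."""
    height, width = img.shape[:2]
    index = 0
    for y in range(0, height - patch + 1, patch):
        for x in range(0, width - patch + 1, patch):
            yield index, img[y:y + patch, x:x + patch]
            index += 1

# Stand-in for an image loaded with cv2.imread.
img = np.zeros((1024, 1024, 3), dtype=np.uint8)
tiles = list(iter_tiles(img))
print(len(tiles))          # 16
print(tiles[0][1].shape)   # (256, 256, 3)
```

Each yielded tile could then be filtered and written with `cv2.imwrite`, using the index to build the output file name.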

Results

For reference, the example image shown above is No. 4191.

After cropping, you can see that the patches from the upper part of 4191 have all been discarded.
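One way to check this without opening every file is to group the output file names by source image and list which patch indices were saved. The sketch below assumes the `cropped_{name}_{index}.png` naming used by the code above; the helper name `kept_indices` and the example file names are made up for illustration and are not actual output.

```python
import re
from collections import defaultdict

# Matches names such as "cropped_demo_12.png".
PATTERN = re.compile(r"^cropped_(?P<name>.+)_(?P<index>\d+)\.png$")

def kept_indices(file_names):
    """Map each source image name to the sorted patch indices that were saved."""
    kept = defaultdict(list)
    for file_name in file_names:
        match = PATTERN.match(file_name)
        if match:
            kept[match.group("name")].append(int(match.group("index")))
    return {name: sorted(indices) for name, indices in kept.items()}

# Hypothetical listing; in practice pass os.listdir(output_img_folder).
example = ["cropped_demo_8.png", "cropped_demo_12.png", "cropped_demo_9.png", "notes.txt"]
print(kept_indices(example))   # {'demo': [8, 9, 12]}
```

Indices missing from a list correspond to discarded patches; with row-by-row numbering, indices 0 to 3 are the top row of the original image.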

Daehyeon Choi

Master Student @ KAIST CS / Generative Modeling
