[C] Indexed Priority Queue (Indexed Heap)

숲사람·2022년 4월 19일

heap indexed priority queue

Data Structure & Algorithm of C

목록 보기

4/12

Indexed Priority Queue (Indexed Heap) 이란

heap의 특정값을 수정할 수 있는 heap자료구조.
a[] 라는 배열이 주어지고, heap[]으로 변환. heap[]상에 있는 a[i]에 해당하는 값을 바로 수정하고, heap 을 유지할 수 있는 자료구조.

시간복잡도

가령 배열 a[0], a[1] ... a[n-1]가 있을 때 아래의 query들을 수행하는 코드를 구현한다고 해보자.

update(i, v) a[i] = v로 바꾼다.
get_min() a[0]~a[n-1] 중 가장 작은 값을 출력

만약 단순 heap으로 이를 구현한다면, update가 일어날때마다 n만큼 배열을 탐색해 a[i]를 수정한 뒤, 배열을 통째로 heapify를 하면 O(N) + O(N) 이 된다.

하지만 indexed heap을 통해 특정 배열값만 수정 & heap구조 유지하면 O(1) + O(logN)이 된다.

동작 방식

1. a[i]에 해당하는 값이 Heap상에서 어디에 위치하는지를 O(1)만에 알아내고
2. 그 노드의 값을 v로 바꿔버리고
3. Heap 자료구조를 유지시키도록 노드를 O(log n)만에  옮긴다.

heap_index[] 배열을 계산하고 swap때마다 업데이트하는것이 핵심.
i 가 주어지면 heap[]에서 a[i]가 저장되어있는 index를 얻을 수 있다-> 바로heap_index[i]를 통해서.
heap_index[i] 에 a[i]의 heap내에서의 index를 계속 업데이트 해야함.
따라서 heap[heap_index[i]] 가 현재 heap에서 a[i] 노드가 위치한 곳.

void heap_update_direct(int i, int v)
{
  heap[heap_index[i]].val = v;
 ... 중략
}

heap_index[i] 를 통해 a[i]가 현재 heap어디에 위치하는지 파악가능. 이 것을 push/pop할 때 저장하고 유지해야한다.

우선 heap 자료구조가 아래와 같다고 가정.

struct elem {
    int idx;
    int val;
};
struct pq {
    int heap_size;
    struct elem *heap;
};
struct pq *q;
heap_index[];       // <-

heap_push

a[i] 를 push하면 해당 노드는 heap의 가장 마지막에 저장된다. 따라서 heap에 저장된 a[i]의 index 는 현재 heap 사이즈가 됨.

heap_index["push하는 노드의 idx"] = "현재 heap에서의 idx";

void heap_push(struct elem heap[], int val, int idx)
{
    heap[q->heap_size].val = val;
    heap[q->heap_size].idx = idx;
    heap_index[idx] = q->heap_size;      // <-
    q->heap_size++;
    heapify_up(heap, q->heap_size - 1);
}

그 뒤 heapify_up() 을 수행하면 해당노드가 부모드와 swap이 일어날텐데 그때마다 heap_index[]도 같이 업데이트 해주어야한다.

void heapify_up(struct elem heap[], int curr_idx)
{
    int p_idx = (curr_idx - 1) / 2;
    
    if (curr_idx < 1)
        return;
    if (heap[curr_idx].val < heap[p_idx].val) {
        swap(heap, curr_idx, p_idx);
        heap_index[heap[curr_idx].idx] = curr_idx;   // <-
        heap_index[heap[p_idx].idx] = p_idx;         // <-
        heapify_up(heap, p_idx);
    }
}

이 부분

        swap(heap, curr_idx, p_idx);
        heap_index[heap[curr_idx].idx] = curr_idx;   // <-
        heap_index[heap[p_idx].idx] = p_idx;         // <-

swap이후 heap[curr_idx].idx 는 p_idx의 것으로 바뀌어있으므로 기존 a[p_idx] 가 heap의 어떤 idx에 존재하는지는 즉, heap_index[p_idx]는 curr_idx로 swap해야함.

heap[curr_idx] 와 heap[p_idx]를 swap하면 heap[curr_idx]는 p_idx에 해당하는 struct elem요소가 저장되어있다. 따라서 기존 p_idx 에 해당하는 heap_index[]를 curr_idx로 업데이트 해주어야함. 결과적으로 heap_index[]배열도 swap이 됨.

heap_pop

pop을할땐 heap의 마지막 요소가 [0]으로 올라오므로, heap_index[]를 0으로 변경.
heapify_down할때도 동일하게 heap_index도 swap하여 업데이트 하면됨.

struct elem heap_pop(struct elem arr[], bool (*cmp)(struct elem, struct elem))
{
    struct elem ret = arr[0];
    swap(arr, 0, q->heap_size -1);
    q->heap_size--;
    heap_index[arr[0].idx] = 0;    //  <-
    heapify_down(arr, 0, q->heap_size, cmp);
    return ret;
}

void heapify_down(struct elem arr[], int p_idx, int size, bool (*cmp)(struct elem, struct elem))
{
    int min_idx = p_idx;
    int l_idx = p_idx * 2 + 1;
    int r_idx = p_idx * 2 + 2;
    
    if (l_idx < size && cmp(arr[l_idx], arr[min_idx]))
        min_idx = l_idx;
    if (r_idx < size && cmp(arr[r_idx], arr[min_idx]))
        min_idx = r_idx;
    if (min_idx != p_idx) {
        swap(arr, min_idx, p_idx);
        heap_index[heap[min_idx].idx] = min_idx;   // <-
        heap_index[heap[p_idx].idx] = p_idx;         // <-
        heapify_down(arr, min_idx, size, cmp);
    }
}

heap_update_direct

마지막으로 처음 구현하려고 했던 update_direct 를 완성하면 아래와 같다. 위의 동작방식에서 3번에 해당하는 내용은 아래와 같이 구현이 가능하다. heapify_up() 과 heapify_down()이 동시에 나열되어있지만 두 함수가 모두 수행되지는 않는다. compare조건때문에 둘중에 하나만 수행하게 된다. 따라서 O(logN)에 heap구조로 수정이 된다.

void heap_update_direct(int i, int v)
{
  heap[heap_index[i]].val = v;
  hepify_up(heap_index[i]);
  hepify_down(heap_index[i]);
}

references

http://www.secmem.org/blog/2020/08/16/heap/
https://blog.naver.com/sagik921123/221690373557

숲사람

기록 & 정리 아카이브용

이전 포스트

[C] 우선순위 큐(Heap) 및 Heap Sort 구현

다음 포스트