[CMPT 454] Week 5_1

June·2021년 2월 16일
0

CMPT 454

목록 보기
13/33

Pass 1, 2

2-Way External Merge Sort

General External Merge Sort

Pass 0

Pass 1, 2, ....

General External Merge Sort

I/O Pages of External Merge Sort

Number of Passes of External Sort

Blocked buffer: I/O pages vs I/O time

  • If pages are sequentially stored on disk, I/O time per page can be reduced by having larger (b>1) input buffer.
    • read b pages with 1 seek time and 1 rotational delay.
  • (B-1)/b runs will be merged at a time, thus, more passes and more I/O pages.

2 Choices for 5-page Buffer (B=5)

I/O cost in terms of number of page 는 오른쪽이 더 크다. more pass to merge because because each time can merge two. 하지만 오른쪽은 smaller I/O time per page이다. disk에 sequentially하게 저장되어있으면 1 seek time만 필요하기 때문이다.

Sorting cost for blocked buffers

Using B+ Trees for Sorting

  • If the file to be sorted has B+ tree index on the sorting column(s), can retrieve records in order by traversing leaf pages.
  • Is this a good idea?
    • If B+ tree is clustered, very good idea!
    • If B+ tree is not clustered, could be a very bad idea

B+ tree with searh key <A,B,C>. Can it be used to sort on A, on (A,B), on B, on (A,C)?

처음 두 개의 경우에는 B+트리 사용하기 좋고, 나머지는 좋은 경우가 아니다.

Clustered B+ Tree Used for Sorting

Unclustered B+ Tree Used for Sorting

1*에서 p1 읽고, 2*에서 p2 읽고 3*에서 p3 읽고 4*에서 p4 읽고 5*에서 다시 p1 읽고 ...
-> 총 8 page를 읽는다.

External Sorting vs. Unclustered Index

Summary

Replacement Sort (pp429)

0개의 댓글