Hashing

๊ฐฑ๋‘ยท2021๋…„ 12์›” 28์ผ
0

๐Ÿ“š ์ž๋ฃŒ๊ตฌ์กฐ

๋ชฉ๋ก ๋ณด๊ธฐ
5/7
post-thumbnail

๐Ÿ“Œ Hashing

๋ฐ์ดํ„ฐ๋ฅผ ํšจ์œจ์ ์œผ๋กœ ๊ด€๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด, ์ž„์˜์˜ ๊ธธ์ด ๋ฐ์ดํ„ฐ๋ฅผ ๊ณ ์ •๋œ ๊ธธ์ด์˜ ๋ฐ์ดํ„ฐ๋กœ ๋งคํ•‘ํ•˜๋Š” ๊ฒƒ
ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ๊ตฌํ˜„ํ•˜์—ฌ ๋ฐ์ดํ„ฐ ๊ฐ’์„ ํ•ด์‹œ ๊ฐ’์œผ๋กœ ๋งคํ•‘ํ•œ๋‹ค.

โœ”๏ธ Static Hashing

ํ‚ค๋ฅผ ์ธ๋ฑ์Šค์— ๋งตํ•‘ํ•˜์—ฌ ๋ฐฐ์—ด(ํ•ด์‹œ ํ…Œ์ด๋ธ”)์—์„œ ํ‚ค๋ฅผ ์ €์žฅํ•˜๊ณ  ๊ฒ€์ƒ‰ํ•˜๋Š” ๊ธฐ์ˆ 

  • ํ…Œ์ด๋ธ”์˜ entry๋ฅผ ๋ฒ„ํ‚ท์ด๋ผ๊ณ  ํ•จ !

=> ๊ฒ€์ƒ‰, ์‚ฝ์ž…, ์‚ญ์ œ Performance : O(1)

โœ… ์˜ˆ์‹œ : ์ฒซ๋ฒˆ์งธ ๊ธ€์ž๋ฅผ ํ‚ค๋ฅผ ์‚ฌ์šฉํ•ด์„œ ํ•ด์‹ฑํ•จ

--------------------
| index |  bucket |
------------------
|   0   |  Korea  |
|   1   |  Japan  |
|   2   |  China  |
|   3   |  Thiwan |
|   4   |  Spain  |
--------------------

ํ•˜์ง€๋งŒ ๋ฐ์ดํ„ฐ๊ฐ€ ๋งŽ์•„์ง€๋ฉด ๋‹ค๋ฅธ ๋ฐ์ดํ„ฐ๊ฐ€ ๊ฐ™์€ ํ•ด์‹œ ๊ฐ’์œผ๋กœ ์ถฉ๋Œ๋‚˜๋Š” ํ˜„์ƒ์ด ๋ฐœ์ƒํ•จ

-------------------------------------
| index |  bucket |                  |
--------------------------------------
|   0   |  Korea  |                  |
|   1   |  Japan  |                  |
|   2   |  China  |                  |
|   3   |  Thiwan |     Thailand     |
|   4   |  Spain  |     Singapore    | -> Sri Lanka (overflow)
-------------------------------------

โœ… ํ•ด์‹œํ…Œ์ด๋ธ”์˜ ์žฅ์ 

๊ทธ๋Ÿผ์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ  ํ•ด์‹œํ…Œ์ด๋ธ”์„ ์‚ฌ์šฉํ•˜๋Š” ์ด์œ ?

์ ์€ ์ž์›์œผ๋กœ ๋งŽ์€ ๋ฐ์ดํ„ฐ๋ฅผ ํšจ์œจ์ ์œผ๋กœ ๊ด€๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด
ํ•˜๋“œ๋””์Šคํฌ๋‚˜, ํด๋ผ์šฐ๋“œ์— ์กด์žฌํ•˜๋Š” ๋ฌดํ•œํ•œ ๋ฐ์ดํ„ฐ๋“ค์„ ์œ ํ•œํ•œ ๊ฐœ์ˆ˜์˜ ํ•ด์‹œ๊ฐ’์œผ๋กœ ๋งคํ•‘ํ•˜๋ฉด ์ž‘์€ ๋ฉ”๋ชจ๋ฆฌ๋กœ๋„ ํ”„๋กœ์„ธ์Šค ๊ด€๋ฆฌ๊ฐ€ ๊ฐ€๋Šฅํ•ด์ง!

  1. ์ธ๋ฑ์Šค์— ํ•ด์‹œ๊ฐ’์„ ์‚ฌ์šฉํ•จ์œผ๋กœ์จ ๋ชจ๋“  ๋ฐ์ดํ„ฐ๋ฅผ ์‚ดํ”ผ์ง€ ์•Š์•„๋„ ๊ฒ€์ƒ‰๊ณผ ์‚ฝ์ž…/์‚ญ์ œ๋ฅผ ๋น ๋ฅด๊ฒŒ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ์Œ.
    ํ•ด์‹œํ•จ์ˆ˜๋Š” ์–ธ์ œ๋‚˜ ๋™์ผํ•œ ํ•ด์‹œ๊ฐ’์„ ๋ฆฌํ„ด ํ•˜๊ณ , ํ•ด๋‹น ์ธ๋ฑ์Šค๋งŒ ์•Œ๋ฉด ํ•ด์‹œํ…Œ์ด๋ธ”์˜ ํฌ๊ธฐ์— ์ƒ๊ด€์—†์ด ๋ฐ์ดํ„ฐ์— ๋Œ€๋‹จํžˆ ๋น ๋ฅด๊ฒŒ ์ ‘๊ทผํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ธ๋ฑ์Šค๋Š” ๊ณ„์‚ฐ์ด ๊ฐ„๋‹จํ•œ ํ•จ์ˆ˜(์ƒ์ˆ˜์‹œ๊ฐ„)๋กœ ์ž‘๋™ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๋งค์šฐ ํšจ์œจ์ ์ž…๋‹ˆ๋‹ค.
    ์ฆ‰, ํผํฌ๋จผ์Šค๊ฐ€ O(1)์ด๊ธฐ ๋•Œ๋ฌธ

  2. ํ•ด์‹œํ…Œ์ด๋ธ”์„ ๋‹ค๋ฅธ ์ž๋ฃŒ๊ตฌ์กฐ์™€ ๋น„๊ตํ•ด๋ดค์„ ๋•Œ,

  • ์ด์ง„ํƒ์ƒ‰ํŠธ๋ฆฌ๋Š” ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ํšจ์œจ์ ์œผ๋กœ ์‚ฌ์šฉํ•˜๋Š” ํŽธ์ด์ง€๋งŒ ๋ฐ์ดํ„ฐ ํƒ์ƒ‰์— ์†Œ์š”๋˜๋Š” ๊ณ„์‚ฐ๋ณต์žก์„ฑ์€ ๐‘‚(log n)
  • ๋ฐฐ์—ด(array)์˜ ๊ฒฝ์šฐ ๋ฐ์ดํ„ฐ ํƒ์ƒ‰์— ๋”ฐ๋ฅธ ๊ณ„์‚ฐ๋ณต์žก์„ฑ์€ ๐‘‚(1)์ด์ง€๋งŒ, ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ๋ฏธ๋ฆฌ ํ• ๋‹นํ•ด ๋‘ฌ์•ผ ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ๊ณต๊ฐ„ ํšจ์œจ์  โŒ
    ์ฆ‰, ํ•ด์‹œ๊ฐ€ ํ›จ์”ฌ ํšจ์œจ์ ์ž„
  1. ํ•ด์‹œ๋Š” ๋ณด์•ˆ ๋ถ„์•ผ์—์„œ๋„ ๋„๋ฆฌ ์‚ฌ์šฉ๋œ๋‹ค๊ณ  ํ•จ. ํ‚ค์™€ ํ•ด์‹œ๊ฐ’ ์‚ฌ์ด์— ์ง์ ‘์ ์ธ ์—ฐ๊ด€์ด ์—†๊ธฐ ๋•Œ๋ฌธ์— ํ•ด์‹œ๊ฐ’๋งŒ ๊ฐ€์ง€๊ณ ๋Š” ํ‚ค๋ฅผ ์˜จ์ „ํžˆ ๋ณต์›ํ•˜๊ธฐ ์–ด๋ ต๊ธฐ ๋•Œ๋ฌธ.
    ๋˜ํ•œ ํ•ด์‹œํ•จ์ˆ˜๋Š” ๊ธธ์ด๊ฐ€ ์„œ๋กœ ๋‹ค๋ฅธ ์ž…๋ ฅ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด ์ผ์ •ํ•œ ๊ธธ์ด์˜ ์ถœ๋ ฅ์„ ๋งŒ๋“ค ์ˆ˜ ์žˆ์–ด์„œ ๋ฐ์ดํ„ฐ ์ถ•์•ฝ ๊ธฐ๋Šฅ๋„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

๐Ÿš€ Hashing์˜ ๊ถ๊ทน์ ์ธ ๋ชฉ์ 

  1. Collision ์ตœ์†Œํ™”
  2. ์˜ค๋ฒ„ํ”Œ๋กœ์šฐ ์ตœ์†Œํ™”
  3. ํ•ด์‹œ ํ…Œ์ด๋ธ” ์‚ฌ์ด์ฆˆ ์ตœ์†Œํ™”
    => ํ•ด์‹œ ํ…Œ์ด๋ธ” ?
    n๊ฐœ์˜ ์ธ๋ฑ์Šค & ์ธ๋ฑ์Šค ๋ณ„๋กœ k๊ฐœ์˜ ๋ฒ„ํ‚ท => n*k ๊ฐœ์˜ ํ‚ค๊ฐ’ ์ €์žฅ ๊ฐ€๋Šฅ

๐Ÿš€ Hashing์˜ ๋‘๊ฐ€์ง€ ์š”์†Œ

  1. Hash function ๊ณ ๋ฅด๊ธฐ
  • modulo function : ๊ฐ€์žฅ ๋งŽ์ด ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•
    • H(key) = key mod hash_table_size
  • digital selection : key์—์„œ ๋ช‡๊ฐœ๋ฅผ ๊ณจ๋ผ์„œ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•
    • key ๊ฐ’์ด 9010302051218๋ฉด ๋ช‡๊ฐœ๋งŒ ๊ณจ๋ผ์„œ 000011๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•
      ์—„์ฒญ๋‚œ collision์„ ์กฐ์‹ฌํ•ด์•ผ ํ•จ..
  1. Collision ํ•ด๊ฒฐํ•˜๊ธฐ
    ๐Ÿ‘‡๐Ÿป

โœ… Collision ํ•ด๊ฒฐ

โœ”๏ธ Collision ์ž์ฒด๋ฅผ ์ค„์ด๋Š” ๊ฒƒ๋„ ์˜๋ฏธ๊ฐ€ ์žˆ์ง€๋งŒ, ์ค‘์š”ํ•œ ๊ฒƒ์€ collision์ด ํ•ด์‹œ๊ฐ’ ์ „์ฒด์— ๊ฑธ์ณ ๊ท ๋“ฑํ•˜๊ฒŒ ๋ฐœ์ƒํ•˜๊ฒŒ ํ•˜๋Š” ๊ฒƒ์ด ํฌ์ธํŠธ!

ํ˜„์žฌ๊นŒ์ง€ ๊ฐœ๋ฐœ๋œ ๊ฑฐ์˜ ๋ชจ๋“  ํ•ด์‹œํ•จ์ˆ˜๋Š” ํ•ด์‹œ์ถฉ๋Œ์„ ์ผ์œผํ‚ค๋Š” ๊ฒƒ์œผ๋กœ ํ™•์ธ๋๋‹ค๊ณ  ํ•จ
๊ทธ๋งŒํผ collision์ด ์•„์˜ˆ ์•ˆ ์ผ์–ด๋‚˜๊ฒŒ๋Š” โŒ, ๊ทธ๋ฆฌ๊ณ  ์–ด์จŒ๋“  collision์ด ์ผ์–ด๋‚œ ํ‚ค๋ฅผ ์ €์žฅํ•  ๊ณณ์ด ํ•„์š”ํ•จ.

1. Closed addressing

๊ณ„์‚ฐ๋œ ์ธ๋ฑ์Šค์— ๊ทธ๋Œ€๋กœ !

โœ”๏ธ ๊ฐ๊ฐ์˜ ์ธ๋ฑ์Šค์— collision์ด ๋‚œ ํ‚ค๋“ค์„ ์ €์žฅํ•  ์ˆ˜ ์žˆ๋Š” ๋„‰๋„‰~ํ•œ ๋ฒ„ํ‚ท์„ ๋‘ 
โœ”๏ธ ๋น ๋ฅธ access, overflow โŒ
โœ”๏ธ ๊ณต๊ฐ„๋‚ญ๋น„ ์ตœ์•… => ์‹ค์งˆ์ ์œผ๋กœ ์‚ฌ์šฉํ•˜์ง€ ๋ชปํ•จ

๐Ÿ“Œ 2D - array ์‚ฌ์šฉ

-------------------------------------------------------
| index |  bucket |                  |                |
-------------------------------------------------------
|   0   |  Korea  |                  |                |
|   1   |  Japan  |                  |                |
|   2   |  China  |                  |                |
|   3   |  Thiwan |     Thailand     |                |
|   4   |  Spain  |     Singapore    |    Sri Lanka   |
-------------------------------------------------------

๐Ÿ“Œ Overflow chaining (no collision bucket)

-------------------
| index |  bucket |
-------------------
|   0   |  Korea  |
|   1   |  Japan  |
|   2   |  China  |
|   3   |  Thiwan | -> Thailand
|   4   |  Spain  | -> Singapore -> Sri Lanka
-------------------

๐Ÿ“Ž ํƒ์ƒ‰ํ•˜๋Š” ๋ฐฉ๋ฒ•

  • insertํ•  ๋•Œ๋ž‘ ๊ฐ™์€ ํ•จ์ˆ˜ ์‚ฌ์šฉ
  • ์ธ๋ฑ์Šค์— ๋‚˜์˜จ ๋ฒ„ํ‚ท์—์„œ ํ‚ค๋ฅผ ์ฐพ์Œ
  • ๋ชป ์ฐพ์œผ๋ฉด chain์„ ํ™•์ธํ•˜๋ฉด์„œ ์ฐพ์Œ

2. Open addressing

๋‹ค๋ฅธ ์ธ๋ฑ์Šค์˜ ๊ฐ€๋Šฅํ•œ ๋ฒ„ํ‚ท์„ ์ฐพ์•„ ๋– ๋‚จ !

๐Ÿ“Œ 1) Linear Probing
collision์ด ๋ฐœ์ƒํ•˜๋ฉด ์—ฐ์†์ ์œผ๋กœ ์ญ‰ ๊ฒ€์ƒ‰ํ•˜๋ฉด์„œ ๋น„์–ด์žˆ๋Š” ๋ฒ„ํ‚ท์„ ์ฐพ์Œ

-------------------
| index |  bucket |
-------------------
|   0   |  Korea  |
|   1   |         |
|   2   |         |
|   3   |  Thiwan | 
|   4   |  Spain  | -> Singapore (Collision !!) ๐Ÿ‘‡๐Ÿป
-------------------
-------------------
| index |  bucket |
-------------------
|   0   |  Korea  | ๐Ÿ‘‡๐Ÿป
|   1   |Singapore| -> find free space ! 
|   2   |         |
|   3   |  Thiwan | 
|   4   |  Spain  | 
-------------------

๐Ÿ“Ž ํƒ์ƒ‰ํ•˜๋Š” ๋ฐฉ๋ฒ•

  • insertํ•  ๋•Œ๋ž‘ ๊ฐ™์€ ํ•จ์ˆ˜ ์‚ฌ์šฉ
  • ๋ชป ์ฐพ์œผ๋ฉด ๋งž๋Š” ๊ฒƒ์„ ์ฐพ๊ธฐ ์œ„ํ•ด linear probingํ•จ

๐Ÿ“Ž ๋ฌธ์ œ์ 

  • homeless ํ‚ค๊ฐ€ ์‚ญ์ œ๋  ์ˆ˜๋„ ์žˆ์Œ
    => ์—ฌ๊ธฐ์„œ ๋ฐœ์ƒํ•˜๋Š” ๋ฌธ์ œ : ์‚ญ์ œ๋œ homeless ํ‚ค๊ฐ€ ์‚ญ์ œ๋˜๊ณ , ๊ทธ ๋‹ค์Œ์˜ homeless ํ‚ค๋ฅผ ์ฐพ๊ณ  ์‹ถ์„ ๋•Œ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•จ
0Korea
1Singapore
2Qatar
3Sri Lanka
4China
5Taiwan
6Spain

์—ฌ๊ธฐ์„œ homeless key์ธ Singapore๋ฅผ ์‚ญ์ œํ•จ

0Korea
1๐Ÿ‘ˆ๐Ÿป ์—ฌ๊ธฐ์„œ ์„œ์น˜๊ฐ€ ๋ฉˆ์ถค
2Qatar
3Sri Lanka
4China
5Taiwan
6Singapore

๊ทธ ํ›„์— Sri Lanka๋ฅผ ์ฐพ์œผ๋ ค๊ณ  ํ•  ๋•Œ ์„œ์น˜๊ฐ€ ๋ฉˆ์ถฐ๋ฒ„๋ฆผ

โœ”๏ธ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด State ๋ฅผ ๋‘๊ธฐ ์‹œ์ž‘ => Occupied / Empty / Deleted

  • ํ•˜์ง€๋งŒ ํฐ ๋ฌธ์ œ์ ์ด ์žˆ์Œ
    homeless key๋ฅผ ์œ„ํ•ด ๋นˆ ๊ณต๊ฐ„์„ ์คฌ์ง€๋งŒ ๊ทธ๊ฑธ๋กœ ์ธํ•ด ์—ฐ์‡„์ ์œผ๋กœ homeless key๊ฐ€ ๋ฐœ์ƒ
    => ๊ฒฐ๊ตญ search performane๊ฐ€ O(n)

๐Ÿ“Œ 2) Quadratic Probing
Linear probing์˜ ์ˆ˜์ • ๋ฒ„์ „์ž„
collision์ด ๋ฐœ์ƒํ•˜๋ฉด k^2๋งŒํผ ์›€์ง์ด๋ฉด์„œ ๊ฒ€์ƒ‰ํ•ด์„œ ๋น„์–ด์žˆ๋Š” ๋ฒ„ํ‚ท์„ ์ฐพ์Œ

  • ๋ถ€๋ถ„์ ์œผ๋กœ Linear probing์˜ ๋ฌธ์ œ์ ์ด์—ˆ๋˜ ๊ทธ ์—ฐ์‡„์ ์œผ๋กœ homeless key ๋ฐœ์ƒ์„ ํ•ด๊ฒฐํ•จ
  • ์ฒ˜์Œ์— k=1, 1๋งŒํผ ์›€์ง์—ฌ์„œ ๋นˆ์นธ์„ ์ฐพ๊ณ  ๋นˆ ๋ฒ„ํ‚ท์ด ์•„๋‹ˆ๋ผ๋ฉด k=2, 2^2๋งŒํผ ์›€์ง์—ฌ์„œ ๋นˆ ๋ฒ„ํ‚ท์„ ์ฐพ์Œ

๐Ÿ“Œ 3) Double Hashing
๋‘๊ฐœ์˜ hash function์„ ์‚ฌ์šฉํ•จ
ํ•˜๋‚˜๋Š” ์ธ๋ฑ์Šค๋ฅผ ์ฐพ๋Š”๋ฐ ์‚ฌ์šฉ, ํ•˜๋‚˜๋Š” homeless key๋ฅผ ๋„ฃ๋Š” interval ์ธ๋ฑ์Šค๋ฅผ ์ฐพ๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉํ•จ

  • ์˜ˆ๋ฅผ ๋“ค์–ด, h1 = key mod 13 / h2 = 1 + key mod 11 ์ด๋ ‡๊ฒŒ ์‚ฌ์šฉ
  • h2๋Š” h1๊ณผ ๊ฐ™์œผ๋ฉด โŒ

โœ”๏ธ linear & quadratic๊ณผ์˜ ์ฐจ์ด์  : ์ธ๋ฑ์Šค๊ฐ€ key๊ฐ’์— ๋”ฐ๋ผ ๊ฒฐ์ •๋œ๋‹ค.

๐Ÿš€ Hashing์˜ ๋ฌธ์ œ์ 

  • Range ์ฟผ๋ฆฌ์— ์‚ฌ์šฉํ•  ์ˆ˜ ์—†์Œ
  • ์–ด๋ ˆ์ด๋ฅผ ์‚ฌ์šฉํ•ด์„œ ์–ด๋ ˆ์ด๋ฅผ ์‚ฌ์šฉํ•  ๋•Œ์˜ ๋ณดํŽธ์ ์ธ ๋ฌธ์ œ์ ์ด ๋ฐœ์ƒ
profile
๐Ÿ‘ฉ๐Ÿปโ€๐Ÿ’ป๐Ÿ”ฅ

0๊ฐœ์˜ ๋Œ“๊ธ€