This post is based on the book 262가지 문제로 정복하는 코딩 인터뷰 in Java and may include some additional material.


Day1


☁️ Intro: Which recommendation algorithms are commonly used?

What is a recommender system?

  • As the internet has grown, it has become easy to collect user feedback on item purchases and preferences. The basic idea of a recommender system is to analyze this past user-item data and recommend items based on it.
  • In short, it is a system that predicts which Items a User will prefer.

Structure of a recommender system

  • The overall pipeline is split into a "candidate generation" stage and a "ranking" stage.
  1. Candidate generation
    a. Selects candidate items from among millions of items, based on the user's activity history.
    b. The candidates should be relevant to the user with high precision; this stage provides only broad personalization, via collaborative filtering.
    c. Presenting the best few items in a list requires a fine-grained representation that can distinguish the relative importance of candidates retrieved with high recall.

  2. Ranking
    a. Using Features that describe the item and the user, each item is assigned a score according to the desired objective function; the items are then sorted by score and the highest-scoring ones are shown to the user.
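The two-stage structure above can be sketched roughly as follows. This is only an illustration: the function names, the co-view table, and the scores are made-up assumptions, not something taken from the original post or from a real system.

```python
from collections import Counter


def generate_candidates(history, co_views, limit=100):
    """Stage 1: gather items that were co-viewed with items in the user's history."""
    counts = Counter()
    for item in history:
        for other, n in co_views.get(item, {}).items():
            if other not in history:
                counts[other] += n
    return [item for item, _ in counts.most_common(limit)]


def rank(candidates, score_fn, k=3):
    """Stage 2: score every candidate and keep the k best, highest score first."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


# Hypothetical data: how often two items were viewed together.
co_views = {"a": {"b": 5, "c": 2}, "d": {"c": 4, "e": 1}}
history = {"a", "d"}

candidates = generate_candidates(history, co_views)

# Hypothetical ranking scores, standing in for a learned objective function.
scores = {"b": 0.2, "c": 0.9, "e": 0.5}
top = rank(candidates, scores.get, k=2)
```

The first stage is cheap and broad, while the second stage can afford a more expensive scoring function because it only sees a short candidate list.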

Representative recommendation algorithms

  • Collaborative Filtering
  • Content-based Recommender Systems
  • Knowledge-based Systems

Goals of a recommendation algorithm

  • Prediction version of the problem (= Matrix Completion Problem)
    • The goal is to accurately predict a user's preferences from training data.
    • This is the general formulation (solving it also makes the Ranking problem solvable).
  • Ranking version of the problem
    • The goal is to select the top-k items by rank, not to predict exact values.
    • In practice people pick whichever product is relatively better, without computing and comparing an exact score for each, so the Ranking formulation is the more natural one.

Basic Models of Recommender Systems

  • Collaborative Filtering

    • If two users have similar interests, recommend items to one of them based on the other's data.

    • Items are filtered or selected from many options by taking the preferences of other users into account.


    • Memory-based methods

      • Also called neighborhood-based collaborative filtering algorithms.
      • User-based collaborative filtering
        • The more similar two users are, the higher the weight given.
        • Recommend items preferred by other users in the same group.
        • Typically, a group is formed from the Top K users most similar to a given user A, and the group's preferred items are recommended.
      • Item-based collaborative filtering
        • To predict a user's preference for an item B, select the Top K items most similar to B to form an Item Set.

    • Model-based methods

      • Built on predictive models from machine learning and data mining.
      • If the model is parameterized, its parameters are learned within that context.

    • Limitations of collaborative filtering
      Cold start

      • It works from previously observed results, so it does not work properly when there is no data.
      • Because collaborative filtering relies on users' data, nothing is known about a new user and no recommendation can be made.

      Computational cost

      • Collaborative filtering is computationally heavy, and computation time grows further as the number of users grows.
      • More users means more data and potentially better accuracy, but also longer computation and lower efficiency.

      Long tail

      • Users pay attention only to a small number of popular items, so items that receive little attention never get recommended.
      • In other words, a few popular items end up accounting for most of what is shown.

  • Content-based Filtering

    • Recommends, at the present time, items similar to those the user has experienced in the past.

    • It combines information retrieval with learning the user's tastes from past information.


    • After data is collected, content analysis has to extract relevant information from unstructured data.

      • Perform feature extraction, vector representation, and similar steps.
      • Build a user profile that captures the items the user prefers and their tastes.
      • Select similar items using cosine similarity.
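The content-based steps above (feature vectors, a user profile, cosine similarity) can be illustrated with a small sketch. The item names and the three-dimensional feature vectors are hypothetical, and averaging the liked items is just one simple way to build a profile.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_profile(liked_vectors):
    """User profile = component-wise mean of the vectors of liked items."""
    n = len(liked_vectors)
    return [sum(column) / n for column in zip(*liked_vectors)]


# Hypothetical item features: [sci-fi, romance, action]
items = {
    "space_saga": [1.0, 0.0, 1.0],
    "love_story": [0.0, 1.0, 0.0],
    "robot_war": [1.0, 0.0, 0.8],
    "rom_com": [0.1, 1.0, 0.0],
}
liked = ["space_saga"]

profile = build_profile([items[name] for name in liked])
ranked = sorted(
    (name for name in items if name not in liked),
    key=lambda name: cosine(profile, items[name]),
    reverse=True,
)
```

Because only the item's own features are needed, a brand-new item can be ranked as soon as its feature vector exists.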

    • Limitations of content-based filtering
      Difficulty recommending across diverse item types
      • Because content-based filtering analyzes the content itself, it can mitigate the cold-start problem of collaborative filtering, at least for new items.
      • However, when music, photos, and videos have to be recommended together, the information that can be extracted differs for each type, which makes it hard to build a profile.
      • In short, recommending across diverse item types is somewhat difficult.

  • Hybrid Filtering
    • Combines collaborative filtering with a knowledge-based recommender system


Day2


☁️ How to design a recommender system

🔎 Design a system that automatically places articles related to the current article.

Hint : The technical core is finding the list of related articles.

  • Solution 01
    • Add recently popular articles
      • Jingle could hire people to tag articles that look important.
      • Tags related to finance, sports, politics, and so on can also be added to articles.
      • Tags could also be obtained from the HTML meta tags or the page title.

  • Solution 02
    • Add recent news articles
      • Send randomly selected articles to random readers, then check how popular those articles are.
      • Popular articles will be read more often.

  • Solution 03
    • Automatic textual analysis
      • A more sophisticated approach: compute how similar a pair of articles is and use that value.
      • Similarity = a real number
      • Measure how many words the two articles have in common.
      • Points to consider
        • Ignore very common words such as "for" and "the".
        • Rare words such as "arbitrage" and "diesel" should carry more weight than common words such as "sale" and "international".
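One common way to realize the two points above (ignore stop words, weight rare words more) is TF-IDF with cosine similarity. The sketch below assumes that technique; the stop-word list, the smoothed IDF formula, and the sample sentences are illustrative choices, not something prescribed by the source.

```python
import math
from collections import Counter

# A tiny, hypothetical stop-word list.
STOP_WORDS = {"for", "the", "a", "of", "in", "and", "to"}


def tokenize(text):
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one {word: weight} dict per document; rarer words get larger weights."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    doc_freq = Counter(w for tokens in tokenized for w in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency.
        vectors.append({w: c * (math.log((1 + n) / (1 + doc_freq[w])) + 1) for w, c in tf.items()})
    return vectors


def cosine(u, v):
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


docs = [
    "the arbitrage sale in diesel markets",
    "diesel arbitrage for international traders",
    "the international sale of football tickets",
]
vectors = tfidf_vectors(docs)
sim_01 = cosine(vectors[0], vectors[1])
sim_02 = cosine(vectors[0], vectors[2])
```

Here the first two articles share "arbitrage" and "diesel", so they come out more similar to each other than the first and third, which share only "sale".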

  • Textual analysis runs into several problems.

    • Two words can be spelled the same and still mean different things.
      • Depending on context, i.e. whether the topic is AIDS or computer security, "anti-virus" can mean different things.
      • In such cases, gathering data from many users and applying collaborative filtering can help.
  • Look at the cookies and timestamps in the web server log files

    • They show which articles each user has read.
    • If many users read A and B together in the same session, it is reasonable to recommend B to a user who reads A.
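The session idea above amounts to counting how often two articles are read together. A minimal sketch, assuming the log has already been grouped into per-session lists of article ids (the ids and sessions here are hypothetical):

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_read_counts(sessions):
    """For every article, count how often each other article appears in the same session."""
    counts = defaultdict(Counter)
    for articles in sessions:
        for a, b in permutations(set(articles), 2):
            counts[a][b] += 1
    return counts


def related(article, counts, k=2):
    """Return up to k articles most often read together with the given one."""
    return [other for other, _ in counts[article].most_common(k)]


# Hypothetical sessions reconstructed from cookies and timestamps.
sessions = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["A", "B"], ["D"]]
counts = co_read_counts(sessions)
```

An article that was only ever read alone, like "D" here, simply gets an empty list, which is the cold-start situation described earlier.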



Day3


☁️ Looking for recommender system designs that use Spring Boot

Reference textbook : 추천 시스템 입문 ("Introduction to Recommender Systems")

Genre-based recommender systems
1️⃣ [SpringBoot] Designing a simple genre-based movie recommendation API
2️⃣ [Toy project] Movie recommendation
3️⃣ Implementing a recommendation engine with Spring Boot and a machine learning model
4️⃣ Collaborative filtering

Real-time recommender systems
1️⃣ Question about how to implement a real-time recommender system



Day4


☁️ Let's design a movie recommendation API with Spring Boot.

Source

Design process for a SpringBoot movie recommendation API

The code and content come from links 1 and 2 above!

Scenario

  • Use the MovieLens data -> store all of it in MySQL
  • Send the full movie list to the user
  • The user selects 10 movies and sends them back to the server
  • Extract only the genres from the selected movies (pick the 2 most frequently selected genres)
  • From the full movie list, return the movies that contain both selected genres as recommendations
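Before the Spring Boot version, the scenario can be condensed into a few lines of Python to make the logic explicit. The movie ids and genre sets below are hypothetical stand-ins for the MovieLens data, and the helper names are made up for this sketch.

```python
from collections import Counter

# Hypothetical catalogue: movie id -> set of genres.
movies = {
    1: {"Action", "Sci-Fi"},
    2: {"Action", "Thriller"},
    3: {"Sci-Fi", "Action"},
    4: {"Romance"},
    5: {"Action", "Sci-Fi", "Adventure"},
    6: {"Sci-Fi"},
}


def top_genres(picked_ids, catalogue, n=2):
    """Count genres over the picked movies and keep the n most frequent."""
    counts = Counter(g for movie_id in picked_ids for g in catalogue[movie_id])
    return {genre for genre, _ in counts.most_common(n)}


def recommend(picked_ids, catalogue, limit=10):
    """Movies that contain every top genre and were not already picked."""
    wanted = top_genres(picked_ids, catalogue)
    return [
        movie_id
        for movie_id, genres in sorted(catalogue.items())
        if wanted <= genres and movie_id not in picked_ids
    ][:limit]


picked = [1, 2, 3]
recommended = recommend(picked, movies)
```

The two conditions inside `recommend` correspond to the two predicates that the QueryDSL repository builds later in this post.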

Dependency

  • Add the required dependencies
    • Required : Spring Data JPA, QueryDSL, MySQL

MovieLens data


Loading the movie data

  • Entity setup

    • id : primary key
    • title : movie title
    • tid : tmdb Id (for the frontend)
    • genres : holds Genres as a set

    😊 MovieEntity

    • Movie entity

    😊 GenresEntity

    • Genre entity

  • Controller
    😊 dummyController

    • Writes all movie information to the database
    • Reads the movies.csv file and parses it line by line
    • movieId and title are split on ','
    • Genres, in particular, are separated by |.
    // Skip the first (header) line -> uses the skipFirstLine variable
    movieRepo.save(Movies.builder()
                      .id(movieId).tId(MovieIdToTid.get(movieId))
                      .title(title.toString())
                      .genres(Arrays.stream(genre)
                              .map(genreService::findOrCreateNew)
                              .collect(Collectors.toSet()))
                      .build());
    // findOrCreateNew -> if a genre with this name already exists in the database, use it; otherwise save a new one
    package com._chanho.movie_recommendation.genre;
    
    import lombok.RequiredArgsConstructor;
    import org.springframework.stereotype.Service;
    
    @Service
    @RequiredArgsConstructor
    public class GenreService {
        private final GenresRepo genresRepo;
    
        public Genres findOrCreateNew(String name) {
            return genresRepo.findByName(name).orElseGet(
                    () -> genresRepo.save(new Genres(name))
            );
        }
    }
    // Register a bean for converting movieId -> tmdbId
    package com._chanho.movie_recommendation.api;
    
    import lombok.RequiredArgsConstructor;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    
    import java.io.*;
    import java.util.HashMap;
    import java.util.Map;
    
    @Configuration
    @RequiredArgsConstructor
    public class AppConfig {
    
        @Bean
        public Map<Long, Long> MovieIdToTid() {
            Map<Long, Long> movieIdToTid = new HashMap<>();
    
            File csv = new File("{your_path}\\links.csv");
            // try-with-resources closes the reader even when parsing fails
            try (BufferedReader br = new BufferedReader(new FileReader(csv))) {
                br.readLine(); // skip the header line
                String line;
                while ((line = br.readLine()) != null) {
                    String[] token = line.split(",");

                    // rows without a tmdbId have fewer than three columns and are skipped
                    if (token.length > 2) {
                        Long movieId = Long.parseLong(token[0]);
                        Long tId = Long.parseLong(token[2]);

                        movieIdToTid.put(movieId, tId);
                    }
                }
            } catch (IOException e) {
                // fail fast instead of continuing with a half-filled map
                throw new UncheckedIOException("Could not read links.csv", e);
            }
            return movieIdToTid;
        }
    }

  • links.csv
    • Some movieIds have no matching tmdbId, so those rows need special handling so that only movies with a tmdbId are returned

😊 MovieController

  • Write a controller that returns all movies

  • Use pagination to return 100 movies at a time

    package com._chanho.movie_recommendation.movie;
    
    import com._chanho.movie_recommendation.genre.Genres;
    import lombok.RequiredArgsConstructor;
    import lombok.extern.slf4j.Slf4j;
    import org.springframework.data.domain.Page;
    import org.springframework.data.domain.Pageable;
    import org.springframework.data.web.PageableDefault;
    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.*;
    
    import javax.persistence.criteria.CriteriaBuilder;
    import java.util.*;
    import java.util.stream.Collectors;
    
    @Slf4j
    @RestController
    @RequiredArgsConstructor
    @RequestMapping("/api/movie")
    public class MovieController {
        private final MovieRepo movieRepo;
        private final MovieService movieService;
    
        @GetMapping("/movies")
        public ResponseEntity retrieveMovies(@PageableDefault(size = 100) Pageable pageable) {
            return movieService.retrieveMovies(pageable);
        }
    
        @PostMapping("/recommendation")
        public List<Movies> getRecommendation(@RequestBody RecommendationDto recommendationDto) {
            HashMap<Genres, Integer> pickedGenres = movieService.getPickedGenres(recommendationDto);
            HashMap<Genres, Integer> pickedGenresWithSort = movieService.sortByValue(pickedGenres);
            Set<Genres> selectBestInKeys = movieService.selectKeyInMap(pickedGenresWithSort);
            return movieRepo.findByGenres(selectBestInKeys, recommendationDto);
        }
    }

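  • Service
    😊 MovieService

    package com._chanho.movie_recommendation.movie;

    import com._chanho.movie_recommendation.genre.Genres;
    import com._chanho.movie_recommendation.genre.GenresRepo;
    import lombok.RequiredArgsConstructor;
    import lombok.extern.slf4j.Slf4j;
    import org.springframework.data.domain.Page;
    import org.springframework.data.domain.Pageable;
    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.stereotype.Service;
    
    import javax.transaction.Transactional;
    import java.util.*;
    import java.util.stream.Collectors;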
    
    @Slf4j
    @Service
    @Transactional
    @RequiredArgsConstructor
    public class MovieService {
    
        private final MovieRepo movieRepo;
        private final GenresRepo genresRepo;
    
        public ResponseEntity retrieveMovies(Pageable pageable) {
            Page<Movies> moviesPage = movieRepo.findAll(pageable);
            return new ResponseEntity<>(moviesPage, HttpStatus.OK);
        }
    
        public HashMap<Genres, Integer> sortByValue(HashMap<Genres, Integer> raw) {
            return raw.entrySet()
                    .stream()
                    // descending order, so the most frequently picked genres come first
                    .sorted((i1, i2) -> i2.getValue().compareTo(i1.getValue()))
                    .collect(Collectors.toMap(
                            Map.Entry::getKey,
                            Map.Entry::getValue,
                            (e1, e2) -> e2, LinkedHashMap::new
                    ));
        }
    
        public HashMap<Genres, Integer> getPickedGenres(RecommendationDto recommendationDto) {
            HashMap<Genres, Integer> pickedGenres = new HashMap<>();
            recommendationDto.getPickedMovies().forEach(
                    movieData -> {
                        Movies movie = movieRepo.findById(movieData.getMovieId()).orElseThrow(
                                () -> new IllegalStateException("Cannot find Movies with given id: " + movieData.getMovieId().toString()));
    
                        Set<Genres> genresList = movie.getGenres();
                        for(Genres g : genresList) {
                            Integer count = pickedGenres.getOrDefault(g, 0);
                            pickedGenres.put(g, count + 1); // count one more pick for this genre
                        }
                    }
            );
    
            return pickedGenres;
        }
    
        public Set<Genres> selectKeyInMap(HashMap<Genres, Integer> pickedGenresWithSort) {
            Iterator<Genres> keys = pickedGenresWithSort.keySet().iterator();
            Set<Genres> selectBestInKeys = new HashSet<>();
            int count = 0;
            while(keys.hasNext() && count < 2) {
                Genres genres = keys.next();
                selectBestInKeys.add(genres);
                count++;
            }
    
            return selectBestInKeys;
        }
    }

  • DTO
    ๐Ÿ˜Š RecommendationDTO

    • sortByValue sorts the Map entries in descending order of Value
    • Only the 2 most frequently selected genres are used to fetch matching movies
    • selectKeyInMap returns the 2 most frequently selected genres from the sorted Map
    package com._chanho.movie_recommendation.movie;
    
    import lombok.AllArgsConstructor;
    import lombok.Builder;
    import lombok.Data;
    import lombok.NoArgsConstructor;
    
    import java.util.List;
    
    @Data @Builder
    @NoArgsConstructor @AllArgsConstructor
    public class RecommendationDto {
        private Long userId;
        private List<MovieData> pickedMovies;
    }
    
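    @Data @Builder
    @NoArgsConstructor @AllArgsConstructor // Jackson needs a no-args constructor to bind the request body
    class MovieData{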
        private Long tId;
        private Long movieId;
        private Double rating;
    }

  • Repository

    • Apply QueryDSL

    • Build 2 Predicates

      1. Does the movie contain both of the 2 selected Genres?
      2. Is the movie not already included in the recommendationDTO?
    • Left join with the Genres table and fetch

      ๐Ÿ˜Š MovieRepo

      package com._chanho.movie_recommendation.movie;
      
      import org.springframework.data.domain.Page;
      import org.springframework.data.domain.Pageable;
      import org.springframework.data.jpa.repository.EntityGraph;
      import org.springframework.data.jpa.repository.JpaRepository;
      import org.springframework.stereotype.Repository;
      
      import javax.transaction.Transactional;
      
      @Repository
      public interface MovieRepo extends JpaRepository<Movies, Long>, MovieRepoExtension {
      
          @EntityGraph(value = "Movies.withGenres", type= EntityGraph.EntityGraphType.FETCH)
          Page<Movies> findAll(Pageable pageable);
      }

      ๐Ÿ˜Š MovieRepoExtension

      package com._chanho.movie_recommendation.movie;
      
      import com._chanho.movie_recommendation.genre.Genres;
      
      import java.util.List;
      import java.util.Set;
      
      public interface MovieRepoExtension {
          List<Movies> findByGenres(Set<Genres> genres, RecommendationDto recommendationDto);
      }

      ๐Ÿ˜Š MovieRepoExtensionImpl

      package com._chanho.movie_recommendation.movie;
      
      import com._chanho.movie_recommendation.genre.Genres;
      import com._chanho.movie_recommendation.genre.QGenres;
      import com.querydsl.core.BooleanBuilder;
      import com.querydsl.jpa.JPQLQuery;
      import org.springframework.data.jpa.repository.support.QuerydslRepositorySupport;
      
      import java.util.List;
      import java.util.Set;
      
      public class MovieRepoExtensionImpl extends QuerydslRepositorySupport implements MovieRepoExtension {
          public MovieRepoExtensionImpl() {
              super(Movies.class);
          }
      
          @Override
          public List<Movies> findByGenres(Set<Genres> genres, RecommendationDto recommendationDto) {
              com._chanho.movie_recommendation.movie.QMovies movies
                      = com._chanho.movie_recommendation.movie.QMovies.movies;
      
              BooleanBuilder containGenres = new BooleanBuilder();
              genres.forEach(genre -> {
                  containGenres.and(movies.genres.contains(genre));
              });
      
              BooleanBuilder notInRecommendation = new BooleanBuilder();
              recommendationDto.getPickedMovies().forEach(movieData -> {
                  notInRecommendation.and(movies.id.notIn(movieData.getMovieId()));
              });
      
              JPQLQuery<Movies> query = from(movies)
                      .where(containGenres)
                      .where(notInRecommendation)
                      .leftJoin(movies.genres, QGenres.genres).fetchJoin()
                      .distinct().limit(10);
      
              return query.fetch();
          }
      }


A closer look at the recommendation engine

🔎 Implementing a recommendation engine with Spring Boot and a machine learning model

The code and content come from link 3 above!

What is a recommendation engine?

  • A system that uses various algorithms and data analysis techniques to provide personalized recommendations to users
  • The basic Spring Boot setup is much the same as usual, so it is not described separately.

  • Preparing the machine learning model

    • Preparing the data

      • Read a CSV file containing data such as user reviews and product ratings
      CREATE TABLE user (
      id SERIAL PRIMARY KEY,
      name VARCHAR(100),
      email VARCHAR(100) UNIQUE
      );
      
      CREATE TABLE product (
      id SERIAL PRIMARY KEY,
      name VARCHAR(100),
      category VARCHAR(100)
      );
      
      CREATE TABLE review (
      id SERIAL PRIMARY KEY,
      user_id BIGINT REFERENCES user(id),
      product_id BIGINT REFERENCES product(id),
      rating INT,
      review_text TEXT
      );
    • Choosing the machine learning model

      • Use collaborative filtering
      • Matrix Factorization is a representative algorithm.
      • Train a collaborative filtering model with Python's Surprise library
      # 1. Train the model
      
      import pandas as pd
      from surprise import Dataset, Reader, SVD
      from surprise.model_selection import train_test_split
      from surprise import accuracy
      
      # Load the data ('reviews.csv' is a placeholder name for the review CSV described above)
      df = pd.read_csv('reviews.csv')
      reader = Reader(rating_scale=(1, 5))
      data = Dataset.load_from_df(df[['user_id', 'product_id', 'rating']], reader)
      trainset, testset = train_test_split(data, test_size=0.25)
      
      # Train the model
      model = SVD()
      model.fit(trainset)
      
      # Predict and evaluate
      predictions = model.test(testset)
      accuracy.rmse(predictions)
      # 2. Save and load the model
      import joblib
      
      # Save the model
      joblib.dump(model, 'svd_model.pkl')
      
      # Load the model
      loaded_model = joblib.load('svd_model.pkl')

  • Integrating the machine learning model with the Spring Boot application

    • Write the service class
    @Service
    public class RecommendationService {
    
    private final PythonInterpreter pythonInterpreter;
    
    @Autowired
    public RecommendationService() {
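    // Call Python through PythonInterpreter (presumably Jython's org.python.util.PythonInterpreter).
    // Note: Jython runs Python 2.7 and cannot load CPython C extensions, so loading
    // joblib and a Surprise model this way should be read as pseudocode.
    pythonInterpreter = new PythonInterpreter();
    pythonInterpreter.exec("import joblib");
    pythonInterpreter.exec("model = joblib.load('path/to/svd_model.pkl')");
    }
    
    public List<Product> getRecommendations(Long userId) {
    // Call the Python script and generate recommendations
    // (Surprise's predict expects both a user id and an item id and returns a single estimate)
    pythonInterpreter.set("user_id", userId);
    pythonInterpreter.exec("recommendations = model.predict(user_id)");
    // Convert the result into a list of recommendations and return it
    // ...
    return recommendations;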
    }
    }
    • Implement the REST API
    @RestController
    @RequestMapping("/api/recommendations")
    public class RecommendationController {
    
    @Autowired
    private RecommendationService recommendationService;
    
    @GetMapping("/{userId}")
    public ResponseEntity<List<Product>> getRecommendations(@PathVariable Long userId) {
    List<Product> recommendations = recommendationService.getRecommendations(userId);
    return ResponseEntity.ok(recommendations);
    }
    }


Day5


☁️ A closer look at collaborative filtering (1)

Source : 추천 시스템 기본 - 협업 필터링(Collaborative Filtering) - ① (Recommender System Basics - Collaborative Filtering, part 1)

  • Main approaches to collaborative filtering
    • User-based Filtering
      • Select a specific User, e.g. a friend recommendation service on a social network
      • Find similar users based on rating similarity
        - Recommend Items that the similar users liked
    • Item-based Filtering
      • Select a specific Item and find the users who liked it
      • Find other Items that those users liked

  • How similarity is measured

    • Both approaches rely on a measured similarity (distance)

    • Commonly used measures

      • Cosine Similarity

        • A widely used similarity that evaluates whether user u and user u' point in the same direction
      • Pearson Correlation Similarity

        • Based on how far each rating deviates from the user's average tendency; also called Centered Cosine Similarity.
        • Its drawback is that user u and user u' need to have many items in common.
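The relationship between the two measures can be seen in a few lines: Pearson correlation is cosine similarity applied after subtracting each user's mean rating. The rating vectors below are hypothetical and assume all three users rated the same four items.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pearson(u, v):
    """Centered cosine: subtract each user's own mean before comparing."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


# Hypothetical ratings of the same four items.
alice = [5, 4, 5, 4]   # generous rater
bob = [2, 1, 2, 1]     # strict rater with the same taste as alice
carol = [4, 5, 4, 5]   # generous rater with the opposite taste

cos_alice_carol = cosine(alice, carol)
pearson_alice_bob = pearson(alice, bob)
pearson_alice_carol = pearson(alice, carol)
```

Plain cosine rates alice and carol as very similar because both give high scores overall, while the centered version separates them and still recognizes that alice and bob agree despite their different rating scales.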

  • Once similarities are computed, how do we recommend an Item?
    • Build the User-Item matrix
      • Goal : to recommend movies to the 4th user, predict the ratings of the movies the 4th user has not yet seen
    • Compute the similarity between Users
    • Infer the rating of the target item
      • Extract the ratings given by users who have seen the movies the 4th user has not
      • Compute each of those users' similarity to the 4th user
      • For each user, compute similarity x rating
      • Sum the similarity-weighted ratings
      • Divide by the sum of the weights to get the average rating -> the inferred rating for the movie the 4th user has not seen
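The last three steps are a similarity-weighted average: predicted rating = sum(similarity x rating) / sum(similarity). A minimal sketch with hypothetical numbers, assuming the similarities to the 4th user have already been computed:

```python
def predict_rating(neighbour_ratings, similarities):
    """Weighted average of neighbours' ratings, weighted by similarity to the target user.

    neighbour_ratings: user -> rating that user gave to the unseen movie
    similarities:      user -> similarity between that user and the target user
    """
    weighted_sum = sum(similarities[user] * rating for user, rating in neighbour_ratings.items())
    total_weight = sum(similarities[user] for user in neighbour_ratings)
    return weighted_sum / total_weight if total_weight else None


# Hypothetical similarities to the 4th user and ratings of one unseen movie.
similarities = {"user1": 0.9, "user2": 0.5, "user3": 0.1}
ratings_for_movie = {"user1": 5, "user2": 3, "user3": 1}

predicted = predict_rating(ratings_for_movie, similarities)
```

With these numbers the prediction is 6.1 / 1.5, a little above 4, because the most similar user rated the movie highest.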

  • Pros and cons of the Memory-based Approach

    • Pros

      • The approach is simple and needs no optimization or training
    • Cons

      • Performance degrades when the data is sparse
        • because there is less to compare against
      • Limited scalability ( ∵ computation grows as the number of comparisons grows)


Day6


☁️ A closer look at collaborative filtering (2)

Source : 추천 시스템 기본 - 협업 필터링(Collaborative Filtering) - ② (Recommender System Basics - Collaborative Filtering, part 2)

  • Model based Approach
    • An approach that can make up for the weaknesses of the Memory Based Approach
    • Uses Machine Learning algorithms to predict ratings for items the user has not yet rated
    • Basic IDEA : a user's preferences can be determined by a small number of Hidden Factors!
    • Matrix Factorization (MF)
    • Non-Parametric methods
    • Deep Learning based methods

  • First: what is MF (matrix factorization)?

    • e.g. assume a 5-dimensional embedding ( n_factor = 5 ) for both User and Item

      • The User-Item matrix is decomposed into a "User-X" matrix and an "Item-A" matrix
    • MF can be carried out in several ways

      • Orthogonal Factorization (Singular Value Decomposition, SVD)
      • Non-Negative Matrix Factorization (NMF)
      • Probabilistic Matrix Factorization (PMF), among others.
      • Each User-X Embedding and Movie-A Embedding can represent several summarizing traits
        • Movie-A Embedding example
          • 1) how much science fiction it contains 2) how recent the movie is
        • User-X Embedding example
          • 1) how much the user likes science-fiction movies 2) how much the user prefers recent movies
        • In practice, however, it is not possible to know exactly what each Factor means.
      • The higher the inner product of the User-X Embedding and the Movie-A Embedding, the better a recommendation Movie-A is for User-X!
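To make the embedding idea concrete, here is a small matrix-factorization sketch trained with stochastic gradient descent. It assumes a plain regularized squared-error objective; the ratings, hyperparameters, and variable names (`user_x`, `item_a`) are illustrative choices and not the exact SVD variant used by any particular library.

```python
import random


def dot(p, q):
    return sum(a * b for a, b in zip(p, q))


def factorize(ratings, n_users, n_items, n_factors=5,
              epochs=2000, lr=0.02, reg=0.02, seed=0):
    """Learn user and item embeddings whose inner product approximates known ratings."""
    rng = random.Random(seed)
    user_x = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    item_a = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(user_x[u], item_a[i])
            for f in range(n_factors):
                pu, qi = user_x[u][f], item_a[i][f]
                # Gradient step on squared error with L2 regularization.
                user_x[u][f] += lr * (err * qi - reg * pu)
                item_a[i][f] += lr * (err * pu - reg * qi)
    return user_x, item_a


# Hypothetical (user, item, rating) triples; some user-item pairs are missing.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
user_x, item_a = factorize(ratings, n_users=3, n_items=3)

rmse = (sum((r - dot(user_x[u], item_a[i])) ** 2 for u, i, r in ratings) / len(ratings)) ** 0.5
# Inner product for a pair that was never rated: this is the "matrix completion" step.
prediction = dot(user_x[0], item_a[2])
```

The learned factors have no predefined meaning, which matches the point above that individual Factors cannot be interpreted exactly.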

  • Second: the Non-Parametric approach
    • Similar in idea to Memory-based recommender systems; it uses the similarity between Users and Items.
    • Predicts a user's Rating of an Item through a weighted average
      • What is the difference?
        • Instead of Pearson Correlation or Cosine Similarity, an Unsupervised Learning model is used
        • Restricting the computation to the K nearest Neighbor users improves scalability

  • Third: the Deep Learning approach
    • Material on collaborative filtering with Deep Learning is still fairly scarce
    • It can be viewed as an extension of matrix factorization (the factorization result can be used as Input)
      - SVD, PCA : decompose the Sparse matrix into 2 low-Rank orthogonal matrices (User-X, Item-A)
      - Deep Learning : the Embeddings do not have to be orthogonal; the model learns them itself.
      - Since the embeddings are looked up from the User-Item combination, one can apply Non-linear (ReLU), Linear, or Sigmoid Layers, and learn the weights with an optimization algorithm (SGD, Adam).

  • Ways to overcome the limits of collaborative filtering
    (Content-based filtering and hybrid approaches overlap with the material above, so they are not repeated here.)
    • Machine-learning recommender systems
      • By learning even minor user actions such as views and clicks, the system can propose candidates to recommend, then learn from the user's reactions to produce more refined results.


Day7


☁️ Papers on deep learning recommendation models

Q. So how is research on deep learning recommendation models progressing?


A. Let's find out through the following paper.

Source : [논문요약] 딥러닝 관련 추천 모델 - Survey(2019) ([Paper summary] Deep learning based recommendation models - Survey (2019))

Why deep learning models are needed in recommender systems

  • End-to-End learning

    • In content-based recommendation, processing review or image data is essential
    • Image and text data can be learned End-to-End, without separate preprocessing cost
  • Inductive Biases (similar to Generalization)

    • Deep learning models are well suited to Sequential data and Click-Log data
  • Can handle large data and complex problems

    • Problems such as Collaborative Ranking and Matrix Completion require training data that is both very large and complex
    • Deep learning models can learn from such large amounts of data to solve complex problems
    • With advances in GPUs, existing recommendation algorithms (Matrix Factorization, etc.) can also be recast as neural network architectures.

4 strengths of deep learning models in recommender systems

  • Non-Linear Transformation

    • Can capture complex User-Item interaction patterns.
    • Conventional Matrix Factorization models are linear, which limits their expressive power.
  • Representation Learning

    • Learns the features of the input data effectively
    • Reduces Hand-Crafted Feature Design, lowering Feature Engineering cost
    • Can process unstructured data such as images and text effectively
  • Sequence Modeling

    • Also works effectively on sequential data (User behavior, Item changes)
    • Can recommend the Item likely to be purchased next
  • Flexibility

    • Supported by Frameworks such as Keras, Tensorflow, and Pytorch.


3 weaknesses of deep learning models in recommender systems

  • Interpretability

    • Predictions of deep learning models are hard to interpret (Black-Box models)
    • Using an Attention model appears to make them somewhat more interpretable.
  • Data Requirement

    • Training a deep learning model needs a large amount of data; however, labeled data is easier to collect in recommender systems than in image or text domains.
  • Extensive Hyper-parameter Tuning

    • A general limitation of machine learning: there is a very large number of Hyper-Parameters to tune.

    • Research is reportedly under way on approaches that need only a single Hyper-Parameter.


Deep learning in recommender systems - SOTA

SOTA?

  • Short for state of the art: the best-performing model in each field.
  • MLP
    • Extends recommender systems that use linear methods by adding non-linearity.
    • Most recommendation models exploit the two-way interaction between User Preferences and Item Features
    • MLP models for recommender systems include 'Locally-Connected wide & deep Learning Model', 'Youtube DNN', and 'Collaborative Metric Learning(CML)'.

  • Autoencoder

  • Learning approaches

    • Bottleneck Layer -> learn the Feature Representation as Low-dimensional Features
    • Reconstruction Layer -> directly fill in the blanks of the Interaction Matrix
      • Most Autoencoder variants, including Denoising and Variational Autoencoders, can be used for recommendation!
  • Autoencoder-based collaborative filtering

  • Given a Partial Vector of a User or an Item as input, the Output Layer reconstructs it
    - Because User vectors have higher variance, Item-AutoRec > User-AutoRec holds

    • Performance varies with the combination of Activation functions.
    • Larger Hidden Unit size ⬆️ -> better performance
      • because a model with larger Capacity can model the features of the Input better.
    • Stacking more Layers can improve performance even further

  • Convolutional Neural Network(CNN)
  • Learning approaches
    • CNNs learn from unstructured data
    • A CNN model explores the interaction between visual content and the User's Factors to extract image Features
  • CNN-based recommender systems
    • ContagNet : concatenates a CNN and an MLP
    • Comparative Deep Learning(CDL) : learns from positive and negative images
  • CNN-based collaborative filtering
  • Builds an Interaction Map with the Outer Product; the CNN captures high-order correlations among embedding dimensions

  • Graph CNNs
    • Suited to relational data such as social networks or Knowledge Graphs
    • Recommendation data can be represented as a bipartite graph
    • Pinterest is reported to use a Graph CNN based recommender system.

  • Attention-based recommender systems

    What is the Attention mechanism?

    • Inspired by the way people focus on only a specific part of a whole image or sentence
  • Scores the Input with an Attention Score

  • 2 kinds of Attention : Vanilla Attention, Co-Attention
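As a rough illustration of scoring the input with an Attention Score, the sketch below uses dot-product scores normalized with softmax, which is one common form of vanilla attention. The query, keys, and values are hypothetical two- and three-dimensional vectors, and real models learn these from data.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Score each input against the query, then mix the values by those weights."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    context = [
        sum(w * value[d] for w, value in zip(weights, values))
        for d in range(len(values[0]))
    ]
    return weights, context


# Hypothetical user query and three previously viewed items.
user_query = [1.0, 0.0]
item_keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
item_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

weights, context = attend(user_query, item_keys, item_values)
```

The weights show which inputs the model focused on, which is why attention is often mentioned as a partial remedy for the interpretability problem described earlier.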

For more details, as well as future research directions and open problems, please refer to the post below!
📝 [논문요약] 딥러닝 관련 추천 모델 - Survey(2019) ([Paper summary] Deep learning based recommendation models - Survey (2019))
