Prototype Design

Jihoon · January 19, 2023

AI Backend Prototype data frontend model recommend steamgame

✅Backend

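1. Build the baseline

Set up __main__.py in the backend folder
Modify the existing inference main.py and place it in the inference folder
Only the folder path passed to uvicorn.run needs to change, and then it runs right away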
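
As a rough sketch of the entry point described above, `__main__.py` could hand uvicorn an import string for the app. The module path `inference.main`, the attribute name `app`, and the helper names are assumptions for illustration, not the project's confirmed layout. Port 8001 is taken from the URL the frontend calls later in this post.

```python
# backend/__main__.py (hypothetical layout)

def app_import_string(package: str = "inference", module: str = "main", attr: str = "app") -> str:
    """Build the "package.module:attr" string that uvicorn.run accepts."""
    return f"{package}.{module}:{attr}"


def serve(host: str = "127.0.0.1", port: int = 8001) -> None:
    """Start the server; only the import string changes if the folder moves."""
    import uvicorn  # imported lazily so the helper above works without uvicorn installed

    uvicorn.run(app_import_string(), host=host, port=port)


# In __main__.py this would end with:
#     serve()
```

If the inference folder moves, only the `package` argument would need to change.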

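2. POST the user's play data to the inference server

In BSS, for the part where we were unsure between GET and POST, shouldn't it be POST?

Why? GET fetches a resource that already exists, while POST sends data in the request body for the server to process.

Here the user's information is sent to the inference server, so POST fits.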
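
To make the GET vs. POST reasoning concrete, here is a minimal sketch that builds a POST request carrying the user's ID in a JSON body, using only the standard library. The helper name `build_recom_request` is hypothetical; the URL and the `{"id": ...}` payload are the ones used in the frontend code of this post.

```python
import json
import urllib.request

# Assumed endpoint: the URL used by the frontend code in this post.
RECOM_URL = "http://localhost:8001/recom"


def build_recom_request(steam_id: int, url: str = RECOM_URL) -> urllib.request.Request:
    """Build a POST request carrying the user's ID in a JSON body."""
    body = json.dumps({"id": steam_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_recom_request(76561198117856251)
# Sending it would be: urllib.request.urlopen(req)  (needs the inference server running)
```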

Login

Update User ▶ see the BSS code

✅Frontend

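import requests  # HTTP client; `app`, `templates` and `Request` come from the existing app

# Prototype: a fixed Steam ID is sent to the inference server
data = {'id': 76561198117856251}

@app.get("/login2")
def login2(request: Request):
    # The last 17 characters of openid.identity are the user's Steam ID
    steam_id = request.query_params['openid.identity'][-17:]
    # Use the `requests` library here, not the incoming `request` object
    response = requests.post('http://localhost:8001/recom', json=data)
    product = response.json()["products"][0]
    title = product["title"]
    images = product["images"]
    # Note: title and images are not passed to the template yet

    return templates.TemplateResponse('login2.html', context={"request": request, "steam_id": steam_id})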
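
Slicing the last 17 characters works as long as `openid.identity` is well formed. A slightly more defensive variant might look like the sketch below. It assumes the value has the form `https://steamcommunity.com/openid/id/<17 digits>`, and the helper name `extract_steam_id` is made up for this example.

```python
import re
from typing import Optional

# Assumption: the OpenID identity ends in "/openid/id/" followed by a 17-digit Steam ID.
_STEAM_ID_RE = re.compile(r"/openid/id/(\d{17})$")


def extract_steam_id(openid_identity: str) -> Optional[str]:
    """Return the Steam ID from an openid.identity value, or None if it does not match."""
    match = _STEAM_ID_RE.search(openid_identity)
    return match.group(1) if match else None
```

Returning `None` lets the handler decide what to do with a malformed login callback instead of silently using 17 arbitrary characters.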

✅Model

Changes

1. Inference output format: DataFrame → title, images (plus any other fields that turn out to be needed)
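
For reference, the frontend handler in this post reads `products[0]["title"]` and `products[0]["images"]` from the response. A minimal sketch of turning title and image lists into that shape could look like this. The helper name `to_products` is hypothetical, and whether `images` holds one link or several per product is an assumption.

```python
from typing import Dict, List


def to_products(titles: List[str], images: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Pair each title with its image in the {"products": [...]} shape the frontend reads."""
    return {
        "products": [
            {"title": title, "images": image}
            for title, image in zip(titles, images)
        ]
    }
```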

2. Model_load function

Why? The inference server has to pass the model into get_model_rec.

import torch

# Load Model
def get_model(model_path: str = "ml/best_model.pth") -> NeuMF:
    """Load the trained NeuMF model."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # NeuMF is the project's own model class; 8735 is the n_items value also passed to inference
    model = NeuMF(8735, 64, 1, 0.05).to(device)
    model.load_state_dict(torch.load(model_path, map_location=device))
    return model

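3. get_model_rec function

The prototype does not need the data coming from the frontend, so the only input required is the model loaded by the Model_load function (get_model in the code above).
Only the name of the existing main() function was changed.
→ The functions that will be imported are passed in as inputs.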

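import pandas as pd

def get_model_rec_prototype(get_user, model, inference):
    origin = pd.read_csv('useritem.csv')
    # The prototype uses a fixed Steam ID instead of the one sent by the frontend
    test_input = get_user(76561198117856251, origin)
    pred_list = inference(model, test_input, 8735)  # 8735 = n_items

    return pred_list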
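
Because the helper functions are passed in as inputs, the same flow can be exercised without the real model or CSV. The sketch below is one possible way to do that. The `fake_*` functions and the extra parameters (`user_id`, `origin`, `n_items`) are hypothetical stand-ins, not part of the project.

```python
from typing import Any, Callable, List


# Hypothetical stand-ins so the wiring can run without the real model or CSV.
def fake_get_user(user_id: int, origin: Any) -> dict:
    return {"userid": user_id}


def fake_inference(model: Any, test_input: dict, n_items: int) -> List[List[int]]:
    # Pretend the model ranks items in index order and return the first three.
    return [list(range(n_items))[:3]]


def get_model_rec(get_user: Callable, model: Any, inference: Callable,
                  user_id: int, origin: Any, n_items: int) -> List[List[int]]:
    """Same flow as the prototype, with the hard-coded values turned into parameters."""
    test_input = get_user(user_id, origin)
    return inference(model, test_input, n_items)


pred_list = get_model_rec(fake_get_user, model=None, inference=fake_inference,
                          user_id=76561198117856251, origin=None, n_items=8735)
```

Once the frontend sends a real user ID, only the `user_id` argument would need to change.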

4. inference function

Input → add model as a parameter and use it

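import joblib
import numpy as np
import pandas as pd
import torch
from PIL import Image

def inference_(model, test, n_items):
    item_encoder = joblib.load('item_encoder.joblib')

    # `model` is the already-loaded NeuMF instance from get_model(),
    # so it is not rebuilt or reloaded here.
    # TODO: update the model file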

    pred_list = []
    model.eval()
    
    query_user_ids = test['userid'].unique()  # all users to run inference for
    full_item_ids = np.arange(n_items)  # all candidate item indices
    device = next(model.parameters()).device  # run on whichever device the model is on
    for user_id in query_user_ids:
        with torch.no_grad():
            user_ids = torch.LongTensor(np.full(n_items, user_id)).to(device)
            item_ids = torch.LongTensor(full_item_ids).to(device)

            eval_output = model(user_ids, item_ids).detach().cpu().numpy()
            pred_u_score = eval_output.reshape(-1)

        pred_u_idx = np.argsort(pred_u_score)[::-1]  # highest score first
        pred_u = full_item_ids[pred_u_idx]
        pred_list.append(list(pred_u[:50]))  # keep the top 50 items
    
    ######################################################################################################
    # This part still needs to be fixed
    df = pd.DataFrame()
    df['profile_id'] = query_user_ids
    df['predicted_list'] = pred_list
    # Inference runs for one user at a time anyway, so this is defined outside the loop

    titles = {}
    posters = {}
    # top_k: 10; title and image have to be looked up in the DataFrame using the predicted indices
    # NOTE: `df` above has no "title" / "poster_link" columns yet, so this lookup fails until the item CSV is loaded
    for col_index in range(10):
        steam = df.iloc[col_index]
        titles[col_index] = steam["title"]
        posters[col_index] = steam["poster_link"] if steam["poster_link"] else Image.open("placeholder.png")

    title = [str(x) for x in titles.values()]
    poster = [str(x) for x in posters.values()]

    return title, poster
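
The last part of the function is marked as still needing work. One possible shape for the lookup is sketched below with plain dictionaries instead of the real CSV. The `metadata` mapping, the sample game entries, and the helper name `lookup_titles_posters` are all hypothetical, and the real column names still have to be checked. If the predicted indices are encoded, they would first need to be mapped back through `item_encoder`; that step is left out here.

```python
from typing import Dict, List, Tuple

PLACEHOLDER = "placeholder.png"  # same placeholder file name as in the function above


def lookup_titles_posters(
    pred_items: List[int],
    metadata: Dict[int, Dict[str, str]],
    top_k: int = 10,
) -> Tuple[List[str], List[str]]:
    """Map the first top_k predicted item indices to (titles, posters)."""
    titles, posters = [], []
    for item in pred_items[:top_k]:
        row = metadata.get(item)
        if row is None:
            continue  # no metadata for this item; skip it
        titles.append(str(row["title"]))
        # Fall back to the placeholder path when the poster link is missing or empty.
        posters.append(row.get("poster_link") or PLACEHOLDER)
    return titles, posters


# Hypothetical metadata, standing in for rows loaded from the item CSV.
metadata = {
    0: {"title": "Game A", "poster_link": "a.jpg"},
    2: {"title": "Game C", "poster_link": ""},
}
titles, posters = lookup_titles_posters([2, 0, 1], metadata)
```

Keeping the placeholder as a path string rather than an opened image means both returned lists hold plain strings, which is what the final `str(x)` conversion in the function appears to expect.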

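🤔 Points to discuss in the code

The inference function and inference_epoch were merged, since combining them seemed less confusing.

Why? The code is not very long, and the item_encoder input is duplicated between the two, so there seems to be no need to keep them as separate functions.

CSV file → needed by the inference_ function

Check the column names (titles, images)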