ma-kjh (-) - velog

Stand on the shoulders of giants and look out over a wider world - Isaac Newton

[LLM sampling] token prediction

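When an LLM (large language model) produces its output at the last token position, various sampling techniques can be used. These techniques control the model's output so that the generated text reads more naturally or has particular characteristics. Below is a detailed explanation of the main sampling techniques, with examples.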
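As a rough illustration of what such sampling techniques do, here is a minimal pure-Python sketch of greedy decoding, temperature scaling, and top-k sampling. The logits and the helper names (`softmax`, `sample_top_k`) are made up for this example and are not taken from the post.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_top_k(logits, k, temperature=1.0, rng=random):
    """Keep the k highest-scoring tokens, renormalize, and sample one token id."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1, -1.0]           # hypothetical scores for a 4-token vocabulary
greedy = max(range(len(logits)), key=lambda i: logits[i])   # always picks the argmax
sampled = sample_top_k(logits, k=2, temperature=0.7, rng=random.Random(0))
print(greedy, sampled)
```

Greedy decoding is deterministic, while top-k sampling can only ever return one of the k most likely token ids.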

LLM

January 20, 2025

0 comments

[metric] LLM evaluation metrics

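ROUGE, BLEU. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a metric for evaluating the quality of natural language generation tasks such as automatic summarization and machine translation. ROUGE mainly measures, between the generated text and the reference text, how many n…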
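Assuming the truncated sentence refers to n-gram overlap, a minimal sketch of ROUGE-N recall could look like the following. Whitespace tokenization and the function names are simplifications chosen for this example; real ROUGE implementations add stemming, precision, and F-scores.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate (clipped counts)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total

reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))  # 5 of 6 reference unigrams match
print(rouge_n_recall(candidate, reference, n=2))  # 3 of 5 reference bigrams match
```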

LLM evaluation

January 13, 2025

0 comments

[metric] Expected Calibration Error (ECE)

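Expected Calibration Error (ECE) is a metric that measures the gap between the probabilities a model predicts and its actual accuracy. It evaluates the model's confidence by checking whether, when the model predicts "correct with probability 0.8", it is actually correct about 80% of the time. $$ECE…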
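The post's formula is cut off, so the sketch below uses the commonly cited binned definition (the weighted average of |accuracy - confidence| over confidence bins). The equal-width bins and the bin count of 10 are assumptions of this example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin size / N) * |bin accuracy - bin mean confidence|."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)   # a confidence of 1.0 goes in the last bin
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece

# Well calibrated: predicts 0.8 and is right 4 times out of 5.
calibrated = expected_calibration_error([0.8] * 5, [1, 1, 1, 1, 0])
# Overconfident: predicts 0.9 but is right only 1 time out of 4.
overconfident = expected_calibration_error([0.9] * 4, [1, 0, 0, 0])
print(calibrated, overconfident)
```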

metric

January 4, 2025

0 comments

[Pytorch]args, kwargs

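In Python, *args and **kwargs are syntax for accepting variable-length arguments when defining a function. *args receives any number of positional arguments as a tuple. When a function or method is called, the number of arguments can vary, which gives flexibility…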
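A short sketch of both directions: collecting arguments in a definition and unpacking them at a call site. The function names are invented for illustration.

```python
def describe(*args, **kwargs):
    """Collect extra positional arguments in a tuple and keyword arguments in a dict."""
    return args, kwargs

positional, keyword = describe(1, 2, lr=0.01)

def add(a, b, c=0):
    return a + b + c

values = (1, 2)
options = {"c": 3}
total = add(*values, **options)   # unpacks to add(1, 2, c=3)
print(positional, keyword, total)
```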

PyTorch

December 23, 2024

0 comments

[Pytorch]CrossEntropyLoss(weight)

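In CrossEntropyLoss, the weight is not multiplied into the softmax probabilities; it is multiplied into the final loss value. In other words, the weight scales each sample's log-probability term to adjust the loss. $$\text{Loss} = -\frac{1}{N} \sum_{i=1}^N \text…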
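To make the role of the weight concrete, here is a pure-Python sketch of a class-weighted cross entropy. It deliberately avoids calling PyTorch. The normalization follows what the PyTorch documentation describes for `reduction='mean'`, where the weighted sum is divided by the sum of the selected class weights rather than by N; the logits and weights below are made up.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def weighted_cross_entropy(batch_logits, targets, weight):
    """Scale each sample's negative log-probability by the weight of its target class."""
    numerator, denominator = 0.0, 0.0
    for logits, y in zip(batch_logits, targets):
        numerator += -weight[y] * log_softmax(logits)[y]
        denominator += weight[y]          # weighted mean: divide by the summed weights
    return numerator / denominator

batch_logits = [[2.0, 0.0], [0.0, 1.0]]   # hypothetical logits for two samples
targets = [0, 1]
uniform = weighted_cross_entropy(batch_logits, targets, [1.0, 1.0])
skewed = weighted_cross_entropy(batch_logits, targets, [1.0, 3.0])
print(uniform, skewed)
```

With uniform weights this reduces to the ordinary mean cross entropy; raising the weight of class 1 pulls the result toward the loss of the class-1 sample.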

PyTorch

December 23, 2024

0 comments

[Pytorch]torch.contiguous()

contiguous() converts a PyTorch tensor's memory layout to a contiguous one. If the tensor does not have a contiguous memory layout, it creates and returns a new tensor that does. By default, PyTorch tensors store their data…
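A small sketch of when this matters, assuming PyTorch is installed. Note that `contiguous()` is a tensor method. A transposed tensor is a view over the same storage with swapped strides, so it is not contiguous until it is copied.

```python
import torch

x = torch.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]], strides (3, 1)
y = x.t()                           # transposed view: same storage, strides (1, 3)
z = y.contiguous()                  # copies the data into a fresh row-major layout

try:
    y.view(-1)                      # view() needs compatible strides and fails here
    view_failed = False
except RuntimeError:
    view_failed = True

flat = z.view(-1)                   # works after contiguous()
print(y.is_contiguous(), z.is_contiguous(), view_failed, flat.tolist())
```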

PyTorch

December 23, 2024

0 comments

[Pytorch]torch.stack

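torch.stack combines a list (or other sequence) of tensors into a single torch.Tensor by stacking the inputs along a new dimension. torch.stack joins the tensors along the given new axis (dim). In other words, the elements of the input tensor list must have the same size…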
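A brief sketch, assuming PyTorch is installed, that contrasts `torch.stack` (adds a new dimension) with `torch.cat` (joins along an existing one). The tensors are arbitrary examples.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

s0 = torch.stack([a, b], dim=0)   # new leading axis: shape (2, 3)
s1 = torch.stack([a, b], dim=1)   # new trailing axis: shape (3, 2)
c = torch.cat([a, b], dim=0)      # no new axis: shape (6,)

print(s0.shape, s1.shape, c.shape)
```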

December 23, 2024

0 comments

[Tokenizer]token_embedding

When training an NLP model, the tokenizer's output is a bundle of tokenized text (not a text embedding). At the very front of an LLM (large language model), the tokenizer's output (token IDs) is used as the input, and the embedding (Embeddin…
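The distinction between token IDs and embeddings can be sketched with a toy lookup table. The vocabulary, embedding size, and random values are all hypothetical; a real model learns the table during training.

```python
import random

vocab = {"<bos>": 0, "hello": 1, "world": 2}   # hypothetical 3-token vocabulary
embed_dim = 4
rng = random.Random(0)
# One row of floats per token id; randomly initialized here for illustration.
embedding_table = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in vocab]

def tokenize(text):
    """Tokenizer output: a list of integer token IDs, not vectors."""
    return [vocab["<bos>"]] + [vocab[word] for word in text.split()]

def embed(token_ids):
    """Embedding layer: look up one vector per token id."""
    return [embedding_table[i] for i in token_ids]

token_ids = tokenize("hello world")
vectors = embed(token_ids)
print(token_ids, len(vectors), len(vectors[0]))
```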

LLM tokenizer

December 23, 2024

0 comments

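[Pytorch]super(class, self).__init__()

When reading code, you sometimes see super(class_name, self).__init__() used in a class definition. Why is it needed? It lets you reuse the initialization logic already implemented in the parent class. This reduces code duplication and makes maintenance…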
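The effect is easy to see with plain classes. `Base`, `WithSuper`, and `WithoutSuper` are hypothetical names used only for this sketch; the same pattern applies when subclassing a framework class whose initializer sets up internal state.

```python
class Base:
    def __init__(self):
        self.registry = {}          # state that subclasses rely on

class WithSuper(Base):
    def __init__(self, size):
        super(WithSuper, self).__init__()   # same as super().__init__() in Python 3
        self.size = size

class WithoutSuper(Base):
    def __init__(self, size):
        self.size = size            # Base.__init__ never runs, so registry is missing

good = WithSuper(3)
bad = WithoutSuper(3)
print(hasattr(good, "registry"), hasattr(bad, "registry"))
```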

PyTorch

December 23, 2024

0 comments

Nobel Prize in Physics 2024 Interview

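Translated by GPT. "How could he be sure it wasn't a prank call?" Geoffrey Hinton, 2024 Nobel laureate in physics, received a call from Stockholm early in the morning in a hotel room in California, and several voices with Swedish accents that his Nobel Prize in Physics was real…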

October 13, 2024

0 comments

[vLLM] GPU utilization

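ref: https://sjkoding.tistory.com/91 (huge thanks). The link above explains this in detail. vLLM is a library designed for efficient inference with large language models. The caching of data that is referenced repeatedly during model inference…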

October 10, 2024

0 comments

[conda] Jupyter kernel

python -m ipykernel install --user --name VIRTUAL_ENV_NAME --display-name "KERNEL_DISPLAY_NAME"

October 10, 2024

0 comments

[llama3/llama/generation.py][class Llama] def sample_top_p

In the context of the sample_top_p function you've provided, "cumulative probability mass" refers to the sum of probabilities for a sequence of tokens
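A pure-Python sketch of the idea behind "cumulative probability mass" in top-p (nucleus) sampling. This is not the repository's code; it assumes the common rule that a token is dropped once the mass of the tokens ranked before it already exceeds p, and the probabilities are made up.

```python
import random

def nucleus(probs, p):
    """Return token ids, most likely first, kept while the mass before them is <= p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        if cumulative > p:          # earlier tokens already cover more than p
            break
        kept.append(i)
        cumulative += probs[i]
    return kept

def sample_top_p(probs, p, rng=random):
    """Sample one token id from the renormalized nucleus."""
    kept = nucleus(probs, p)
    weights = [probs[i] for i in kept]   # choices() renormalizes the weights
    return rng.choices(kept, weights=weights, k=1)[0]

probs = [0.5, 0.3, 0.15, 0.05]           # hypothetical next-token distribution
kept = nucleus(probs, p=0.7)             # 0.5, then 0.5 + 0.3 = 0.8 > 0.7 stops the loop
token = sample_top_p(probs, p=0.7, rng=random.Random(0))
print(kept, token)
```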

August 30, 2024

0 comments

[llama3/llama/generation.py][class Llama] def text_completion

bos=True: Ensures the BOS token is added, marking the start of each prompt. eos=False: Ensures the EOS token is not added, allowing the model to contin…
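The effect of the two flags can be sketched with a toy encoder. The special-token ids and the word-to-id mapping below are hypothetical stand-ins and do not reflect the real Llama tokenizer.

```python
BOS_ID, EOS_ID = 1, 2   # hypothetical special-token ids

def encode(text, bos, eos):
    """Toy encoder: maps each word to a fake id >= 3 and optionally adds BOS/EOS."""
    ids = [3 + sum(map(ord, word)) % 100 for word in text.split()]
    if bos:
        ids.insert(0, BOS_ID)   # mark the start of the prompt
    if eos:
        ids.append(EOS_ID)      # mark the sequence as finished
    return ids

prompt_ids = encode("once upon a time", bos=True, eos=False)
print(prompt_ids)
```

Leaving EOS off keeps the prompt open-ended, so generation can continue from the last prompt token.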

LLM

August 30, 2024

0 comments

[llama3/llama/generation.py][class Llama] def generate

A function that generates text sequences. It takes prompts as input and produces text. prompt_tokens (List[List[int]]): a list of tokenized prompts. Each prompt is a list of integer…

August 28, 2024

0 comments

[llama3/llama/tokenizer.py] class ChatFormat

The list.extend() method in Python is used to extend a list by appending all the elements from another iterable (such as another list, tuple, string,
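A quick sketch of how `extend` differs from `append`, using arbitrary values:

```python
tokens = [1, 2]
tokens.extend([3, 4])     # adds each element of the list
tokens.extend((5,))       # any iterable works, including tuples
tokens.extend("ab")       # a string is iterated character by character

nested = [1, 2]
nested.append([3, 4])     # append adds the list itself as a single element

print(tokens)             # [1, 2, 3, 4, 5, 'a', 'b']
print(nested)             # [1, 2, [3, 4]]
```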

LLM

August 28, 2024

0 comments

[llama3/llama/model.py][class TransformerBlock] def __init__, def forward

RMSNorm (Root Mean Square Normalization) is a normalization technique used in the architecture of large language models like LLaMA (Large Language Mod
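A minimal pure-Python sketch of RMSNorm on a single vector, assuming the usual formulation: divide by the root mean square of the elements (with a small eps for stability) and multiply by a learned per-element weight. The input values are arbitrary.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Scale x by 1 / sqrt(mean(x^2) + eps), then apply the per-element weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * scale for v, w in zip(x, weight)]

x = [3.0, 4.0]
out = rms_norm(x, weight=[1.0, 1.0])
out_rms = math.sqrt(sum(v * v for v in out) / len(out))
print(out, out_rms)   # the output has a root mean square close to 1
```

Unlike LayerNorm, no mean is subtracted; only the scale of the vector is normalized.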

LLM LLaMA python transformer

August 28, 2024

0 comments

[llama3/llama/model.py][class Transformer] def __init__ and def forward

In [llama3/llama/generation.py def build](https://velog.io/@ma-kjh/llama3llamageneration.pyclass-Llama-def-build), the Transformer…

LLM LLaMA transformer

August 27, 2024

0 comments

[llama3/llama/generation.py][class Llama] def build def __init__

Let's look at class Llama. The build function loads a model checkpoint and initializes it to build a Llama instance. build is the first method defined. Args: ckpt_dir (str): the checkpoint file is…

LLM LLaMA code python

August 27, 2024

0 comments

[typing] Optional

What does code like Optional[int] = None mean? Optional comes from the typing module. Its use relates to a few aspects of type hinting and default values. Type Hinting: The Optional type is use…
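A short sketch of the pattern: `Optional[int]` means "an int or None", and `= None` supplies the default. The `truncate` function is a made-up example.

```python
from typing import Optional, Union

def truncate(text: str, max_len: Optional[int] = None) -> str:
    """Return text unchanged when max_len is None, otherwise cut it to max_len characters."""
    if max_len is None:
        return text
    return text[:max_len]

print(truncate("hello"))        # hello
print(truncate("hello", 2))     # he
print(Optional[int] == Union[int, None])   # True: Optional[X] is shorthand for Union[X, None]
```

The type hint alone does not make the argument optional at call time; the `= None` default does that.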

python

August 27, 2024

0 comments
