from typing import List

# Tokenizer, Message, and Dialog are defined elsewhere (not shown here).


class ChatFormat:
    def __init__(self, tokenizer: Tokenizer):
        self.tokenizer = tokenizer

    def encode_header(self, message: Message) -> List[int]:
        tokens = []
        tokens.append(self.tokenizer.special_tokens["<|start_header_id|>"])
        # message appears to be a dictionary; its "role" holds values such as "user" or "assistant".
        tokens.extend(self.tokenizer.encode(message["role"], bos=False, eos=False))
        tokens.append(self.tokenizer.special_tokens["<|end_header_id|>"])
        tokens.extend(self.tokenizer.encode("\n\n", bos=False, eos=False))
        return tokens

    def encode_message(self, message: Message) -> List[int]:
        tokens = self.encode_header(message)
        tokens.extend(
            self.tokenizer.encode(message["content"].strip(), bos=False, eos=False)
        )
        tokens.append(self.tokenizer.special_tokens["<|eot_id|>"])  # end of turn
        return tokens

    def encode_dialog_prompt(self, dialog: Dialog) -> List[int]:  # What is Dialog? It is iterated below as a sequence of messages.
        tokens = []
        tokens.append(self.tokenizer.special_tokens["<|begin_of_text|>"])
        for message in dialog:
            tokens.extend(self.encode_message(message))
        # Add the start of an assistant message for the model to complete.
        tokens.extend(self.encode_header({"role": "assistant", "content": ""}))
        return tokens
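To make the prompt layout easier to see, the sketch below builds the same structure as plain text instead of token IDs. It is an illustration only: render_header, render_message, and render_dialog_prompt are hypothetical helpers that mirror the three encode_* methods, and the Message and Dialog definitions are assumptions inferred from how the class indexes message["role"] and message["content"] and loops over dialog.

```python
from typing import List, TypedDict


class Message(TypedDict):
    # Assumed shape: the class reads message["role"] and message["content"].
    role: str
    content: str


# Assumed: a dialog is an ordered list of messages.
Dialog = List[Message]


def render_header(role: str) -> str:
    # Mirrors encode_header: start marker, role, end marker, blank line.
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def render_message(message: Message) -> str:
    # Mirrors encode_message: header, stripped content, end-of-turn marker.
    return render_header(message["role"]) + message["content"].strip() + "<|eot_id|>"


def render_dialog_prompt(dialog: Dialog) -> str:
    # Mirrors encode_dialog_prompt: begin marker, every message,
    # then an open assistant header for the model to complete.
    parts = ["<|begin_of_text|>"]
    for message in dialog:
        parts.append(render_message(message))
    parts.append(render_header("assistant"))
    return "".join(parts)


dialog: Dialog = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "  Hello!  "},
]
prompt = render_dialog_prompt(dialog)
print(prompt)
```

The printed prompt ends with an assistant header and no <|eot_id|>, which is what leaves the assistant turn open for generation.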
The list.extend() method in Python extends a list by appending all the elements from another iterable (such as another list, tuple, or string) to the end of the list. It modifies the original list in place and increases its length by the number of elements in the iterable.
list.extend(iterable)
The extend() method iterates over the elements in the provided iterable and appends each element to the end of the original list. It is similar to the += operator when used with lists but is often more explicit and readable.
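A short sketch of the comparison with +=, using made-up variable names: both extend() and += change the existing list object, whereas plain + builds a new list and leaves the original untouched.

```python
a = [1, 2]
alias = a                 # second name for the same list object

a.extend([3, 4])          # mutates the list in place
a += (5, 6)               # += on a list also mutates in place and accepts any iterable

result = a.extend([7])    # extend() itself returns None

b = [1, 2]
alias_b = b
b = b + [3]               # plain + creates a new list and rebinds b

print(a, alias is a, result)   # the alias still refers to the same, now longer, list
print(b, alias_b)              # alias_b still holds the original two elements
```

Because extend() returns None, writing a = a.extend(...) replaces the list with None, which is a common slip.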
# Example 1: Extending a list with another list
fruits = ['apple', 'banana', 'cherry']
additional_fruits = ['orange', 'grape']
fruits.extend(additional_fruits)
print(fruits)
Output:
['apple', 'banana', 'cherry', 'orange', 'grape']
In this example, the fruits list is extended by appending the elements of additional_fruits to the end.
# Example 2: Extending a list with a string
letters = ['a', 'b', 'c']
letters.extend('def')
print(letters)
Output:
['a', 'b', 'c', 'd', 'e', 'f']
Here, the letters list is extended by appending each character in the string 'def' to the end.
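Strings are only one case. As a further sketch with illustrative values, extend() accepts any iterable, including tuples, ranges, generators, and dictionaries (which iterate over their keys), and raises TypeError for a non-iterable argument.

```python
items = [0]
items.extend((1, 2))                  # tuple
items.extend(range(3, 5))             # range yields 3 and 4
items.extend(n * n for n in (3, 4))   # generator yields 9 and 16
items.extend({"x": 1, "y": 2})        # a dict iterates over its keys
print(items)

try:
    items.extend(5)                   # an int is not iterable
    raised = False
except TypeError:
    raised = True
print(raised)
```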
extend() modifies the original list in place; it returns None rather than a new list.
The iterable passed to extend() can be any iterable (list, tuple, string, etc.).
Unlike append(), which adds its argument as a single element (which could be another list), extend() adds each element from the iterable to the list.
extend() versus append():
append() adds its argument as a single element to the end of the list. If you pass a list to append(), the entire list is added as a single element.
extend() adds each element of the iterable to the list. If you pass a list to extend(), each element of that list is added to the original list.
Example:
numbers = [1, 2, 3]
numbers.append([4, 5])
print(numbers) # Output: [1, 2, 3, [4, 5]]
numbers.extend([6, 7])
print(numbers) # Output: [1, 2, 3, [4, 5], 6, 7]
In the first call, append() adds the entire list [4, 5] as a single element, while in the second call, extend() adds each element (6 and 7) separately to the list.
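This distinction explains the mix of calls in the ChatFormat class at the top of these notes: a special token is a single integer, so it is appended, while an encode() call returns a list of integers, so it is added with extend(). The sketch below uses a hypothetical toy_encode function and made-up token IDs in place of the real tokenizer.

```python
from typing import List

# Hypothetical token IDs, chosen only for illustration.
START_HEADER_ID = 1001
END_HEADER_ID = 1002


def toy_encode(text: str) -> List[int]:
    """Stand-in for tokenizer.encode(): one made-up ID per character."""
    return [ord(ch) for ch in text]


# Pattern used in the class: append single IDs, extend with lists of IDs.
tokens = []
tokens.append(START_HEADER_ID)
tokens.extend(toy_encode("user"))
tokens.append(END_HEADER_ID)
print(tokens)   # a flat list of integers

# Using append() for the encoded text would nest a list inside the list.
nested = []
nested.append(START_HEADER_ID)
nested.append(toy_encode("user"))
print(nested)   # the second element is itself a list
```

A flat list of integers is what a model input needs, which is why the nested form would be a bug there.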