[Rust] Traits and Generics

고승우 · August 2, 2024

Rust supports polymorphism with two related features: traits and generics.

Trait Object

Most often, a trait represents a capability: something a type can do.

use std::io::Write;
let mut buf: Vec<u8> = vec![];
let writer: dyn Write = buf; // error: `Write` does not have a constant size

The size of a variable must be known at compile time, but types that implement Write can be any size. A reference, on the other hand, always has a known size, and in Rust references are explicit, so we can fix the code by holding the writer by reference:

let mut buf: Vec<u8> = vec![];
let writer: &mut dyn Write = &mut buf; // ok

A reference to a trait type, like writer, is called a trait object.

Trait object layout

In memory, a trait object is a fat pointer consisting of a pointer to the value, plus a pointer to a table representing that value’s type. Each trait object takes up two machine words.

In Rust, as in C++, the vtable is generated once, at compile time, and shared by all objects of the same type. The language automatically uses the vtable when you call a method of a trait object, to determine which implementation to call. The struct itself contains nothing but its fields. This way, a struct can implement dozens of traits without containing dozens of vptrs.

let mut local_file = File::create("hello.txt")?;
let w: Box<dyn Write> = Box::new(local_file);

Box<dyn Write>, like &mut dyn Write, is a fat pointer: it contains the address of the writer itself and the address of the vtable.
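
To make the "two machine words" claim concrete, here is a small sketch that compares pointer sizes with std::mem::size_of. The helper names thin_size and fat_size are made up for this example, and the exact layout is what current compilers do in practice rather than something this post's source spells out.

```rust
use std::io::Write;
use std::mem::size_of;

/// Size in bytes of an ordinary (thin) reference to a sized value.
fn thin_size() -> usize {
    size_of::<&mut Vec<u8>>()
}

/// Size in bytes of a trait-object reference: data pointer + vtable pointer.
fn fat_size() -> usize {
    size_of::<&mut dyn Write>()
}

fn main() {
    println!("thin: {} bytes, fat: {} bytes", thin_size(), fat_size());
    // Box<dyn Write> is also a fat pointer, so it has the same size.
    assert_eq!(size_of::<Box<dyn Write>>(), fat_size());
}
```

On a 64-bit target this should report 8 bytes for the thin reference and 16 for the trait object.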

Generic Functions and Type Parameters

Let's compare these two functions:

Function with Trait Object

Uses a trait object (&mut dyn Write). This means out can be any type that implements Write, but calls on it are dynamically dispatched: the exact type is known only at run time, so each method call goes through the vtable, which adds a small performance overhead.

fn say_hello(out: &mut dyn Write) // plain function

Trait objects are the right choice whenever you need a collection of values of mixed types, all together. Since Vegetable values can be all different sizes, we can’t ask Rust for a Vec<dyn Vegetable>:

struct Salad {
    veggies: Vec<dyn Vegetable> // error: `dyn Vegetable` does
}                               // not have a constant size

Trait objects are the solution:

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>
}
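
As a rough sketch of how such a salad could be used, the example below fills in the missing pieces. The Vegetable trait's name method and the Carrot and Lettuce types are hypothetical; they are not defined anywhere in these notes.

```rust
/// Hypothetical trait for this sketch.
trait Vegetable {
    fn name(&self) -> String;
}

struct Carrot;
struct Lettuce {
    leaves: u32,
}

impl Vegetable for Carrot {
    fn name(&self) -> String {
        "carrot".to_string()
    }
}

impl Vegetable for Lettuce {
    fn name(&self) -> String {
        format!("lettuce ({} leaves)", self.leaves)
    }
}

struct Salad {
    // Each Box is a fat pointer, so every element has the same size.
    veggies: Vec<Box<dyn Vegetable>>,
}

/// Collect ingredient names; each name() call is dispatched through a vtable.
fn ingredients(salad: &Salad) -> Vec<String> {
    salad.veggies.iter().map(|v| v.name()).collect()
}

fn main() {
    let salad = Salad {
        veggies: vec![Box::new(Carrot), Box::new(Lettuce { leaves: 12 })],
    };
    println!("{:?}", ingredients(&salad));
}
```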

Generic Function

Uses generics (W: Write). This means out is of a specific type that implements Write, and it's statically dispatched. The exact type is determined at compile time.

fn say_hello<W: Write>(out: &mut W) // generic function
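
Here is a minimal sketch that puts the two signatures side by side with bodies. The names say_hello_dyn and say_hello_generic, and the bodies themselves, are assumptions made so both versions can live in one file.

```rust
use std::io::{self, Write};

/// Dynamic dispatch: `out` is a trait object, methods are found via the vtable.
fn say_hello_dyn(out: &mut dyn Write) -> io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

/// Static dispatch: the compiler generates one copy per concrete type `W`.
fn say_hello_generic<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

fn main() -> io::Result<()> {
    let mut bytes: Vec<u8> = Vec::new();
    say_hello_dyn(&mut bytes)?; // &mut Vec<u8> coerces to &mut dyn Write
    say_hello_generic(&mut bytes)?; // W = Vec<u8>
    say_hello_generic(&mut io::stdout())?; // W = Stdout
    Ok(())
}
```

From the caller's side the two look almost identical; the difference is in how the compiler resolves the method calls inside.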

Generic functions can have multiple type parameters, and the bounds can get to be so long that they are hard on the eyes. Rust provides an alternative syntax using the keyword where:

fn run_query<M, R>(data: &DataSet, map: M, reduce: R) -> Results
where
    M: Mapper + Serialize,
    R: Reducer + Serialize
{ ... }

Comparing Trait object to generic type

Generics have three important advantages over trait objects, with the result that in Rust, generics are the more common choice.
1. Speed: the types are specified at compile time, so the compiler knows exactly which write method to call and needs no dynamic dispatch.
2. Trait object availability: not every trait can be made into a trait object. Traits that use Self as a type or that have type-associated functions (both covered below) work only with generics.
3. Easy to bound a generic type parameter with several traits at once: trait objects can’t do this. Types like &mut (dyn Debug + Hash + Eq) aren’t supported in Rust.

Default Methods

trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;
    fn write_all(&mut self, buf: &[u8]) -> Result<()> { 
        let mut bytes_written = 0;
        while bytes_written < buf.len() {
            bytes_written += self.write(&buf[bytes_written..])?;
        }
        Ok(()) 
    }
    ... 
}

In this example, the write and flush methods are the basic methods that every writer must implement. A writer may also implement write_all, but if not, the default implementation shown earlier will be used.
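
To see the default method at work, here is a sketch of a writer that implements only write and flush. ChunkedWriter is a hypothetical type invented for this example; it deliberately accepts only a few bytes per call so that write_all has to loop.

```rust
use std::io::{self, Write};

/// Hypothetical writer that accepts at most 4 bytes per `write` call.
struct ChunkedWriter {
    data: Vec<u8>,
    calls: usize,
}

impl Write for ChunkedWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(4);
        self.data.extend_from_slice(&buf[..n]);
        self.calls += 1;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
    // write_all is not defined here, so the trait's default version is used.
}

fn main() -> io::Result<()> {
    let mut w = ChunkedWriter { data: Vec::new(), calls: 0 };
    w.write_all(b"hello world")?; // 11 bytes, so write is called repeatedly
    println!("{} bytes in {} write calls", w.data.len(), w.calls);
    Ok(())
}
```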

Traits and Other People’s Types

A trait that adds a method to an existing type is called an extension trait. Implementing the trait for all writers with a generic impl block adds the method to every Rust writer:

/// You can write HTML to any std::io writer.
impl<W: Write> WriteHtml for W {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> { ... }
}

The line impl<W: Write> WriteHtml for W means “for every type W that implements Write, here’s an implementation of WriteHtml for W.”
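
The snippet above leaves out the trait definition and the document type, so here is a self-contained sketch. The fields of HtmlDocument and the body of write_html are assumptions made purely for illustration.

```rust
use std::io::{self, Write};

/// Hypothetical document type: just a title and a body for this sketch.
struct HtmlDocument {
    title: String,
    body: String,
}

/// The extension trait: adds `write_html` to writers.
trait WriteHtml {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()>;
}

/// Blanket implementation for every type that implements Write.
impl<W: Write> WriteHtml for W {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> {
        write!(self, "<html><head><title>{}</title></head>", html.title)?;
        write!(self, "<body>{}</body></html>", html.body)
    }
}

fn main() -> io::Result<()> {
    let doc = HtmlDocument {
        title: "Hi".to_string(),
        body: "<p>hello</p>".to_string(),
    };
    let mut buf: Vec<u8> = Vec::new();
    buf.write_html(&doc)?; // Vec<u8> gains the method
    io::stdout().write_html(&doc)?; // and so does Stdout
    println!();
    Ok(())
}
```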

Orphan Rule

When you implement a trait, either the trait or the type must be new in the current crate. This is called the orphan rule, and it helps Rust ensure that trait implementations are unique.

Self in Traits

A trait can use the keyword Self as a type:

impl Spliceable for CherryTree {
    fn splice(&self, other: &Self) -> Self { ... }
}
impl Spliceable for Mammoth {
    fn splice(&self, other: &Self) -> Self { ... }
}

Inside the first impl, Self is simply an alias for CherryTree, and in the second, it’s an alias for Mammoth.

A trait that uses the Self type is incompatible with trait objects:

// error: the trait `Spliceable` cannot be made into an object
fn splice_anything(left: &dyn Spliceable, right: &dyn Spliceable) {
    let combo = left.splice(right);
    // ...
}

Rust rejects this code because it has no way to type-check the call left.splice(right). The whole point of trait objects is that the type isn’t known until run time. Now, had we wanted genetically improbable splicing, we could have designed a trait-object-friendly trait:

pub trait MegaSpliceable {
    fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable>;
}
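
For contrast, the original Spliceable trait works fine with generics, because both arguments are then known to have the same concrete type. In the sketch below the trait definition is an assumed one (only the impls are shown in these notes), and the branches field is made up.

```rust
/// Assumed definition of the trait whose impls are shown earlier.
trait Spliceable {
    fn splice(&self, other: &Self) -> Self;
}

#[derive(Debug, PartialEq)]
struct CherryTree {
    branches: u32, // hypothetical field
}

impl Spliceable for CherryTree {
    fn splice(&self, other: &Self) -> Self {
        CherryTree { branches: self.branches + other.branches }
    }
}

/// Generic version: `left` and `right` are the same type S,
/// so the call can be type-checked at compile time.
fn splice_anything<S: Spliceable>(left: &S, right: &S) -> S {
    left.splice(right)
}

fn main() {
    let a = CherryTree { branches: 3 };
    let b = CherryTree { branches: 4 };
    println!("{:?}", splice_anything(&a, &b));
}
```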

Subtraits

We can declare that a trait is an extension of another trait:

/// Someone in the game world, either the player or some other
/// pixie, gargoyle, squirrel, ogre, etc.
trait Creature: Visible {
    fn position(&self) -> (i32, i32);
    fn facing(&self) -> Direction;
    ...
}

The phrase trait Creature: Visible means that all creatures are visible. Every type that implements Creature must also implement the Visible trait. It’s an error to implement Creature for a type without also implementing Visible. We say that Creature is a subtrait of Visible, and that Visible is Creature’s supertrait.
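
A practical consequence is that a Creature bound is enough to call Visible's methods. The sketch below uses a simplified Creature (no facing method) and a hypothetical hit_test method on Visible, since the real Visible trait is not shown in these notes.

```rust
/// Hypothetical supertrait for this sketch.
trait Visible {
    fn hit_test(&self, x: i32, y: i32) -> bool;
}

/// Every Creature must also be Visible.
trait Creature: Visible {
    fn position(&self) -> (i32, i32);
}

struct Squirrel {
    x: i32,
    y: i32,
}

impl Visible for Squirrel {
    fn hit_test(&self, x: i32, y: i32) -> bool {
        self.x == x && self.y == y
    }
}

impl Creature for Squirrel {
    fn position(&self) -> (i32, i32) {
        (self.x, self.y)
    }
}

/// The bound `C: Creature` alone lets us call Visible's methods too.
fn is_under_cursor<C: Creature>(c: &C, cursor: (i32, i32)) -> bool {
    c.hit_test(cursor.0, cursor.1)
}

fn main() {
    let s = Squirrel { x: 2, y: 5 };
    println!("{:?} {}", s.position(), is_under_cursor(&s, (2, 5)));
}
```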

Type-Associated Functions

Traits can include type-associated functions, Rust’s analog to static methods:

trait StringSet {
    /// Return a new empty set.
    fn new() -> Self;

    /// Return a set that contains all the strings in `strings`.
    fn from_slice(strings: &[&str]) -> Self;

    /// Find out if this set contains a particular `string`.
    fn contains(&self, string: &str) -> bool;

    /// Add a string to this set.
    fn add(&mut self, string: &str);
}

The first two, new() and from_slice(), don’t take a self argument. They serve as constructors. In nongeneric code, these functions can be called using :: syntax, just like any other type-associated function:

// Create set of hypothetical types that impl StringSet:
let mut my_set = MyStringSet::new();

In generic code, it’s the same, except the type is often a type variable, as in the call to S::new() shown here:

fn unknown_words<S: StringSet>(document: &[String], wordlist: &S) -> S {
    S::new()
}

Trait objects don’t support type-associated functions. If you want to use &dyn StringSet trait objects, you must change the trait, adding the bound where Self: Sized to each associated function that doesn’t take a self argument by reference:

trait StringSet {
    fn new() -> Self
    where
        Self: Sized;
    fn from_slice(strings: &[&str]) -> Self
    where
        Self: Sized;
    fn contains(&self, string: &str) -> bool;
    fn add(&mut self, string: &str);
}

You can then use a trait object like this:

// Suppose MyStringSet implements the StringSet trait.
let mut my_set: Box<dyn StringSet> = Box::new(MyStringSet::new());
my_set.add("world");
println!("{}", my_set.contains("world")); // true
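
Putting the pieces together, here is one way the whole thing might look. MyStringSet is a hypothetical implementation backed by a HashSet, and the body of unknown_words is an assumption about what such a function would do.

```rust
use std::collections::HashSet;

trait StringSet {
    fn new() -> Self
    where
        Self: Sized;
    fn from_slice(strings: &[&str]) -> Self
    where
        Self: Sized;
    fn contains(&self, string: &str) -> bool;
    fn add(&mut self, string: &str);
}

/// Hypothetical implementation backed by a HashSet.
struct MyStringSet {
    items: HashSet<String>,
}

impl StringSet for MyStringSet {
    fn new() -> Self {
        MyStringSet { items: HashSet::new() }
    }
    fn from_slice(strings: &[&str]) -> Self {
        MyStringSet { items: strings.iter().map(|s| s.to_string()).collect() }
    }
    fn contains(&self, string: &str) -> bool {
        self.items.contains(string)
    }
    fn add(&mut self, string: &str) {
        self.items.insert(string.to_string());
    }
}

/// Generic code can still call the constructors through the type parameter.
fn unknown_words<S: StringSet>(document: &[String], wordlist: &S) -> S {
    let mut unknowns = S::new();
    for word in document {
        if !wordlist.contains(word) {
            unknowns.add(word);
        }
    }
    unknowns
}

fn main() {
    let wordlist = MyStringSet::from_slice(&["hello", "world"]);
    let doc = vec!["hello".to_string(), "rust".to_string()];
    // The result can be boxed as a trait object thanks to the Sized bounds.
    let unknowns: Box<dyn StringSet> = Box::new(unknown_words(&doc, &wordlist));
    println!("{}", unknowns.contains("rust"));
}
```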

Associated Types (How Iterators Work)

Rust has a standard Iterator trait, defined like this:

pub trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
    ...
}

The first feature of this trait, type Item;, is an associated type. Each type that implements Iterator must specify what type of item it produces. next() returns an Option<Self::Item>: either Some(item), the next value in the sequence, or None when there are no more values to visit. The type is written as Self::Item, not just plain Item, because Item is a feature of each type of iterator, not a standalone type.

// (code from the std::env standard library module)
impl Iterator for Args {
    type Item = String;
    fn next(&mut self) -> Option<String> { ... }
    ... 
}
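
Implementing Iterator for your own type follows the same shape. Countdown below is a made-up type for illustration, not something from the standard library.

```rust
/// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // Each implementation chooses its own Item type.
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let value = self.remaining;
            self.remaining -= 1;
            Some(value)
        }
    }
}

fn main() {
    // A for loop calls next() until it returns None.
    for n in (Countdown { remaining: 3 }) {
        println!("{}", n);
    }
}
```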

Also, generic code can use associated types:

/// Loop over an iterator, storing the values in a new vector.
fn collect_into_vector<I: Iterator>(iter: I) -> Vec<I::Item> {
    let mut results = Vec::new();
    for value in iter {
        results.push(value);
    }
    results
}

We can place a bound on I::Item:

fn dump<I>(iter: I)
where I: Iterator, I::Item: Debug
{
    ...
}

Or, we could write, “I must be an iterator over String values”:

fn dump<I>(iter: I)
where I: Iterator<Item=String>
{
    ...
}

Iterator<Item=String> is itself a trait. If you think of Iterator as the set of all iterator types, then Iterator<Item=String> is a subset of Iterator: the set of iterator types that produce Strings. This syntax can be used anywhere the name of a trait can be used, including trait object types:

fn dump(iter: &mut dyn Iterator<Item=String>) {
    for (index, s) in iter.enumerate() {
        println!("{}: {:?}", index, s);
    }
}
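
To try both styles of bound, here is a sketch with two variants that return the formatted lines instead of printing them, which makes the result easy to inspect. The names describe and describe_strings are invented for this example.

```rust
use std::fmt::Debug;

/// Generic version: works for any iterator whose items implement Debug.
fn describe<I>(iter: I) -> Vec<String>
where
    I: Iterator,
    I::Item: Debug,
{
    iter.enumerate()
        .map(|(index, value)| format!("{}: {:?}", index, value))
        .collect()
}

/// Trait-object version: accepts only iterators over String values.
fn describe_strings(iter: &mut dyn Iterator<Item = String>) -> Vec<String> {
    iter.enumerate()
        .map(|(index, s)| format!("{}: {:?}", index, s))
        .collect()
}

fn main() {
    println!("{:?}", describe(vec![10, 20].into_iter()));
    let mut words = vec!["a".to_string(), "b".to_string()].into_iter();
    println!("{:?}", describe_strings(&mut words));
}
```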

Generic Traits (How Operator Overloading Works)

Multiplication in Rust uses this trait:

/// std::ops::Mul, the trait for types that support `*`.
pub trait Mul<RHS> {
    /// The resulting type after applying the `*` operator
    type Output;

    /// The method for the `*` operator
    fn mul(self, rhs: RHS) -> Self::Output;
}

Mul is a generic trait, and its instances Mul<f64>, Mul<String>, Mul<Size>, etc., are all different traits, just as min::<i32> and min::<String> are different functions and Vec<i32> and Vec<String> are different types. The definition above is slightly simplified; the real Mul trait looks like this:

pub trait Mul<RHS=Self> {
	...
}

The syntax RHS=Self means that RHS defaults to Self. In a bound, if I write where T: Mul, it means where T: Mul<T>.
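
Here is a short sketch of both uses of the type parameter. The Size struct and the square function are hypothetical: Size implements Mul<f64> explicitly, while the bound in square relies on the default, so T: Mul<Output = T> means T: Mul<T, Output = T>.

```rust
use std::ops::Mul;

/// Hypothetical type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Size {
    width: f64,
    height: f64,
}

/// Size * f64: the right-hand side type is given explicitly.
impl Mul<f64> for Size {
    type Output = Size;

    fn mul(self, rhs: f64) -> Size {
        Size { width: self.width * rhs, height: self.height * rhs }
    }
}

/// The right-hand side defaults to Self here, so this multiplies T by T.
fn square<T>(x: T) -> T
where
    T: Mul<Output = T> + Copy,
{
    x * x
}

fn main() {
    let s = Size { width: 2.0, height: 3.0 };
    println!("{:?}", s * 1.5);
    println!("{}", square(7));
    println!("{}", square(1.5_f64));
}
```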

impl Trait

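As you might imagine, combinations of many generic types can get messy. For example, a function that returns v.into_iter().chain(u.into_iter()).cycle() for two Vec<u8> arguments has the return type iter::Cycle<iter::Chain<IntoIter<u8>, IntoIter<u8>>>. We could easily replace this hairy return type with a trait object: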

fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> Box<dyn Iterator<Item=u8>> {
    Box::new(v.into_iter().chain(u.into_iter()).cycle())
}

With impl Trait we can avoid both the overhead of dynamic dispatch and the heap allocation that Box requires. An impl Trait return type names only the trait or traits that the returned value implements, and erases the concrete type from the signature:

fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> impl Iterator<Item=u8> {
    v.into_iter().chain(u.into_iter()).cycle()
}
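
Callers see the same behavior from both versions. In the sketch below the boxed variant is renamed cyclical_zip_boxed (a name chosen here) so the two can be compared in one file.

```rust
/// Trait-object version: heap allocation plus dynamic dispatch.
fn cyclical_zip_boxed(v: Vec<u8>, u: Vec<u8>) -> Box<dyn Iterator<Item = u8>> {
    Box::new(v.into_iter().chain(u.into_iter()).cycle())
}

/// impl Trait version: the concrete type is hidden but statically known.
fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> impl Iterator<Item = u8> {
    v.into_iter().chain(u.into_iter()).cycle()
}

fn main() {
    // The iterator never ends, so take a finite prefix.
    let a: Vec<u8> = cyclical_zip(vec![1, 2], vec![3]).take(5).collect();
    let b: Vec<u8> = cyclical_zip_boxed(vec![1, 2], vec![3]).take(5).collect();
    println!("{:?} {:?}", a, b);
}
```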

You might want to use different Shapes depending on a run-time value, like a string that a user enters. This doesn’t work with impl Shape as the return type:

trait Shape {
    fn new() -> Self;
    fn area(&self) -> f64;
}
fn make_shape(shape: &str) -> impl Shape {
    match shape {
        "circle" => Circle::new(),
        "triangle" => Triangle::new(), // error: incompatible types
        "rectangle" => Rectangle::new(),
    }
}

impl Trait is a form of static dispatch, so the compiler has to know the single concrete type being returned from the function at compile time in order to allocate the right amount of space on the stack and correctly access fields and methods on that type. Each match arm above returns a different type, so the function is rejected. Note also that Rust versions before 1.75 did not allow trait methods to use impl Trait return values.
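
When the shape really does depend on a run-time value, a trait object is the usual way out. The sketch below is one possible version: the struct fields, the fixed dimensions, and the Option return for unknown names are all assumptions, and the new constructor is dropped from the trait to keep it simple.

```rust
trait Shape {
    fn area(&self) -> f64;
}

// Hypothetical shapes with made-up fields.
struct Circle {
    radius: f64,
}
struct Rectangle {
    width: f64,
    height: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

/// Every arm now has the same type, Option<Box<dyn Shape>>.
fn make_shape(shape: &str) -> Option<Box<dyn Shape>> {
    match shape {
        "circle" => Some(Box::new(Circle { radius: 1.0 })),
        "rectangle" => Some(Box::new(Rectangle { width: 2.0, height: 3.0 })),
        _ => None,
    }
}

fn main() {
    for name in ["circle", "rectangle", "hexagon"] {
        match make_shape(name) {
            Some(s) => println!("{}: area {}", name, s.area()),
            None => println!("{}: unknown shape", name),
        }
    }
}
```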

The two functions below are equivalent, with one important exception:

fn print<T: Display>(val: T) {
    println!("{}", val);
}
fn print(val: impl Display) {
    println!("{}", val);
}

Using generics allows callers of the function to specify the type of the generic arguments, like print::<i32>(42), while using impl Trait does not.

Associated Consts

Like structs and enums, traits can have associated constants:

trait Greet {
    const GREETING: &'static str = "Hello";
    fn greet(&self) -> String;
}
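
An implementor can keep the default value or override it. Visitor and Regular below are hypothetical types used only to show both cases.

```rust
trait Greet {
    const GREETING: &'static str = "Hello";
    fn greet(&self) -> String;
}

// Hypothetical implementors.
struct Visitor {
    name: String,
}
struct Regular {
    name: String,
}

impl Greet for Visitor {
    // GREETING is not mentioned, so the default "Hello" applies.
    fn greet(&self) -> String {
        format!("{}, {}!", Self::GREETING, self.name)
    }
}

impl Greet for Regular {
    // Override the default constant for this type.
    const GREETING: &'static str = "Welcome back";

    fn greet(&self) -> String {
        format!("{}, {}!", Self::GREETING, self.name)
    }
}

fn main() {
    println!("{}", Visitor { name: "Ann".to_string() }.greet());
    println!("{}", Regular { name: "Bob".to_string() }.greet());
}
```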

Like associated types and functions, you can declare them but not give them a value:

trait Float {
    const ZERO: Self;
    const ONE: Self;
}

Then, implementors of the trait can define these values:

impl Float for f32 {
    const ZERO: f32 = 0.0;
    const ONE: f32 = 1.0;
}
impl Float for f64 {
    const ZERO: f64 = 0.0;
    const ONE: f64 = 1.0;
}

This allows you to write generic code that uses these values:

fn add_one<T: Float + Add<Output = T>>(value: T) -> T {
    value + T::ONE
}

Associated constants can’t be used with trait objects, since the compiler relies on type information about the implementation in order to pick the right value at compile time. Combined with a few operators, associated constants are enough to implement common mathematical functions like Fibonacci:

fn fib<T: Float + Add<Output = T>>(n: usize) -> T {
    match n {
        0 => T::ZERO,
        1 => T::ONE,
        n => fib::<T>(n - 1) + fib::<T>(n - 2),
    }
}
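
For completeness, here is the Fibonacci example assembled into one file with the Float trait and its two impls from above, plus a small main that picks the type with a turbofish or an annotation.

```rust
use std::ops::Add;

trait Float {
    const ZERO: Self;
    const ONE: Self;
}

impl Float for f32 {
    const ZERO: f32 = 0.0;
    const ONE: f32 = 1.0;
}

impl Float for f64 {
    const ZERO: f64 = 0.0;
    const ONE: f64 = 1.0;
}

/// Naive recursive Fibonacci; exponential time, fine for small n.
fn fib<T: Float + Add<Output = T>>(n: usize) -> T {
    match n {
        0 => T::ZERO,
        1 => T::ONE,
        n => fib::<T>(n - 1) + fib::<T>(n - 2),
    }
}

fn main() {
    // T cannot be inferred from the argument, so it is given explicitly.
    let a = fib::<f64>(10);
    let b: f32 = fib(7);
    println!("{} {}", a, b);
}
```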
