Rust supports polymorphism with two related features: traits and generics.
Most often, a trait represents a capability: something a type can do.
use std::io::Write;
let mut buf: Vec<u8> = vec![];
let writer: dyn Write = buf; // error: `Write` does not have a constant size
A variable’s size has to be known at compile time, and types that implement Write can be any size. In Rust, references are explicit, and a reference always has a known size, so we can fix the code like this:
let mut buf: Vec<u8> = vec![];
let writer: &mut dyn Write = &mut buf; // ok
A reference to a trait type, like writer, is called a trait object.
In memory, a trait object is a fat pointer consisting of a pointer to the value, plus a pointer to a table representing that value’s type. Each trait object takes up two machine words.
In Rust, as in C++, the vtable is generated once, at compile time, and shared by all objects of the same type. The language automatically uses the vtable when you call a method of a trait object, to determine which implementation to call. The struct itself contains nothing but its fields. This way, a struct can implement dozens of traits without containing dozens of vptrs.
use std::fs::File;
let local_file = File::create("hello.txt")?;
let w: Box<dyn Write> = Box::new(local_file);
Box<dyn Write>, like &mut dyn Write, is a fat pointer: it contains the address of the writer itself and the address of the vtable.
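The two-word layout can be checked with std::mem::size_of. The following is a minimal sketch; the helper function names are made up for illustration, and the sizes are printed rather than assumed so that it works on any pointer width.

```rust
use std::io::Write;
use std::mem::size_of;

/// Size of an ordinary (thin) reference: one machine word.
fn thin_pointer_size() -> usize {
    size_of::<&mut Vec<u8>>()
}

/// Size of a `&mut dyn Write` trait object: data pointer plus vtable pointer.
fn trait_object_size() -> usize {
    size_of::<&mut dyn Write>()
}

/// Size of a `Box<dyn Write>`, which is a fat pointer as well.
fn boxed_trait_object_size() -> usize {
    size_of::<Box<dyn Write>>()
}

fn main() {
    println!("machine word:   {} bytes", size_of::<usize>());
    println!("&mut Vec<u8>:   {} bytes", thin_pointer_size());
    println!("&mut dyn Write: {} bytes", trait_object_size());
    println!("Box<dyn Write>: {} bytes", boxed_trait_object_size());
}
```

Both trait object types should report twice the size of the thin reference.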
Let's compare these two functions.
The plain function takes a trait object (&mut dyn Write). This means out can be any type that implements Write, but method calls on it are dynamically dispatched: the exact type is only known at run time, so each call goes through the vtable, which adds some overhead.
fn say_hello(out: &mut dyn Write) // plain function
Trait objects are the right choice whenever you need a collection of values of mixed types, all together. Since Vegetable values can be all different sizes, we can’t ask Rust for a Vec<dyn Vegetable>:
struct Salad {
veggies: Vec<dyn Vegetable> // error: `dyn Vegetable` does
} // not have a constant size
Trait objects are the solution:
struct Salad {
veggies: Vec<Box<dyn Vegetable>>
}
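Here is a small sketch of how such a Salad could be filled and used. Only the Vec<Box<dyn Vegetable>> field comes from the notes above; the Vegetable method, the Carrot and Lettuce types, and the helper functions are hypothetical.

```rust
// `name`, `Carrot`, `Lettuce`, `ingredients`, and `make_salad` are
// illustrative assumptions, not part of the original example.
trait Vegetable {
    fn name(&self) -> String;
}

struct Carrot;

struct Lettuce {
    leaves: u32,
}

impl Vegetable for Carrot {
    fn name(&self) -> String {
        "carrot".to_string()
    }
}

impl Vegetable for Lettuce {
    fn name(&self) -> String {
        format!("lettuce with {} leaves", self.leaves)
    }
}

struct Salad {
    veggies: Vec<Box<dyn Vegetable>>,
}

impl Salad {
    /// Ask every vegetable for its name; each call is dispatched dynamically.
    fn ingredients(&self) -> Vec<String> {
        self.veggies.iter().map(|v| v.name()).collect()
    }
}

/// Values of different types (and sizes) live in the same vector.
fn make_salad() -> Salad {
    Salad {
        veggies: vec![Box::new(Carrot), Box::new(Lettuce { leaves: 12 })],
    }
}

fn main() {
    for item in make_salad().ingredients() {
        println!("{}", item);
    }
}
```

Each Box is the same size no matter what it points to, which is what makes the vector possible.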
The generic function takes a type parameter (W: Write). This means out is of a specific type that implements Write, and it's statically dispatched. The exact type is determined at compile time.
fn say_hello<W: Write>(out: &mut W) // generic function
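To make the comparison concrete, here is a sketch with both signatures given a body. The bodies and the _dyn and _generic suffixes are assumptions added so that the two versions can coexist in one file.

```rust
use std::io::{self, Write};

/// Dynamic dispatch: `out` is a trait object and calls go through its vtable.
fn say_hello_dyn(out: &mut dyn Write) -> io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

/// Static dispatch: the compiler generates a separate copy for each `W` used.
fn say_hello_generic<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"hello world\n")?;
    out.flush()
}

fn main() -> io::Result<()> {
    let mut a: Vec<u8> = vec![];
    let mut b: Vec<u8> = vec![];
    say_hello_dyn(&mut a)?;
    say_hello_generic(&mut b)?;
    assert_eq!(a, b); // same bytes, different dispatch mechanism

    // Both functions accept any writer, for example standard output.
    say_hello_generic(&mut io::stdout())
}
```

From the caller's side the two are used the same way; the difference is in how the compiler resolves the write_all and flush calls.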
Generic functions can have multiple type parameters, and the bounds can get to be so long that they are hard on the eyes. Rust provides an alternative syntax using the keyword where:
fn run_query<M, R>(data: &DataSet, map: M, reduce: R) -> Results
where
M: Mapper + Serialize,
R: Reducer + Serialize
{ ... }
Generics have three important advantages over trait objects, with the result that in Rust, generics are the more common choice.
1. Speed: because the types are specified at compile time, the compiler knows exactly which write method to call, so no dynamic dispatch is needed.
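2. Trait object availability: not every trait can support trait objects. Traits that use Self in method signatures or that have type-associated functions work with generics but not with trait objects.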
3. Multiple bounds: it is easy to bound a generic type parameter with several traits at once. Trait objects can’t do this: types like &mut (dyn Debug + Hash + Eq) aren’t supported in Rust.
trait Write {
fn write(&mut self, buf: &[u8]) -> Result<usize>;
fn flush(&mut self) -> Result<()>;
fn write_all(&mut self, buf: &[u8]) -> Result<()> {
let mut bytes_written = 0;
while bytes_written < buf.len() {
bytes_written += self.write(&buf[bytes_written..])?;
}
Ok(())
}
...
}
In this example, the write and flush methods are the basic methods that every writer must implement. A writer may also implement write_all, but if not, the default implementation shown earlier will be used.
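As a sketch of default methods in action, the hypothetical ChunkedSink below implements only write and flush from std::io::Write and accepts a few bytes per call. The write_all it inherits keeps calling write until the whole buffer is consumed.

```rust
use std::io::{self, Write};

/// Hypothetical writer that accepts at most `chunk` bytes per `write` call.
struct ChunkedSink {
    chunk: usize,
    data: Vec<u8>,
    calls: usize,
}

impl Write for ChunkedSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.chunk);
        self.data.extend_from_slice(&buf[..n]);
        self.calls += 1;
        Ok(n) // report a partial write; the caller must retry with the rest
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }

    // No `write_all` here: the default method from the trait is used.
}

fn write_greeting() -> io::Result<ChunkedSink> {
    let mut sink = ChunkedSink { chunk: 4, data: vec![], calls: 0 };
    sink.write_all(b"hello world")?;
    Ok(sink)
}

fn main() -> io::Result<()> {
    let sink = write_greeting()?;
    println!("{} bytes in {} write calls", sink.data.len(), sink.calls);
    Ok(())
}
```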
A trait whose purpose is to add a method to existing types is called an extension trait. A generic impl block can even add the method to a whole family of types at once, such as all Rust writers:
/// You can write HTML to any std::io writer.
impl<W: Write> WriteHtml for W {
fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> { ... }
}
The line impl<W: Write> WriteHtml for W means “for every type W that implements Write, here’s an implementation of WriteHtml for W.”
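The same pattern can be tried out with a simpler method. This is a sketch only: WriteLine and write_line are hypothetical names standing in for WriteHtml, whose HtmlDocument type is not shown in these notes.

```rust
use std::io::{self, Write};

/// Hypothetical extension trait: adds `write_line` to every writer.
trait WriteLine {
    fn write_line(&mut self, line: &str) -> io::Result<()>;
}

/// Blanket impl: every `W` that implements `Write` gets `write_line`.
impl<W: Write> WriteLine for W {
    fn write_line(&mut self, line: &str) -> io::Result<()> {
        self.write_all(line.as_bytes())?;
        self.write_all(b"\n")
    }
}

fn demo() -> io::Result<Vec<u8>> {
    // `Vec<u8>` implements `Write`, so it picks up the new method.
    let mut buf: Vec<u8> = Vec::new();
    buf.write_line("first")?;
    buf.write_line("second")?;
    Ok(buf)
}

fn main() -> io::Result<()> {
    let buf = demo()?;
    print!("{}", String::from_utf8_lossy(&buf));
    // Standard output is a writer too.
    io::stdout().write_line("third")
}
```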
When you implement a trait, either the trait or the type must be new in the current crate. This is called the orphan rule, and it helps Rust ensure that trait implementations are unique.
A trait can use the keyword Self as a type:
impl Spliceable for CherryTree {
fn splice(&self, other: &Self) -> Self { ... }
}
impl Spliceable for Mammoth {
fn splice(&self, other: &Self) -> Self { ... }
}
Inside the first impl, Self is simply an alias for CherryTree, and in the second, it’s an alias for Mammoth.
A trait that uses the Self type is incompatible with trait objects:
// error: the trait `Spliceable` cannot be made into an object
fn splice_anything(left: &dyn Spliceable, right: &dyn Spliceable) {
let combo = left.splice(right);
// ...
}
Rust rejects this code because it has no way to type-check the call left.splice(right). The whole point of trait objects is that the type isn’t known until run time, so Rust cannot tell at compile time whether left and right have the same type, as the Self parameter requires. Now, had we wanted genetically improbable splicing, we could have designed a trait-object-friendly trait:
pub trait MegaSpliceable {
fn splice(&self, other: &dyn MegaSpliceable) -> Box<dyn MegaSpliceable>;
}
We can declare that a trait is an extension of another trait:
/// Someone in the game world, either the player or some other
/// pixie, gargoyle, squirrel, ogre, etc.
trait Creature: Visible {
fn position(&self) -> (i32, i32);
fn facing(&self) -> Direction;
...
}
The phrase trait Creature: Visible means that all creatures are visible. Every type that implements Creature must also implement the Visible trait. It’s an error to implement Creature for a type without also implementing Visible. We say that Creature is a subtrait of Visible, and that Visible is Creature’s supertrait.
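One practical consequence is that a bound on the subtrait also gives access to the supertrait's methods. A minimal sketch follows; the glyph method, the Squirrel type, and the describe function are hypothetical, and only the Creature: Visible relationship is taken from the example above.

```rust
/// Hypothetical supertrait with a single method.
trait Visible {
    fn glyph(&self) -> char;
}

/// Every `Creature` must also be `Visible`.
trait Creature: Visible {
    fn position(&self) -> (i32, i32);
}

struct Squirrel {
    x: i32,
    y: i32,
}

// Removing this impl would make the `Creature` impl below a compile error.
impl Visible for Squirrel {
    fn glyph(&self) -> char {
        's'
    }
}

impl Creature for Squirrel {
    fn position(&self) -> (i32, i32) {
        (self.x, self.y)
    }
}

/// The bound names only `Creature`, yet `glyph` from `Visible` is callable.
fn describe<C: Creature>(c: &C) -> String {
    let (x, y) = c.position();
    format!("{} at ({}, {})", c.glyph(), x, y)
}

fn main() {
    println!("{}", describe(&Squirrel { x: 3, y: -1 }));
}
```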
Traits can include type-associated functions, Rust’s analog to static methods:
trait StringSet {
/// Return a new empty set.
fn new() -> Self;
/// Return a set that contains all the strings in `strings`.
fn from_slice(strings: &[&str]) -> Self;
/// Find out if this set contains a particular `value`.
fn contains(&self, string: &str) -> bool;
/// Add a string to this set.
fn add(&mut self, string: &str);
}
The first two, new() and from_slice(), don’t take a self argument. They serve as constructors. In nongeneric code, these functions can be called using :: syntax, just like any other type-associated function:
// Create a set using a hypothetical type that implements StringSet:
let mut my_set = MyStringSet::new();
In generic code, it’s the same, except the type is often a type variable, as in the call to S::new() shown here:
fn unknown_words<S: StringSet>(document: &[String], wordlist: &S) -> S {
S::new()
}
Trait objects don’t support type-associated functions. If you want to use &dyn StringSet trait objects, you must change the trait, adding the bound where Self: Sized to each associated function that doesn’t take a self argument by reference:
trait StringSet {
fn new() -> Self
where
Self: Sized;
fn from_slice(strings: &[&str]) -> Self
where
Self: Sized;
fn contains(&self, string: &str) -> bool;
fn add(&mut self, string: &str);
}
You can then use a trait object like this:
// Suppose MyStringSet implements the StringSet trait
let mut my_set: Box<dyn StringSet> = Box::new(MyStringSet::new());
my_set.add("world");
println!("{}", my_set.contains("world")); // true
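For a version that compiles on its own, here is a sketch that supplies the missing implementation. Backing MyStringSet with a HashSet is an assumption made for illustration, as is the build helper; the trait itself is the one shown above.

```rust
use std::collections::HashSet;

trait StringSet {
    fn new() -> Self
    where
        Self: Sized;
    fn from_slice(strings: &[&str]) -> Self
    where
        Self: Sized;
    fn contains(&self, string: &str) -> bool;
    fn add(&mut self, string: &str);
}

/// Hypothetical implementation backed by a `HashSet`.
struct MyStringSet {
    items: HashSet<String>,
}

impl StringSet for MyStringSet {
    fn new() -> Self {
        MyStringSet { items: HashSet::new() }
    }

    fn from_slice(strings: &[&str]) -> Self {
        MyStringSet { items: strings.iter().map(|s| s.to_string()).collect() }
    }

    fn contains(&self, string: &str) -> bool {
        self.items.contains(string)
    }

    fn add(&mut self, string: &str) {
        self.items.insert(string.to_string());
    }
}

/// The constructors run on the concrete type; only methods that take
/// `self` are available through the trait object afterwards.
fn build() -> Box<dyn StringSet> {
    let mut set: Box<dyn StringSet> = Box::new(MyStringSet::from_slice(&["hello"]));
    set.add("world");
    set
}

fn main() {
    let set = build();
    println!("{}", set.contains("world"));
    println!("{}", set.contains("rust"));
    let empty = MyStringSet::new();
    println!("{}", empty.contains("hello"));
}
```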
Rust has a standard Iterator trait, defined like this:
pub trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
...
}
The first feature of this trait, type Item;, is an associated type. Each type that implements Iterator must specify what type of item it produces. next() returns an Option<Self::Item>: either Some(item), the next value in the sequence, or None when there are no more values to visit. The type is written as Self::Item, not just plain Item, because Item is a feature of each type of iterator, not a standalone type.
// (code from the std::env standard library module)
impl Iterator for Args {
type Item = String;
fn next(&mut self) -> Option<String> { ... }
...
}
Also, generic code can use associated types:
/// Loop over an iterator, storing the values in a new vector.
fn collect_into_vector<I: Iterator>(iter: I) -> Vec<I::Item> {
let mut results = Vec::new();
for value in iter {
results.push(value);
}
results
}
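To see the associated type from the implementor's side, here is a sketch with a hand-written iterator fed to collect_into_vector. The Countdown type is hypothetical; the generic function is repeated from above so that the example is self-contained.

```rust
/// Hypothetical iterator that counts down from `n` to 1.
struct Countdown {
    n: u32,
}

impl Iterator for Countdown {
    // Each implementation picks its own item type.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.n == 0 {
            None
        } else {
            let current = self.n;
            self.n -= 1;
            Some(current)
        }
    }
}

/// Loop over an iterator, storing the values in a new vector.
fn collect_into_vector<I: Iterator>(iter: I) -> Vec<I::Item> {
    let mut results = Vec::new();
    for value in iter {
        results.push(value);
    }
    results
}

fn main() {
    // Here `I::Item` is `u32`, so the result is a `Vec<u32>`.
    let values = collect_into_vector(Countdown { n: 3 });
    println!("{:?}", values);
}
```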
We can place a bound on I::Item:
fn dump<I>(iter: I)
where I: Iterator, I::Item: Debug
{
...
}
Or, we could write, “I must be an iterator over String values”:
fn dump<I>(iter: I)
where I: Iterator<Item=String>
{
...
}
Iterator<Item=String> is itself a trait. If you think of Iterator as the set of all iterator types, then Iterator<Item=String> is a subset of Iterator: the set of iterator types that produce Strings. This syntax can be used anywhere the name of a trait can be used, including trait object types:
fn dump(iter: &mut dyn Iterator<Item=String>) {
for (index, s) in iter.enumerate() {
println!("{}: {:?}", index, s);
}
}
Multiplication in Rust uses this trait:
/// std::ops::Mul, the trait for types that support `*`.
pub trait Mul<RHS> {
/// The resulting type after applying the `*` operator
type Output;
/// The method for the `*` operator
fn mul(self, rhs: RHS) -> Self::Output;
}
Mul is a generic trait, and its instances Mul<f64>, Mul<String>, Mul<Size>, etc., are all different traits, just as min::<i32> and min::<String> are different functions and Vec<i32> and Vec<String> are different types. The real Mul trait looks like this:
pub trait Mul<RHS=Self> {
...
}
The syntax RHS=Self means that RHS defaults to Self. In a bound, if I write where T: Mul, it means where T: Mul<T>.
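A sketch of both points, using std::ops::Mul with a hypothetical Size struct (the field names and the square helper are assumptions): one type can implement several instances of the generic trait, and leaving the parameter out selects the default.

```rust
use std::ops::Mul;

/// Hypothetical two-dimensional size.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Size {
    w: f64,
    h: f64,
}

/// `Size * f64`: scale both dimensions by a factor.
impl Mul<f64> for Size {
    type Output = Size;
    fn mul(self, rhs: f64) -> Size {
        Size { w: self.w * rhs, h: self.h * rhs }
    }
}

/// `Size * Size`: no parameter given, so this is `Mul<Size>` by default.
impl Mul for Size {
    type Output = Size;
    fn mul(self, rhs: Size) -> Size {
        Size { w: self.w * rhs.w, h: self.h * rhs.h }
    }
}

/// In this bound, `Mul<Output = T>` is shorthand for `Mul<T, Output = T>`.
fn square<T: Mul<Output = T> + Copy>(x: T) -> T {
    x * x
}

fn main() {
    let s = Size { w: 2.0, h: 3.0 };
    println!("{:?}", s * 2.0_f64);
    println!("{:?}", square(s));
    println!("{}", square(7));
}
```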
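As you might imagine, combinations of many generic types can get messy. A function that chains several iterator adapters, for example, ends up with a deeply nested return type. We could easily replace such a hairy return type with a trait object: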
fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> Box<dyn Iterator<Item=u8>> {
Box::new(v.into_iter().chain(u.into_iter()).cycle())
}
We can avoid both the dynamic dispatch and the heap allocation with impl Trait, which lets a function specify only the trait or traits its return value implements while keeping the concrete type hidden:
fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> impl Iterator<Item=u8> {
v.into_iter().chain(u.into_iter()).cycle()
}
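For comparison, this sketch spells out the concrete type that impl Iterator<Item=u8> stands for here. The _concrete suffix is only there so that both versions fit in one file.

```rust
use std::iter::{Chain, Cycle};
use std::vec::IntoIter;

/// The full return type, written out by hand.
fn cyclical_zip_concrete(v: Vec<u8>, u: Vec<u8>) -> Cycle<Chain<IntoIter<u8>, IntoIter<u8>>> {
    v.into_iter().chain(u.into_iter()).cycle()
}

/// Same body; callers only learn that the result is an iterator over `u8`.
fn cyclical_zip(v: Vec<u8>, u: Vec<u8>) -> impl Iterator<Item = u8> {
    v.into_iter().chain(u.into_iter()).cycle()
}

fn main() {
    // The iterators are endless, so take a finite prefix.
    let a: Vec<u8> = cyclical_zip_concrete(vec![1, 2], vec![3]).take(5).collect();
    let b: Vec<u8> = cyclical_zip(vec![1, 2], vec![3]).take(5).collect();
    println!("{:?} {:?}", a, b);
}
```

Changing the adapters in the body would force the first signature to change, while the second stays the same.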
You might want to use different Shapes depending on a run-time value, like a string that a user enters. This doesn’t work with impl Shape as the return type:
trait Shape {
fn new() -> Self;
fn area(&self) -> f64;
}
fn make_shape(shape: &str) -> impl Shape {
match shape {
"circle" => Circle::new(),
"triangle" => Triangle::new(), // error: incompatible types
"shape" => Rectangle::new(),
}
}
impl Trait is a form of static dispatch, so the compiler has to know the type being returned from the function at compile time in order to allocate the right amount of space on the stack and correctly access fields and methods on that type. Every arm of the match must therefore produce the same concrete type. Before Rust 1.75, trait methods could not use impl Trait return values either; newer compilers accept them.
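When the choice really does depend on a run-time value, a boxed trait object is one way out. The sketch below is an assumption-laden variant of the example: the Shape trait drops new() so that it can be used as a trait object, the struct fields are made up, and unknown names yield None.

```rust
use std::f64::consts::PI;

// Hypothetical variant of `Shape` without `new`, so `dyn Shape` is allowed.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Rectangle {
    width: f64,
    height: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        PI * self.radius * self.radius
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

/// Every arm is coerced to `Box<dyn Shape>`, so the match type-checks.
fn make_shape(shape: &str) -> Option<Box<dyn Shape>> {
    let boxed: Box<dyn Shape> = match shape {
        "circle" => Box::new(Circle { radius: 1.0 }),
        "rectangle" => Box::new(Rectangle { width: 2.0, height: 3.0 }),
        _ => return None,
    };
    Some(boxed)
}

fn main() {
    for name in ["circle", "rectangle", "hexagon"] {
        match make_shape(name) {
            Some(s) => println!("{}: area {}", name, s.area()),
            None => println!("{}: unknown shape", name),
        }
    }
}
```

The price is the heap allocation and dynamic dispatch that impl Trait avoids.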
The following two functions are equivalent, with one important exception.
fn print<T: Display>(val: T) {
println!("{}", val);
}
fn print(val: impl Display) {
println!("{}", val);
}
Using generics allows callers of the function to specify the type of the generic arguments, like print::<i32>(42), while using impl Trait does not.
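A short sketch of the difference from the caller's side. The functions return a String instead of printing so that the result is easy to inspect, and the names show_generic and show_impl are made up.

```rust
use std::fmt::Display;

/// Named type parameter: callers may supply `T` explicitly.
fn show_generic<T: Display>(val: T) -> String {
    format!("{}", val)
}

/// Anonymous type parameter: the type can only be inferred.
fn show_impl(val: impl Display) -> String {
    format!("{}", val)
}

fn main() {
    // Both accept any `Display` value when the type is inferred.
    println!("{}", show_generic(42));
    println!("{}", show_impl("forty-two"));

    // Only the generic form lets the caller name the type.
    println!("{}", show_generic::<f32>(1.5));

    // The next line is rejected by the compiler if uncommented:
    // println!("{}", show_impl::<f32>(1.5));
}
```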
Like structs and enums, traits can have associated constants:
trait Greet {
const GREETING: &'static str = "Hello";
fn greet(&self) -> String;
}
Like associated types and functions, you can declare them but not give them a value:
trait Float {
const ZERO: Self;
const ONE: Self;
}
Then, implementors of the trait can define these values:
impl Float for f32 {
const ZERO: f32 = 0.0;
const ONE: f32 = 1.0;
}
impl Float for f64 {
const ZERO: f64 = 0.0;
const ONE: f64 = 1.0;
}
This allows you to write generic code that uses these values:
fn add_one<T: Float + Add<Output = T>>(value: T) -> T {
value + T::ONE
}
Associated constants can’t be used with trait objects, since the compiler relies on type information about the implementation in order to pick the right value at compile time. Combined with a few operators, associated constants let you implement common mathematical functions like Fibonacci:
fn fib<T: Float + Add<Output = T>>(n: usize) -> T {
match n {
0 => T::ZERO,
1 => T::ONE,
n => fib::<T>(n - 1) + fib::<T>(n - 2),
}
}
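Putting the pieces of this section together gives a program that can be compiled as is. This sketch only adds the import for Add and a main function; the trait, the impls, add_one, and fib are the ones shown above.

```rust
use std::ops::Add;

trait Float {
    const ZERO: Self;
    const ONE: Self;
}

impl Float for f32 {
    const ZERO: f32 = 0.0;
    const ONE: f32 = 1.0;
}

impl Float for f64 {
    const ZERO: f64 = 0.0;
    const ONE: f64 = 1.0;
}

fn add_one<T: Float + Add<Output = T>>(value: T) -> T {
    value + T::ONE
}

/// Naive recursive Fibonacci; fine for small `n`, exponential for large `n`.
fn fib<T: Float + Add<Output = T>>(n: usize) -> T {
    match n {
        0 => T::ZERO,
        1 => T::ONE,
        n => fib::<T>(n - 1) + fib::<T>(n - 2),
    }
}

fn main() {
    // The constants are chosen at compile time from the type argument.
    println!("{}", fib::<f64>(10));
    println!("{}", fib::<f32>(7));
    println!("{}", add_one(2.5_f64));
}
```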