Rust's memory model.
Whilst I could write about 10 more blog posts on why Unix and its philosophy of "do one thing and do it well" works great in practice, here I want to discuss Rust's interesting memory model and contrast it with C.
No more leaks.
Rust, by design, is meant to eliminate a particular class of bugs (the kind that regularly end up as CVEs), all pertaining to bad memory management. C is more versatile, but if memory safety (and safe concurrency) matters more (in, say, a huge project with multiple collaborators), Rust is the better option.
Rust achieves memory safety through a distinctive ownership model that is enforced by its compiler.
Stack and heap?
Like C, Rust puts local variables on the stack by default. Unlike C, you rarely call an allocator yourself: heap allocation happens through library types. Types with a fixed size known at compile time are stored directly on the stack, and these include the builtin types, fixed-size arrays and structs whose field sizes are already known.
Following the reasoning here, data whose size is dynamic or not determinable at compile time is kept on the heap, behind a fixed-size handle that itself lives on the stack. This includes:
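- String, Rust's builtin growable string type (the text itself is on the heap).
- Vec<T>, dynamic arrays (the elements are on the heap).
- dyn Trait, dynamic trait objects, where it is not known at compile time which struct is being referred to. These must sit behind a pointer, commonly Box<dyn Trait>.
- Box<T>, smart pointers (the Box itself is stored on the stack, the value of type T is stored on the heap).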
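As a rough illustration of the split described above, here is a minimal sketch. The Point struct and the boxed_sum function are hypothetical names made up for this example. It prints the size of a few handles (the exact numbers depend on your platform) and moves a value to the heap explicitly with Box::new.

```rust
use std::mem::size_of;

// Hypothetical struct, used only for this illustration.
struct Point {
    x: i32,
    y: i32,
}

// Moves a Point to the heap with Box and reads it back through the pointer.
fn boxed_sum() -> i32 {
    let p = Point { x: 1, y: 2 }; // lives in this function's stack frame
    let b = Box::new(p); // p is moved to the heap; b is a pointer on the stack
    b.x + b.y
} // b goes out of scope here, so the heap allocation is freed

fn main() {
    // Fixed-size types are stored inline.
    println!("Point:      {} bytes", size_of::<Point>());
    println!("[u8; 16]:   {} bytes", size_of::<[u8; 16]>());
    // Handles have a fixed size no matter how much heap data they manage.
    println!("String:     {} bytes", size_of::<String>());
    println!("Vec<u8>:    {} bytes", size_of::<Vec<u8>>());
    println!("Box<Point>: {} bytes", size_of::<Box<Point>>());
    println!("boxed sum:  {}", boxed_sum());
}
```

The point to notice is that size_of::<String>() stays the same however long the string gets, because only the handle is counted.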
Where my free() at?
Rust has an extremely simple rule when it comes to memory management: If it goes out of scope, it's gone from memory.
{
    let s = String::from("hello");
} // since s goes out of scope here, it is automatically freed
Since the compiler keeps track of every value, and which variable owns
it, the compiler knows when the heap memory needs to be freed.
It automatically inserts the cleanup code at the end of the scope. A type
that needs custom cleanup can implement the Drop trait, whose drop() method
runs at that point. Standard types such as String and Vec<T> release their
heap buffers without any code from you.
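To see the automatic cleanup happen, here is a small sketch, assuming a made-up Noisy type that writes its name into a shared log when it is dropped. Noisy, its fields and drop_order are hypothetical names for this example only.

```rust
use std::cell::RefCell;

// Hypothetical type that records its name in a log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        // Called automatically when a Noisy value goes out of scope.
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which things happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        let _b = Noisy { name: "b", log: &log };
        log.borrow_mut().push("end of inner scope");
    } // _b is dropped first, then _a (reverse order of declaration)
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

Nothing in drop_order calls drop() by hand. The compiler inserts the calls at the closing brace, and locals are dropped in reverse order of declaration.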
Contrast this with C:
{
    char *s = malloc(6 * sizeof(char));
    strcpy(s, "hello");
    free(s); // must be manually freed
}
The free() call must be manually written by the programmer. This allows
for more flexibility but there are 3 undesirable cases (all caused by the
programmer):
- Not freeing the memory at all. This causes a memory leak.
- Freeing twice. Freeing already freed memory is undefined behaviour and should be avoided.
- Use after free. This occurs when you forget that you have freed memory and try using it afterwards.
Ownership.
Ownership is the central concept in Rust. Only a single variable can "own" a value (and any heap data behind it) at a time. There are two operations regarding ownership: moving and borrowing.
Moving.
A "move" transfers ownership entirely from one variable to another.
let s1 = String::from("hello");
let s2 = s1; // this is a *move*
// s1 is now invalid.
The compiler will report an error if you try using s1 after the move.
s1 doesn't own the String anymore, so the compiler does not allow you
to access it.
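A slightly longer sketch of the same idea, assuming you sometimes want two independent copies instead of a move. The function name move_and_clone is made up for this example. The commented-out line shows what the compiler would reject.

```rust
// Returns the two Strings that are still valid at the end.
fn move_and_clone() -> (String, String) {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // deep copy: s1 and s2 each own their own buffer
    let s3 = s1; // move: s1 is invalid from here on
    // println!("{}", s1); // would not compile: error[E0382]: borrow of moved value: `s1`
    (s2, s3)
}

fn main() {
    let (s2, s3) = move_and_clone();
    println!("{} {}", s2, s3);

    // Simple types like integers implement Copy, so assignment copies
    // them and the original stays usable.
    let a = 5;
    let b = a;
    println!("{} {}", a, b);
}
```

clone() costs an extra allocation, so it is a trade: you pay for a copy to keep both variables valid.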
Functions can also give and take ownership.
fn take_ownership(s: String) {
    println!("{}", s);
    // since scope of s ends here, it is *freed*
}
fn give_ownership() -> String {
    let s = String::from("hello");
    s // we give ownership to the caller
}
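Moving works in both directions: arguments move into a function and return values move out of it. To move multiple values out of a function, return them together, for example as a tuple or a struct.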
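Here is how calling these functions might look, as a sketch. take_ownership and give_ownership are the functions from above. with_length is a hypothetical helper added here to show a tuple return.

```rust
fn take_ownership(s: String) {
    println!("{}", s);
    // s is freed when this function returns
}

fn give_ownership() -> String {
    let s = String::from("hello");
    s // ownership moves to the caller
}

// Hypothetical helper: takes a String and hands it back along with its length.
fn with_length(s: String) -> (String, usize) {
    let len = s.len();
    (s, len) // both values move out together in a tuple
}

fn main() {
    let s = give_ownership(); // main now owns the String
    let (s, len) = with_length(s); // moved in, then moved back out
    println!("{} has length {}", s, len);
    take_ownership(s); // moved in for good
    // println!("{}", s); // would not compile: s was moved into take_ownership
}
```

Passing a value in and returning it again just to keep using it gets tedious quickly, which is the problem borrowing solves.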
Borrowing.
Borrowing gives another variable temporary access to an object without transferring ownership. A borrowed object is not freed when the borrow ends, because the owner is still responsible for it.
// &T means a borrowed type T.
fn borrow_it(s: &String) {
    println!("{}", s);
    // s is *not* freed here because it is borrowed.
}
Since borrowed objects are distinct from owned objects (&T vs T), the
compiler can easily keep track of them.
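From the caller's side, lending looks like this. It is a minimal sketch: borrowed_len is a hypothetical variant of borrow_it that also returns the length, so there is something to show afterwards.

```rust
// Hypothetical variant of borrow_it that also reports the length.
fn borrowed_len(s: &String) -> usize {
    println!("{}", s);
    s.len()
} // the borrow ends here; the String is not freed

fn main() {
    let s = String::from("hello");
    let n = borrowed_len(&s); // lend s; ownership stays with main
    println!("{} is still usable here, length {}", s, n);
} // s goes out of scope here and is freed
```

Note the &s at the call site. Both the signature and the call spell out that this is a borrow, not a move.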
Mutable or immutable borrowing.
Rust is also designed for easy concurrency. To prevent data races across multiple threads, the compiler enforces this rule:
Either any number of immutable borrows (&T) can exist, or a single mutable borrow (&mut T), but never both at the same time.
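A single-threaded sketch of this rule, assuming a hypothetical append_world function that needs mutable access. The commented-out lines show a combination the compiler would reject.

```rust
// Hypothetical function that needs exclusive (mutable) access.
fn append_world(s: &mut String) {
    s.push_str(", world");
}

fn shared_then_mutable() -> String {
    let mut s = String::from("hello");

    let r1 = &s;
    let r2 = &s; // any number of immutable borrows may coexist
    println!("{} {}", r1, r2);

    // r1 and r2 are not used past this point, so a mutable borrow is fine.
    append_world(&mut s);

    // This would not compile, because r3 is still in use while m exists:
    // let r3 = &s;
    // let m = &mut s; // error[E0502]: cannot borrow `s` as mutable
    // println!("{}", r3);

    s
}

fn main() {
    println!("{}", shared_then_mutable());
}
```

The same check is what rules out data races: a thread holding &mut T is guaranteed that nobody else can read or write that value at the same time.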
Borrow checker rules.
The Rust compiler has a borrow checker which enforces certain rules on ownership and borrowing. Here they are:
Rule 1: One owner only.
Every value has only one owner at any given point of time. When ownership moves, the previous owner is invalidated.
Rule 2: Multiple immutable borrows xor a single mutable borrow.
As explained above:
Either any number of immutable borrows (&T) can exist, or a single
mutable borrow (&mut T), but never both at the same time.
Rule 3: Borrows must not outlive the owner.
Best explained with this example:
fn dangle() -> &String {
    let s = String::from("hello");
    &s // since s is freed when returning (as it goes out of scope), we
       // cannot borrow here. the compiler throws an error.
}
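There are two usual ways out of this, sketched below. no_dangle and first_word are hypothetical names for this example. Either return the owned value so it moves out to the caller, or return a borrow of something the caller already owns.

```rust
// Option 1: return the String itself. Ownership moves to the caller,
// so nothing is freed at the end of the function.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

// Option 2: borrow from the caller's data. The returned &str points into
// `text`, which the caller owns, so it cannot outlive its owner.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = no_dangle();
    println!("{}", owned);

    let sentence = String::from("hello world");
    let word = first_word(&sentence);
    println!("{}", word);
}
```

In first_word the compiler infers that the returned reference lives no longer than the argument, which is exactly Rule 3.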
Rule 4: Borrows end at their last use.
The compiler "looks ahead" to see whether a borrow is used again later. If it isn't, then the borrow ends at the point where it is last used, not at the end of the enclosing scope. This is called "Non-Lexical Lifetimes" (NLL).
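A minimal sketch of NLL in action, using a made-up function name nll_demo. The borrow of the vector ends at its last use, so the later push is accepted even though both are in the same scope.

```rust
fn nll_demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3];

    let first = &v[0]; // immutable borrow of v starts here
    println!("first = {}", first); // last use of `first`: the borrow ends here

    v.push(4); // allowed: no borrow of v is still live

    // Using `first` again down here would extend the borrow past the
    // push, and the compiler would reject the push instead.
    // println!("{}", first);

    v
}

fn main() {
    println!("{:?}", nll_demo());
}
```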
Conclusion.
The rules are strict, but the resulting code is memory safe as long as it stays in safe Rust (no unsafe blocks). C assumes the programmer knows best and gives us all the control. Rust assumes the programmer can make mistakes and gives us safety by default. It is all a compromise.