2023/10/08
Using Unsafe Rust? You'll Probably Trigger Undefined Behaviour
TLDR: unsafe rust has many UB pitfalls. Use miri to ensure you're not falling into any of them
Inspired by
TigerBeetle's design of no runtime memory allocations, I decided to build a zero runtime memory allocation data structure in
rust. It seemed like a least-recently-used (LRU) cache was a natural fit
for such a requirement, since they typically involve a fixed capacity,
beyond which the least-recently-used element was evicted to make room
for a new one. Furthermore, I had some needs for one in another project,
so all the stars were aligned to make
const-lru
happen.
What the stars were not aligned on, however, was the amount of undefined
behaviour (UB) I would inadvertently introduce in the initial
development of const-lru
.
Uninitialized memory is a UB minefield
Since the capacity of the LRU cache was designed to be fixed at compile-time, I decided to store the required data in a couple of const-generic array fields. In other words, something like:
pub struct ConstLru<K, V, const CAP: usize> {
keys: [K; CAP],
values: [V; CAP],
... // other fields
}
This was all fine and dandy for small values of CAP
, but at
larger values, this was causing runtime stack overflows.
Box::new(ConstLru::new())
did not help either because
semantically this meant "create a new ConstLru
on the stack
and then move that ConstLru
into a newly allocated region
in the heap", so the stack overflow still occurred in the first step.
Since I wanted to keep the crate
no-std
and not dependent on alloc
, I decided
to create a new unsafe
function,
init_at_alloc(ptr: *mut Self)
, to allow users to create
large CAP
ConstLru
s at pre-allocated memory.
This allows for stuff like:
let layout = Layout::new::<ConstLru<u32, u16, 10_000, u16>>();
let container: Box<ConstLru<u32, u16, 10_000, u16>> = unsafe {
let ptr = alloc(layout) as *mut ConstLru<u32, u16, 10_000, u16>;
ConstLru::init_at_alloc(ptr);
Box::from_raw(ptr)
};
I could then run a common init procedure for both stack and pointer
initializations in order to set the correct starting state for the
struct's fields. Surely a private fn init(&mut self)
, where
the pointer initialization is:
pub unsafe fn init_at_alloc(ptr: *mut Self) {
let mut_ref = &mut *ptr;
mut_ref.init();
}
and the stack initialization is:
pub fn new() -> Self {
let mut res: MaybeUninit<Self> = MaybeUninit::uninit();
unsafe {
Self::init_at_alloc(res.as_mut_ptr());
res.assume_init()
}
}
would work, right?
Pointers may not always become references
WRONG! After digging a little deeper into the
docs for MaybeUninit
, it turns out that
turning a pointer into a reference before the memory pointed to is
initialized is undefined behaviour!
Oh no. Naturally, I had to change the function signature from
fn init(&mut self)
to fn init(ptr: *mut Self)
,
which meant init_at_alloc(ptr: *mut Self)
became the common
init procedure instead. It was kinda ugly changing all the
self.field = init_value
to
addr_of_mut!((*ptr).field).write(init_value)
, but it
worked.
Always use addr_of_mut!
to initialize uninitialized struct
fields
The issue was resolved but the experience left me distraught. Rust, for
all its virtues of stringent compile-time checking had failed to save me
from myself. Besides, how many more instances of UB were left in my
code? Surely there was a way to detect and fix them without going
through the docs of every unsafe
feature I was using.
Just use miri
Thankfully, there was. After some googling, I found out about miri, and it immediately proved to be a godsend. Here are some other instances of UB it caught:
Once you assume_init()
, all fields MUST be initialized
In order to speed up the lookup process, ConstLru
keeps a
binary search index array as one of its fields, bs_index
.
Since the array consists of primitive unsigned integer values, I decided
not to encapsulate the values in a MaybeUninit
. As the
program does not ever use any values in the array without initializing
them, I thought it was fine to leave the field uninitialized when the
ConstLru
was created. After all, it was just a bunch of
unused random integers. However, this is UB according to miri:
error: Undefined Behavior: constructing invalid value at .value.bs_index[11]: encountered uninitialized bytes
--> /home/user/const-lru/src/lib.rs:72:13
|
72 | res.assume_init()
| ^^^^^^^^^^^^^^^^^ constructing invalid value at .value.bs_index[11]: encountered uninitialized bytes
In hindsight, this is perfectly reasonable UB, since the compiler does
not, and probably should not, treat primitive integers as a special case
for allowing uninitialized fields; imagine what would happen if instead
of a primitive integer, this was some other type with complex
Drop
behaviour like Vec
. If this wasn't UB,
the compiler will generate code that attempts to free memory at random
addresses read from the uninitialized bytes!
The fix for this UB was simple: just initialize the values. Though this does introduce some overhead at initialization.
&array[i]
and &mut array[i]
only borrows the
single i-th element of the array
The above statement sounds obvious, so how can it result in UB? Well,
what if you wanted to use ptr::copy()
to shift an array by
1 element, like in Vec::remove()
?
// shift everything between [bs_i, len) right
let bs_i_ptr: *mut I = &mut self.bs_index[bs_i];
unsafe {
ptr::copy(bs_i_ptr, bs_i_ptr.add(1), len - bs_i);
}
bs_i_ptr
only mutably borrows
self.bs_index[bs_i]
. However, the
ptr::copy()
is attempting to mutate not
self.bs_index[bs_i]
but
self.bs_index[bs_i+1..len+1]
instead! miri catches this:
error: Undefined Behavior: attempting a write access using <154249> at alloc50498[0x1c], but that tag does not exist in the borrow stack for this location
--> /home/user/const-lru/src/lib.rs:348:17
|
348 | ptr::copy(bs_i_ptr, bs_i_ptr.add(1), l - bs_i);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| attempting a write access using <154249> at alloc50498[0x1c], but that tag does not exist in the borrow stack for this location
| this error occurs as part of an access at alloc50498[0x1c..0x1d]
Admittedly, this is not immediately obvious since the multiple writes
are somewhat obscured by the ptr::copy()
call. I only
understood the miri error message after chancing upon
this rust forum post
The "fix" for this UB wasn't really a fix. Since
ptr::copy()
is inherently an unsafe function, it makes
sense to just express the procedure using fully unsafe types, i.e.
pointers instead of references:
// shift everything between [bs_i, len) right
unsafe {
let bs_i_ptr = self.bs_index.as_mut_ptr().add(bs_i);
ptr::copy(bs_i_ptr, bs_i_ptr.add(1), len - bs_i);
}
Conclusion
Rust is a safe language but unsafe rust is a totally different beast. Undefined behaviour abounds when you're dealing with raw pointers and uninitialized memory, but somebody has to do it to create the nice safe APIs. Luckily, miri makes avoiding unintentional UB a lot easier.