Saturday, September 13, 2014

A gotcha with raw pointers and unsafe code

This bit me today. It's not actually a bug and it only happens in unsafe code, but it is non-obvious and something to be aware of.

There are a few components to the issue. First off, we must look at `&expr` where `expr` is an rvalue, that is, a temporary value. Rust allows you to write (for example) `&42`, and through some magic `42` will be allocated on the stack and the expression will evaluate to a reference to it, with an inferred lifetime no longer than that of the value. For example, `let x = &42i;` works, as does:

struct Foo<'a> {
    f: &'a int,
}
fn main() {
    let x = Foo { f: &42 };
}

Next, we must know that borrowed pointers (`&T`) can be implicitly coerced to raw pointers (`*const T`). So if you write `let x: *const int = &42;`, `x` is a raw pointer produced by coercing the borrowed pointer. Once this happens, you have no safety guarantees - a raw pointer can point at memory that has already been freed. This is fine when you can see the raw pointer type and so know to be careful, but if the type comes from a struct field, what looks like a borrowed pointer could actually be a raw pointer:

struct Bar {
    f: *const int,
}
fn main() {
    let x = Bar { f: &42 };
}

Imagine that `Bar` is in some other module or crate; you might then assume that `main` is OK here. But it is not. Since the borrowed pointer has a narrow scope and is not stored (the raw pointer does not count for this analysis), the compiler can choose to delete the `42` allocated on the stack and reuse that memory straight after (or even during, probably) the `let` statement. So, `x.f` is potentially a dangling pointer as soon as `x` is available, and accessing it will give you bugs. That is OK: you can only do so in an `unsafe` block, and thus you (as the programmer) should check that you can't get a dangling pointer. You must do this whenever you dereference a raw pointer, and the fact that it must happen in unsafe code is your cue to do so.
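One way to do that check by construction is to give the value a named binding that clearly outlives the struct. The following is only a minimal sketch, written in current Rust syntax (`i32` rather than `int`); the function name `read_through_bar` is made up for illustration.

```rust
// Same shape as `Bar` above, using today's integer type name.
struct Bar {
    f: *const i32,
}

// Hypothetical helper: builds a `Bar` and reads through its raw pointer.
fn read_through_bar() -> i32 {
    // A named local lives until the end of the function body,
    // so it is still alive for as long as `x` is used below.
    let value = 42;

    // `&value` is a borrowed pointer that is implicitly coerced
    // to `*const i32` because of the field's type.
    let x = Bar { f: &value };

    // The dereference is the cue to check: `value` has not gone out
    // of scope yet, so the pointer is not dangling here.
    unsafe { *x.f }
}
```

Calling `read_through_bar()` should give back `42`. The point of the named local is that the lifetime of the pointee no longer depends on how long a temporary happens to survive.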

The final part of this gotcha was that I was already in unsafe code and was transmuting. Of course transmuting is awful and you should never do it, but sometimes you have to. If you do `unsafe { transmute(x) }` then there is no cue in the code that you have a raw pointer. You have no cue to check the dereference, because there is no dereference! You just get a weird bug that only appears on some platforms and depends on the optimisation level of compilation.
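To make the missing cue concrete, here is a hedged sketch in current Rust syntax. The types `RawHolder` and `RefHolder` and the function `read_via_transmute` are hypothetical, invented for illustration; this is not the code from the original bug. `#[repr(C)]` is added as an assumption so that both structs have a defined, matching layout.

```rust
use std::mem::transmute;

// Hypothetical struct holding a raw pointer.
#[repr(C)]
struct RawHolder {
    f: *const i32,
}

// Hypothetical struct with the same layout, but holding a borrowed pointer.
#[repr(C)]
struct RefHolder<'a> {
    f: &'a i32,
}

fn read_via_transmute(value: &i32) -> i32 {
    // The borrowed pointer is coerced to a raw pointer by the field type.
    let raw = RawHolder { f: value };

    // No raw pointer type and no raw dereference is visible at this line,
    // so nothing prompts you to ask whether the pointee is still alive.
    let borrowed: RefHolder<'_> = unsafe { transmute(raw) };

    // This looks like an ordinary, safe read of a borrowed pointer.
    // It is only sound here because `value` outlives this function call.
    *borrowed.f
}
```

In this sketch the pointee is a function argument, so it is guaranteed to be alive; had `raw` been built from a short-lived temporary instead, the final read would look exactly the same and still compile.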

Unfortunately, there is nothing we can really do from the language point of view - you just have to be super-careful around unsafe code, and especially transmutes.

Hat-tip to eddyb for figuring out what was going on here.