lang | libs (unsafe | references | convention | raw-pointers)
Establish a convention throughout the core libraries for unsafe functions constructing references out of raw pointers. The goal is to improve usability while promoting awareness of possible pitfalls with inferred lifetimes.
The current library convention on functions constructing borrowed values from raw pointers has the pointer passed by reference, which reference's lifetime is carried over to the return value. Unfortunately, the lifetime of a raw pointer is often not indicative of the lifetime of the pointed-to data. So the status quo eschews the flexibility of inferring the lifetime from the usage, while falling short of providing useful safety semantics in exchange.
A typical case where the lifetime needs to be adjusted is in bindings to a foreign library, when returning a reference to an object's inner value (we know from the library's API contract that the inner data's lifetime is bound to the containing object):
impl Outer {
fn inner_str(&self) -> &[u8] {
unsafe {
let p = ffi::outer_get_inner_str(&self.raw);
let s = std::slice::from_raw_buf(&p, libc::strlen(p));
std::mem::copy_lifetime(self, s)
}
}
}
Raw pointer casts also discard the lifetime of the original pointed-to value.
The signature of from_raw*
constructors will be changed back to what it
once was, passing a pointer by value:
unsafe fn from_raw_buf<'a, T>(ptr: *const T, len: uint) -> &'a [T]
The lifetime on the return value is inferred from the call context.
The current usage can be mechanically changed, while keeping an eye on possible lifetime leaks which need to be worked around by e.g. providing safe helper functions establishing lifetime guarantees, as described below.
In many cases, the lifetime parameter will come annotated or elided from the call context. The example above, adapted to the new convention, is safe despite lack of any explicit annotation:
impl Outer {
fn inner_str(&self) -> &[u8] {
unsafe {
let p = ffi::outer_get_inner_str(&self.raw);
std::slice::from_raw_buf(p, libc::strlen(p))
}
}
}
In other cases, the inferred lifetime will not be correct:
let foo = unsafe { ffi::new_foo() };
let s = unsafe { std::slice::from_raw_buf(foo.data, foo.len) };
// Some lines later
unsafe { ffi::free_foo(foo) };
// More lines later
let guess_what = s[0];
// The lifetime of s is inferred to extend to the line above.
// That code told you it's unsafe, didn't it?
Given that the function is unsafe, the code author should exercise due care when using it. However, the pitfall here is not readily apparent at the place where the invalid usage happens, so it can be easily committed by an inexperienced user, or inadvertently slipped in with a later edit.
To mitigate this, the documentation on the reference-from-raw functions
should include caveats warning about possible misuse and suggesting ways to
avoid it. When an 'anchor' object providing the lifetime is available, the
best practice is to create a safe helper function or method, taking a
reference to the anchor object as input for the lifetime parameter, like in
the example above. The lifetime can also be explicitly assigned with
std::mem::copy_lifetime
or std::mem::copy_lifetime_mut
, or annotated when
possible.
To improve composability in cases when the lifetime does need to be assigned
explicitly, the first parameter of std::mem::copy_mut_lifetime
should be made an immutable reference. There is no reason for the lifetime
anchor to be mutable: the pointer's mutability is usually the relevant
question, and it's an unsafe function to begin with. This wart may
breed tedious, mut-happy, or transmute-happy code, when e.g. a container
providing the lifetime for a mutable view into its contents is not itself
necessarily mutable.
The implicitly inferred lifetimes are unsafe in sneaky ways, so care is required when using the borrowed values.
Changing the existing functions is an API break.
An earlier revision of this RFC proposed adding a generic input parameter to determine the lifetime of the returned reference:
unsafe fn from_raw_buf<'a, T, U: Sized?>(ptr: *const T, len: uint,
life_anchor: &'a U)
-> &'a [T]
However, an object with a suitable lifetime is not always available
in the context of the call. In line with the general trend in Rust libraries
to favor composability, std::mem::copy_lifetime
and
std::mem::copy_lifetime_mut
should be the principal methods to explicitly
adjust a lifetime.
Should the change in function parameter signatures be done before 1.0?
Thanks to Alex Crichton for shepherding this proposal in a constructive and timely manner. He has in fact rationalized the convention in its present form.