lang | libs (conversions | machine | cell)
Cell<T>
to allow T: ?Sized
.T
and Cell<T>
have the same memory layout.&mut T -> &Cell<T> where T: ?Sized
&Cell<[T]> -> &[Cell<T>]
Note: https://github.com/rust-lang/rfcs/pull/1651 has been accepted recently, so no
T: Copy
bound is needed anymore.
Rust's iterators offer a safe, fast way to iterate over collections while avoiding additional bound checks.
However, due to the borrow checker, they run into issues if we try to have more than one iterator into the same data structure while mutating elements in it.
Wanting to do this is not that unusual for many low level algorithms that deal with integers, floats or similar primitive data types.
For example, an algorithm might...
Todays answer for algorithms like that is to fall back to C-style for loops and indexing, which might look like this...
let v: Vec<i32> = ...;
// example 1
for i in 0..v.len() {
for j in 0..v.len() {
v[j] = f(v[i], v[j]);
}
}
// example 2
for i in n..v.len() {
v[i] = g(v[i - n]);
}
...but this reintroduces potential bound-checking costs.
The alternative, short of changing the actual algorithms involved, is to use internal mutability to enable safe mutations even with overlapping shared views into the data:
let v: Vec<Cell<i32>> = ...;
// example 1
for i in &v {
for j in &v {
j.set(f(i.get(), j.get()));
}
}
// example 2
for (i, j) in v[n..].iter().zip(&v) {
i.set(g(g.get()));
}
This has the advantages of allowing both bound-check free iteration and aliasing references, but comes with restrictions that makes it not generally applicable, namely:
&T
and &mut T
references to the data.This RFC proposes a way to address these in cases where Cell<T>
could be used by introducing simple conversions functions
to the standard library that allow the creation of shared borrowed
Cell<T>
s from mutably borrowed T
s.
This in turn allows the original data structure to remain unchanged,
while allowing to temporary opt-in to the Cell
API as needed.
As an example, given Cell::from_mut_slice(&mut [T]) -> &[Cell<T>]
,
the previous examples can be written as this:
let mut v: Vec<i32> = ...;
// convert the mutable borrow
let v_slice: &[Cell<T>] = Cell::from_mut_slice(&mut v);
// example 1
for i in v_slice {
for j in v_slice {
j.set(f(i.get(), j.get()));
}
}
// example 2
for (i, j) in v_slice[n..].iter().zip(v_slice) {
i.set(g(g.get()));
}
The core of this proposal is the ability to convert a &T
to a &Cell<T>
,
so in order for it to be safe, it needs to be guaranteed that
T
and Cell<T>
have the same memory layout, and that there are no codegen
issues based on viewing a reference to a type that does not contain a
UnsafeCell
as a reference to a type that does contain a UnsafeCell
.
As far as the author is aware, both should already implicitly
fall out of the semantic of Cell
and Rusts/llvms notion of aliasing:
Cell
is safe interior mutability based on memcopying the T
,
and thus does not need additional fields or padding.&mut T -> &U
is a sub borrow, which prevents access to the original &mut T
for its duration, thus no aliasing.from_mut
We add a constructor to the cell API that enables the &mut T -> &Cell<T>
conversion, implemented with the equivalent of a transmute()
of the two
pointers:
impl<T> Cell<T> {
fn from_mut<'a>(t: &'a mut T) -> &'a Cell<T> {
unsafe {
&*(t as *mut T as *const Cell<T>)
}
}
}
In the future this could also be provided through AsRef
, Into
or From
impls.
Cell<T>
We extend Cell<T>
to allow T: ?Sized
, and move all compatible methods
to a less restricted impl block:
pub struct Cell<T: ?Sized> {
value: UnsafeCell<T>,
}
impl<T: ?Sized> Cell<T> {
pub fn as_ptr(&self) -> *mut T;
pub fn get_mut(&mut self) -> &mut T;
pub fn from_mut(value: &mut T) -> &Cell<T>;
}
This is purely done to enable cell slicing below, and should otherwise have no effect on any existing code.
We enable a conversion from &Cell<[T]>
to &[Cell<T>]
. This seems like it violates
the "no interior references" API of Cell
at first glance, but is actually
safe:
&mut T -> &Cell<T>
is ok, then &mut [T] -> &[Cell<T>]
would be as well.
&mut [T] -> &Cell<[T]>
follows from &mut T -> &Cell<T>
through substitution,
so &Cell<[T]> <-> &[Cell<T>]
has to be valid.Cell<T>
is to allow internal mutability through single-threaded
memcopies only. Since a memcopy is just a copy of all bits that make up a type,
it does not matter if we logically do a memcopy to all elements of a slice
through a &Cell<[T]>
, or just a memcopy to a single element through a
&Cell<T>
.&mut T
to each element
of a &mut [T]
, and converted each of them to a &Cell<T>
, their addresses
would allow "stitching" them back together to a single &[Cell<T>]
For convenience, we expose this conversion by implementing Index
for Cell<[T]>
:
impl<T> Index<RangeFull> for Cell<[T]> {
type Output = [Cell<T>];
fn index(&self, _: RangeFull) -> &[Cell<T>] {
unsafe {
&*(self as *const Cell<[T]> as *const [Cell<T>])
}
}
}
impl<T> Index<Range<usize>> for Cell<[T]> {
type Output = [Cell<T>];
fn index(&self, idx: Range<usize>) -> &[Cell<T>] {
&self[..][idx]
}
}
impl<T> Index<RangeFrom<usize>> for Cell<[T]> {
type Output = [Cell<T>];
fn index(&self, idx: RangeFrom<usize>) -> &[Cell<T>] {
&self[..][idx]
}
}
impl<T> Index<RangeTo<usize>> for Cell<[T]> {
type Output = [Cell<T>];
fn index(&self, idx: RangeTo<usize>) -> &[Cell<T>] {
&self[..][idx]
}
}
impl<T> Index<usize> for Cell<[T]> {
type Output = Cell<T>;
fn index(&self, idx: usize) -> &Cell<T> {
&self[..][idx]
}
}
Using this, the motivation example can be written as such:
let mut v: Vec<i32> = ...;
// convert the mutable borrow
let v_slice: &[Cell<T>] = &Cell::from_mut(&mut v[..])[..];
// example 1
for i in v_slice {
for j in v_slice {
j.set(f(i.get(), j.get()));
}
}
// example 2
for (i, j) in v_slice[n..].iter().zip(v_slice) {
i.set(g(g.get()));
}
The proposal only covers the base case &mut T -> &Cell<T>
and the trivially implementable extension to [T]
,
but in theory this conversion could be enabled for
many "higher level mutable reference" types, like for example
mutable iterators (with the goal of making them cloneable through this).
See https://play.rust-lang.org/?gist=d012cebf462841887323185cff8ccbcc&version=stable&backtrace=0 for an example implementation and a more complex use case, and https://crates.io/crates/alias for an existing crate providing these features.
What names and terminology work best for these concepts and why? How is this idea best presented—as a continuation of existing Rust patterns, or as a wholly new one?
The API could be described as "temporarily opting-in to internal mutability".
It would be a more flexible continuation of the existing usage of Cell<T>
since the Cell<T>
no longer needs to exist in the original location if
you have mutable access to it.
Would the acceptance of this proposal change how Rust is taught to new users at any level? How should this feature be introduced and taught to existing Rust users?
As it is, the API just provides a few neat conversion functions. Nevertheless,
with the legalization of the &mut T -> &Cell<T>
conversion there is the
potential for a major change in how accessors to data structures are provided:
In todays Rust, there are generally three different ways:
T
and yield U
.&T
and yields &U
.&mut T
and yields &mut U
.With this change, it would be possible in many cases to add a fourth accessor:
&mut T
and yields &Cell<U>
.For example, today there exist:
Vec<T> -> std::vec::IntoIter<T>
, which yields T
values and is cloneable.&[T] -> std::slice::Iter<T>
, which yields &T
values and is
cloneable because it does a shared borrow.&mut [T] -> std::slice::IterMut<T>
, which yields &mut T
values and is
not cloneable because it does a mutable borrow.We could then add a fourth iterator like this:
&mut [T] -> std::slice::CellIter<T>
, which yields &Cell<T>
values and is
cloneable because it does a shared borrow.So there is the potential that we go away from teaching the "rule of three" of ownership and change it to a "rule of four".
What additions or changes to the Rust Reference, The Rust Programming Language, and/or Rust by Example does it entail?
&mut T -> &Cell<T>
conversion,
or specifically the &mut T -> &UnsafeCell<T>
conversion is fine.Why should we not do this?
Cell
API.T
-> Cell<T>
transmute compatibility might not be a desired guarantee.Instead of allowing unsized types in Cell
and adding the Index
impls,
there could just be a single &mut [T] -> &[Cell<T>]
conversions function:
impl<T> Cell<T> {
/// [...]
fn from_mut_slice<'a>(t: &'a mut [T]) -> &'a [Cell<T>] {
unsafe {
&*(t as *mut [T] as *const [Cell<T>])
}
}
}
Usage:
let mut v: Vec<i32> = ...;
// convert the mutable borrow
let v_slice: &[Cell<T>] = Cell::from_mut_slice(&mut v);
// example 1
for i in v_slice {
for j in v_slice {
j.set(f(i.get(), j.get()));
}
}
// example 2
for (i, j) in v_slice[n..].iter().zip(v_slice) {
i.set(g(g.get()));
}
This would be less modular than the &mut [T] -> &Cell<[T]> -> &[Cell<T>]
conversions steps, while still offering essentially the same API.
The conversion could be guaranteed as correct, but not be provided by std itself. This would serve as legitimization of external implementations like alias.
If the safety guarantees of the conversion can not be granted, code would have to use direct indexing as today, with either possible bound checking costs or the use of unsafe code to avoid them.
Index
impls with Deref
Instead of the Index
impls, have only this Deref
impl:
impl<T> Deref for Cell<[T]> {
type Target = [Cell<T>];
fn deref(&self) -> &[Cell<T>] {
unsafe {
&*(self as *const Cell<[T]> as *const [Cell<T>])
}
}
}
Pro:
[T]
.Cons:
Cell<[T]> -> [Cell<T>]
conversion does not seem like a good usecase
for Deref
, since Cell<[T]>
isn't a smartpointer.&mut Cell<T>
instead of &Cell<T>
Nothing that makes the &mut T -> &Cell<T>
conversion safe would prevent
&mut T -> &mut Cell<T>
from being safe either, and the latter can be
trivially turned into a &Cell<T>
while also allowing mutable access - eg to
call Cell::as_mut()
to conversion back again.
Similar to that, there could also be a way to turn a &mut [Cell<T>]
back
into a &mut [T]
.
However, this does not seem to be actually useful since the only reason to use this API is to make use of shared internal mutability.
Instead of Cell
constructors, we could just have freestanding functions
in, say, std::cell
:
fn ref_as_cell<T>(t: &mut T) -> &Cell<T> {
unsafe {
&*(t as *mut T as *const Cell<T>)
}
}
fn cell_slice<T>(t: &Cell<[T]>) -> &[Cell<T>] {
unsafe {
&*(t as *const Cell<[T]> as *const [Cell<T>])
}
}
On the opposite spectrum, and should this feature end up being used
somewhat commonly,
we could provide the conversions by dedicated traits,
possibly in the prelude, or use the std coherence hack to implement
them directly on &mut T
and &mut [T]
:
trait AsCell {
type Cell;
fn as_cell(self) -> Self::Cell;
}
impl<'a, T> AsCell for &'a mut T {
type Cell = &'a Cell<T>;
fn as_cell(self) -> Self::Cell {
unsafe {
&*(self as *mut T as *const Cell<T>)
}
}
}
But given the issues of adding methods to pointer-like types,
this approach in general would probably be not a good idea
(See the situation with Rc
and Arc
).
None so far.