libs (repr | machine)
Deprecate type aliases and structs in std::os::$platform::raw
in favor of
trait-based accessors which return Rust types rather than the equivalent C type
aliases.
RFC 517 set forth a vision for the raw
modules in the standard
library to perform lowering operations on various Rust types to their platform
equivalents. For example the fs::Metadata
structure can be lowered to the
underlying sys::stat
structure. The rationale for this was to enable building
abstractions externally from the standard library by exposing all of the
underlying data that is obtained from the OS.
This strategy, however, runs into a few problems:
stat
, there's not actually one canonical
definition. For example on 32-bit Linux the definition of stat
will change
depending on whether LFS is enabled (via the -D_FILE_OFFSET_BITS
macro). This means that if std is advertises these raw
types as being "FFI
compatible with libc", it's not actually correct in all circumstances!&stat
from
&fs::Metadata
) makes it difficult to change the
implementation over time. Today the 32-bit Linux standard library doesn't
use LFS functions, so files over 4GB cannot be opened. Changing
this, however, would involve changing the stat
structure and may be difficult to do.raw
module attempt to return the libc
aliased type
on all platforms, for example DirEntryExt::ino
returns a type of
ino_t
. The ino_t
type is billed as being FFI compatible with the libc
ino_t
type, but not all platforms store the d_ino
field in dirent
with
the ino_t
type. For example on Android the definition of
ino_t
is u32
but the actual stored value is
u64
. This means that on Android we're actually silently
truncating the return value!Over time it's basically turned out that exporting the somewhat-messy details of
libc has gotten a little messy in the standard library as well. Exporting this
functionality (e.g. being able to access all of the fields), is quite useful
however! This RFC proposes tweaking the design of the extensions in
std::os::*::raw
to allow the same level of information exposure that happens
today but also cut some of the tie from libc to std to give us more freedom to
change these implementation details and work around weird platforms.
First, the types and type aliases in std::os::*::raw
will all be
deprecated. For example stat
, ino_t
, dev_t
, mode_t
, etc, will all be
deprecated (in favor of their definitions in the libc
crate). Note that the C
integer types, c_int
and friends, will not be deprecated.
Next, all existing extension traits will cease to return platform specific type
aliases (such as the DirEntryExt::ino
function). Instead they will return
u64
across the board unless it's 100% known for sure that fewer bits will
suffice. This will improve consistency across platforms as well as avoid
truncation problems such as those Android is experiencing. Furthermore this
frees std from dealing with any odd FFI compatibility issues, punting that to
the libc crate itself it the values are handed back into C.
The std::os::*::fs::MetadataExt
will have its as_raw_stat
method deprecated,
and it will instead grow functions to access all the associated fields of the
underlying stat
structure. This means that there will now be a
trait-per-platform to expose all this information. Also note that all the
methods will likely return u64
in accordance with the above modification.
With these modifications to what std::os::*::raw
includes and how it's
defined, it should be easy to tweak existing implementations and ensure values
are transmitted in a lossless fashion. The changes, however, are both breaking
changes and don't immediately enable fixing bugs like using LFS on Linux:
let a: ino_t = entry.ino()
would break as the ino()
function
will return u64
, but the definition of ino_t
may not be u64
for all
platforms.stat
structure itself on 32-bit Linux still uses 32-bit fields (e.g. it
doesn't mirror stat64
in libc).To help with these issues, more extensive modifications can be made to the
platform specific modules. All type aliases can be switched over to u64
and
the stat
structure could simply be redefined to stat64
on Linux (minus
keeping the same name). This would, however, explicitly mean that
std::os::raw is no longer FFI compatible with C.
This breakage can be clearly indicated in the deprecation messages, however.
Additionally, this fits within std's breaking changes policy as
a local as
cast should be all that's needed to patch code that breaks to
straddle versions of Rust.
As mentioned above, this RFC is strictly-speaking a breaking change. It is expected that not much code will break, but currently there is no data supporting this.
Returning u64
across the board could be confusing in some circumstances as it
may wildly differ both in terms of signedness as well as size from the
underlying C type. Converting it back to the appropriate type runs the risk of
being onerous, but accessing these raw fields in theory happens quite rarely as
std should primarily be exporting cross-platform accessors for the various
fields here and there.
The documentation of the raw modules in std could be modified to indicate that
the types contained within are intentionally not FFI compatible, and the same
structure could be preserved today with the types all being rewritten to what
they would be anyway if this RFC were implemented. For example ino_t
on
Android would change to u64
and stat
on 32-bit Linux would change to
stat64
. In doing this, however, it's not clear why we'd keep around all the
C namings and structure.
Instead of breaking existing functionality, new accessors and types could be
added to acquire the "lossless" version of a type. For example we could add a
ino64
function on DirEntryExt
which returns a u64
, and for stat
we
could add as_raw_stat64
. This would, however, force Metadata
to store two
different stat
structures, and the breakage in practice this will cause may
be small enough to not warrant these great lengths.
u64
too strict? Should types like
mode_t
be allowed as i32
explicitly? Should the sign at least attempt to
always be preserved?