libs (fmt)
Stabilize the std::fmt module, in addition to the related macros and
formatting language syntax.
This RFC is primarily motivated by the need to stabilize std::fmt
. In the past
stabilization has not required RFCs, but the changes envisioned for this module
are far-reaching and modify some parts of the language (format syntax), leading
to the conclusion that this stabilization effort required an RFC.
The std::fmt
module encompasses more than just the actual
structs/traits/functions/etc defined within it, but also a number of macros and
the formatting language syntax for describing format strings. Each of these
features of the module will be described in turn.
The documented syntax will not be changing as-written. All of these features will be accepted wholesale (considered stable):
- {} for "format something here" placeholders
- {{ as an escape for { (and vice-versa for })
- Alignment: left (<), center (^), and right (>)
- Sign to print (+ or -)
- Alternate formatting (#)
- Leading zeroes (0)
- Positional arguments ({0})
- Named arguments ({foo})

While occasionally quite useful, there is no static guarantee that any implementation of a formatting trait actually respects the format specifiers passed in. For example, this code does not necessarily work as expected:
#[deriving(Show)]
struct A;
format!("{:10}", A);
All of the primitives for rust (strings, integers, etc) have implementations of
Show
which respect these formatting flags, but almost no other implementations
do (notably those generated via deriving
).
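As a rough illustration of how the primitives honor these flags, the sketch below runs each one against a string or integer. It assumes a present-day Rust toolchain, whose format! macro accepts the same placeholders, rather than the compiler this RFC was written against; flag_examples is a made-up helper name.

```rust
// Assumption: built with a present-day Rust toolchain. `flag_examples` is a
// hypothetical helper for this illustration, not part of std::fmt.
fn flag_examples() -> Vec<String> {
    vec![
        format!("{}", 7),              // plain placeholder
        format!("{{}}"),               // escaped braces
        format!("[{:<5}]", "ab"),      // left alignment, width 5
        format!("[{:^5}]", "ab"),      // center alignment, width 5
        format!("[{:>5}]", "ab"),      // right alignment, width 5
        format!("{:+}", 7),            // always print the sign
        format!("{:#x}", 255),         // alternate form adds the 0x prefix
        format!("{:05}", 42),          // leading zeroes up to width 5
        format!("{0}-{0}", "a"),       // positional argument used twice
        format!("{foo}", foo = "bar"), // named argument
    ]
}

fn main() {
    for line in flag_examples() {
        println!("{}", line);
    }
}
```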
This RFC proposes stabilizing the formatting flags, despite this current state of affairs. There are in theory possible alternatives in which there is a static guarantee that a type does indeed respect format specifiers when one is provided, generating a compile-time error when a type doesn't respect a specifier. These alternatives, however, appear to be too heavyweight and are considered somewhat overkill.
In general it's trivial to respect format specifiers if an implementation
delegates to a primitive or somehow has a buffer of what's to be formatted. To
cover these two use cases, the Formatter
structure passed around has helper
methods to assist in formatting these situations. It is, however, quite rare
to fall into one of these two buckets, so the specifiers are largely ignored
(and the formatter is written to directly with write!).
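The difference between the three situations can be sketched as follows. This is an illustration under assumptions, not the RFC's code: it uses present-day Rust, where Display stands in for the RFC-era Show, and the types Ignores, Respects, and Delegates are invented for the example. Formatter::pad is the helper for the "has a buffer" case.

```rust
use std::fmt;

// Hypothetical types for illustration only.
struct Ignores;
struct Respects;
struct Delegates(u32);

impl fmt::Display for Ignores {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Writing straight to the formatter bypasses width, fill, and alignment.
        write!(f, "A")
    }
}

impl fmt::Display for Respects {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `pad` applies the caller's width, fill, and alignment to a buffer.
        f.pad("A")
    }
}

impl fmt::Display for Delegates {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegating to a primitive inherits its handling of the flags.
        fmt::Display::fmt(&self.0, f)
    }
}

fn main() {
    println!("[{:5}]", Ignores);       // width is silently ignored
    println!("[{:5}]", Respects);      // padded to five columns
    println!("[{:5}]", Delegates(42)); // padded by the integer implementation
}
```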
Currently Rust does not support named arguments anywhere except for format strings. Format strings can get away with it because they're all part of a macro invocation (unlike the rest of Rust syntax).
The worry for stabilizing a named argument syntax for the formatting language is that if Rust ever adopts named arguments with a different syntax, it would be quite odd having two systems.
The most recently proposed keyword argument
RFC used :
for the invocation
syntax rather than =
as formatting does today. Additionally, today foo = bar
is a valid expression, having a value of type ()
.
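Both halves of this concern can be seen in a short sketch, assuming a present-day Rust toolchain (the function names are made up for the example): the macro accepts name = value pairs, while outside the macro the same shape is an assignment expression whose value is ().

```rust
// Hypothetical helpers for illustration.
fn named() -> String {
    // Inside the macro, `name = value` binds a named format argument.
    format!("{greeting}, {name}!", greeting = "Hello", name = "world")
}

fn assignment_value() -> (i32, ()) {
    let mut foo = 0;
    let bar = 5;
    // Outside the macro, `foo = bar` is an ordinary expression of type ().
    let unit: () = { foo = bar };
    (foo, unit)
}

fn main() {
    println!("{}", named());
    println!("{:?}", assignment_value());
}
```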
With these worries, there are two routes that could be pursued:

1. The expr = expr syntax could be disallowed on the language level. This could happen either in a total fashion or only where the expression appears as a function argument. In both cases, this will probably be considered a "wart" of Rust's grammar.
2. The foo = bar syntax could be allowed in the macro with prior knowledge that the default argument syntax for Rust, if one is ever developed, will likely be different. This would mean that the foo = bar syntax in formatting macros will likely be considered a wart in the future.

Given these two cases, the clear choice seems to be accepting a wart in the formatting macros themselves. It will likely be possible to extend the macro in the future to support whatever named argument syntax is developed as well, and the old syntax could be accepted for some time.
Today there are 16 formatting traits. Each trait represents a "type" of
formatting, corresponding to the [type]
production in the formatting syntax.
As a bit of history, the original intent was for each trait to declare what
specifier it used, allowing users to add more specifiers in newer crates. For
example the time
crate could provide the {:time}
formatting trait. This
design was seen as too complicated, however, so it was not landed. It does,
however, partly motivate why there is one trait per format specifier today.
The 16 formatting traits and their format specifiers are:
- (no specifier) ⇒ Show
- d ⇒ Signed
- i ⇒ Signed
- u ⇒ Unsigned
- b ⇒ Bool
- c ⇒ Char
- o ⇒ Octal
- x ⇒ LowerHex
- X ⇒ UpperHex
- s ⇒ String
- p ⇒ Pointer
- t ⇒ Binary
- f ⇒ Float
- e ⇒ LowerExp
- E ⇒ UpperExp
- ? ⇒ Poly
This RFC proposes removing the following traits:
Signed
Unsigned
Bool
Char
String
Float
Note that this RFC would like to remove Poly
, but that is covered by a
separate RFC.
Today by far the most common formatting trait is Show
, and over time the
usefulness of these formatting traits has been reduced. The traits this RFC
proposes to remove are only assertions that the type provided actually
implements the trait; there are few known implementations of the traits which
diverge in how they are implemented.
Additionally, there are two oddities inherited from ancient C:

- Both d and i are wired to Signed.
- One might reasonably expect the Binary trait to use b as its specifier.

The remaining traits this RFC recommends leaving. The rationale for this is that they represent alternate representations of primitive types in general, and are also quite often expected when coming from other format syntaxes such as C/Python/Ruby/etc.
It would, of course, be possible to re-add any of these traits in a backwards-compatible fashion.
Binary
With the removal of the Bool
trait, this RFC recommends renaming the specifier
for Binary
to b
instead of t
.
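For reference, the sketch below shows the radix and exponent specifiers with b selecting Binary. It assumes a present-day Rust toolchain, where this spelling is the one in use; the helper names are invented for the example.

```rust
// Hypothetical helpers; the specifiers themselves are what is being shown.
fn radix_specifiers(n: u32) -> [String; 4] {
    [
        format!("{:b}", n), // Binary
        format!("{:o}", n), // Octal
        format!("{:x}", n), // LowerHex
        format!("{:X}", n), // UpperHex
    ]
}

fn exp_specifiers(x: f64) -> [String; 2] {
    [
        format!("{:e}", x), // LowerExp
        format!("{:E}", x), // UpperExp
    ]
}

fn main() {
    println!("{:?}", radix_specifiers(255));
    println!("{:?}", exp_specifiers(1500.0));
}
```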
A possible alternative to having many traits is to instead have one trait, such as:
pub trait Show {
fn fmt(...);
fn hex(...) { fmt(...) }
fn lower_hex(...) { fmt(...) }
...
}
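To make the shape of this alternative concrete, here is a minimal sketch in present-day Rust. It is not an API that exists in std::fmt: the trait ShowAll, the adapter Hex, and the types Id and Name are all invented for the illustration, and only one alternate method is shown.

```rust
use std::fmt;

// Hypothetical single-trait design; nothing here exists in std::fmt.
trait ShowAll {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result;

    // Alternate representations fall back to `fmt` unless overridden.
    fn lower_hex(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.fmt(f)
    }
}

struct Id(u32);
struct Name(&'static str);

impl ShowAll for Id {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
    // Id overrides the hexadecimal representation.
    fn lower_hex(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:x}", self.0)
    }
}

impl ShowAll for Name {
    // Name only provides `fmt`, so `lower_hex` uses the default.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

// Adapter standing in for a runtime that picks the method for `:x`.
struct Hex<'a>(&'a dyn ShowAll);

impl fmt::Display for Hex<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.lower_hex(f)
    }
}

fn main() {
    println!("{} {}", Hex(&Id(255)), Hex(&Name("x")));
}
```

Note that Name accepts a hexadecimal request without complaint, which is the loss of static checking that the single-trait design implies.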
There are a number of pros to this design:

- Format arguments could be passed as trait objects of type &Show, and if the format string supplied :x or :o, the runtime would simply delegate to the relevant trait method.

There are also a number of cons to this design, which motivate this RFC's recommendation to keep these traits separate:

- The Show trait becomes somewhat overwhelming because it's no longer immediately clear which method should be overridden for what.

Currently, each formatting trait has a signature as follows:
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result;
This implies that all formatting is considered to be a stream-oriented operation
where f
is a sink to write bytes to. The fmt::Result
type indicates that
some form of "write error" happened, but conveys no extra information.
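A small sketch of what implementing this signature looks like, assuming present-day Rust (where Display stands in for Show and the formatter carries a lifetime parameter); the Rgb type is made up for the example. Each write streams into the sink, and a failure is passed back to the caller unchanged.

```rust
use std::fmt;

// Hypothetical type for illustration.
struct Rgb(u8, u8, u8);

impl fmt::Display for Rgb {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Pieces are streamed into the sink; `?` forwards any write error.
        write!(f, "rgb(")?;
        write!(f, "{}, {}, {}", self.0, self.1, self.2)?;
        write!(f, ")")
    }
}

// A separate trait selects the alternate representation for `{:x}`.
impl fmt::LowerHex for Rgb {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:02x}{:02x}{:02x}", self.0, self.1, self.2)
    }
}

fn main() {
    let color = Rgb(255, 0, 16);
    println!("{}", color);
    println!("{:x}", color);
}
```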
This API has a number of oddities:

- The Formatter has inherent write and write_fmt methods which, when used in conjunction with the write! macro, return an instance of fmt::Result.
- The Formatter type also implements the std::io::Writer trait in order to be able to pass around a &mut Writer.
- The inherent write_fmt method must trump the Writer's write_fmt method in order to return an error of the correct type.
- The error type of the Result return type, FormatError, is an enumeration with precisely one variant.

Overall, this signature seems to be appropriate in terms of "give me a sink of
bytes to write myself to, and let me return an error if one happens". Due to
this, this RFC recommends that all formatting traits be marked #[unstable]
.
There are a number of prelude macros which interact with the format syntax:
format_args
format_args_method
write
writeln
print
println
format
fail
assert
debug_assert
All of these are macro_rules!
-defined macros, except for format_args
and
format_args_method
.
All of these macros take some form of prefix, while the trailing suffix is
always some instantiation of the formatting syntax. The suffix portion is
recommended to be considered #[stable]
, and the sections below will discuss
each macro in detail with respect to its prefix and semantics.
The fundamental purpose of format_args! is to generate a value of type
&fmt::Arguments
which represents a pending format computation. This structure
can then be passed at some point to the methods in std::fmt
to actually
perform the format.
The prefix of this macro is some "callable thing", be it a top-level function or
a closure. It cannot invoke a method because foo.bar
is not a "callable thing"
to call the bar
method on foo
.
Ideally, this macro would have no prefix, and would be callable like:
use std::fmt;
let args = format_args!("Hello {}!", "world");
let hello_world = fmt::format(args);
Unfortunately, without an implementation of RFC 31 this is not
possible. As a result, this RFC proposes a #[stable]
consideration of this
macro and its syntax.
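For comparison, the prefix-free form can be written in present-day Rust. This sketch assumes a current toolchain, where fmt::format takes the Arguments value directly rather than a reference, so it differs slightly from the snippet above; hello is a made-up function name.

```rust
use std::fmt;

// Hypothetical helper showing format_args! without a callable prefix.
fn hello() -> String {
    // The pending format computation is handed straight to fmt::format.
    fmt::format(format_args!("Hello {}!", "world"))
}

fn main() {
    println!("{}", hello());
}
```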
The purpose of format_args_method! is to solve the "call this method" case not covered
by the format_args
macro. This macro was introduced fairly late in the game
to solve the problem that &*trait_object
was not allowed. This is currently
allowed, however (due to DST).
This RFC proposes immediately removing this macro. The primary user of this
macro is write!
, meaning that the following code, which compiles today, would
need to be rewritten:
let mut output = std::io::stdout();
// note the lack of `&mut` in front
write!(output, "hello {}", "world");
The write!
macro would be redefined as:
macro_rules! write(
($dst:expr, $($arg:tt)*) => ({
let dst = &mut *$dst;
format_args!(|args| { dst.write_fmt(args) }, $($arg)*)
})
)
The purpose here is to borrow $dst
outside of the closure to ensure that the
closure doesn't borrow too many of its contents. Otherwise, code such as this
would be disallowed
write!(&mut my_struct.writer, "{}", my_struct.some_other_field);
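The borrowing pattern in question can be tried out in present-day Rust, as sketched below. The assumptions are a current toolchain, a String as the writer (via the std::fmt::Write trait), and a made-up MyStruct definition; the write! macro there is not defined with the closure shown above, but the call site is the same.

```rust
use std::fmt::Write;

// Hypothetical struct matching the field names used in the text.
struct MyStruct {
    writer: String,
    some_other_field: u32,
}

fn write_field() -> String {
    let mut my_struct = MyStruct {
        writer: String::new(),
        some_other_field: 42,
    };
    // Only `writer` is borrowed mutably, so reading the other field is allowed.
    write!(&mut my_struct.writer, "{}", my_struct.some_other_field).unwrap();
    my_struct.writer
}

fn main() {
    println!("{}", write_field());
}
```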
The write! and writeln! macros take the prefix of "some pointer to a writer" as an argument,
and then format data into the writer (returning whatever write_fmt
returns).
These macros were originally designed to require a &mut T
as the first
argument, but today, due to the usage of format_args_method
, they can take any
T
which responds to write_fmt
.
This RFC recommends marking these two macros #[stable]
with the modification
above (removing format_args_method
). The ln
suffix to writeln
will be
discussed shortly.
The print! and println! macros take no prefix, and semantically print to a task-local stdout stream. The purpose of a task-local stream is to provide some form of buffering to make stdout printing reasonably performant.
This RFC recommends marking these two macros #[stable].
The ln suffix

The name println
is one of the few locations in Rust where a short C-like
abbreviation is accepted rather than the more verbose, but clear, print_line
(for example). Due to the overwhelming precedent of other languages (even Java
uses println
!), this is seen as an acceptable special case to the rule.
The format! macro takes no prefix and returns a String.
In ancient Rust this macro was called by its shorter name, fmt. Additionally, the
name format
is somewhat inconsistent with the module name of fmt
. Despite
this, this RFC recommends considering this macro #[stable]
due to its
delegation to the format
method in the std::fmt
module, similar to how the
write!
macro delegates to fmt::write.
The format string portions of the fail!, assert!, and debug_assert! macros are
recommended to be considered #[stable] as part of this RFC. The actual stability
of the macros themselves is not considered as part of this RFC.
There are a number of freestanding
functions to consider in
the std::fmt
module for stabilization.
fn format(args: &Arguments) -> String
This RFC recommends #[experimental]
. This method is largely an
implementation detail of this module, and should instead be used via:
let args: &fmt::Arguments = ...;
format!("{}", args)
fn write(output: &mut FormatWriter, args: &Arguments) -> Result
This is somewhat surprising in that the argument to this function is not a
Writer
, but rather a FormatWriter
. This is technically speaking due to the
core/std separation and how this function is defined in core and Writer
is
defined in std.
This RFC recommends marking this function #[experimental], as the write_fmt
method exists on Writer to perform the corresponding operation.
Consequently we may wish to remove this function in favor of the write_fmt
method on FormatWriter.
Ideally this method would be removed from the public API as it is just an
implementation detail of the write!
macro.
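The relationship between the free function and the write_fmt method can be sketched in present-day Rust. The assumptions here are that std::fmt::Write plays the role this RFC assigns to FormatWriter and that fmt::write takes the Arguments value directly; the Upper sink is invented for the example.

```rust
use std::fmt::{self, Write};

// Hypothetical sink that upper-cases everything written to it.
struct Upper(String);

impl Write for Upper {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.0.push_str(&s.to_uppercase());
        Ok(())
    }
}

fn shout() -> String {
    let mut sink = Upper(String::new());
    // The free function drives the sink with a pending format computation...
    fmt::write(&mut sink, format_args!("hello {}", "world")).unwrap();
    // ...and the provided write_fmt method performs the same operation.
    sink.write_fmt(format_args!("{}", '!')).unwrap();
    sink.0
}

fn main() {
    println!("{}", shout());
}
```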
fn radix<T>(x: T, base: u8) -> RadixFmt<T, Radix>
This function is a bit of an odd-man-out in that it is a constructor, but does
not follow the existing conventions of Type::new
. The purpose of this
function is to expose the ability to format a number for any radix. The
default format specifiers :o
, :x
, and :t
are essentially shorthands for
this function, except that the format types have specialized implementations
per radix instead of a generic implementation.
This RFC proposes that this function be considered #[unstable]
as its
location and naming are a bit questionable, but the functionality is desired.
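The idea of a generic, any-radix formatter can be sketched as below. This is not the std::fmt implementation: the radix and RadixFmt names are reused only for familiarity, the signature is simplified to u64 values, and the code assumes a present-day toolchain with Display and Formatter::pad_integral.

```rust
use std::fmt;

// Simplified stand-in for illustration; not the std::fmt types.
struct RadixFmt {
    value: u64,
    base: u32,
}

fn radix(value: u64, base: u32) -> RadixFmt {
    assert!((2..=36).contains(&base), "base must be in 2..=36");
    RadixFmt { value, base }
}

impl fmt::Display for RadixFmt {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let base = u64::from(self.base);
        let mut digits = Vec::new();
        let mut n = self.value;
        // Collect digits least-significant first; zero still yields one digit.
        loop {
            let digit = (n % base) as u32;
            digits.push(char::from_digit(digit, self.base).expect("digit < base"));
            n /= base;
            if n == 0 {
                break;
            }
        }
        let text: String = digits.into_iter().rev().collect();
        // pad_integral applies the caller's width and alignment flags.
        f.pad_integral(true, "", &text)
    }
}

fn main() {
    println!("{} {} {}", radix(255, 16), radix(5, 2), radix(0, 7));
}
```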
trait FormatWriter
This trait is currently the actual implementation strategy of formatting, and
is defined specially in libcore. It is rarely used outside of libcore. It is
recommended to be #[experimental]
.
It may be possible to move Reader
and Writer
to libcore with the
error type as an associated item, allowing the FormatWriter
trait to be
eliminated entirely. Due to this possibility, the trait will be experimental
for now as alternative solutions are explored.
struct Argument, mod rt, fn argument, fn argumentstr, fn argumentuint, Arguments::with_placeholders, Arguments::new
These are implementation details of the Arguments
structure as well as the
expansion of the format_args!
macro. It's recommended to mark these as
#[experimental]
and #[doc(hidden)]
. Ideally there would be some form of
macro-based privacy hygiene which would allow these to be truly private, but
it will likely be the case that these simply become stable and we must live
with them forever.
struct Arguments
This is a representation of a "pending format string" which can be used to
safely execute a Formatter
over it. This RFC recommends #[stable]
.
struct Formatter
An instance of this structure is passed to all formatting trait methods and contains helper
methods for respecting formatting flags. This RFC recommends #[unstable]
.
This RFC also recommends deprecating all public fields in favor of accessor methods. This should help provide future extensibility as well as preventing unnecessary mutation in the future.
enum FormatError
This enumeration has only one variant, WriteError. It is recommended to
make this a struct instead and rename it to just Error. The purpose of
this is to signal that an error has occurred as part of formatting, but it
does not provide a generic method to transmit any information other than
"an error happened", in order to maintain the ergonomics of today's usage. It's strongly
recommended that implementations of Show
and friends are infallible and only
generate an error if the underlying Formatter
returns an error itself.
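The recommended discipline can be sketched as follows, assuming present-day Rust, where fmt::Error is a unit struct and Display stands in for Show; Pair and FullSink are invented for the example. The implementation never originates an error of its own and only forwards what the sink reports.

```rust
use std::fmt::{self, Write};

// Hypothetical type whose formatting cannot fail on its own.
struct Pair(i32, i32);

impl fmt::Display for Pair {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Errors come only from the underlying sink and are forwarded by `?`.
        write!(f, "(")?;
        write!(f, "{}, {}", self.0, self.1)?;
        write!(f, ")")
    }
}

// Hypothetical sink that rejects every write.
struct FullSink;

impl Write for FullSink {
    fn write_str(&mut self, _s: &str) -> fmt::Result {
        Err(fmt::Error)
    }
}

fn outcomes() -> (String, bool, bool) {
    let mut buf = String::new();
    let good = write!(buf, "{}", Pair(1, 2));
    let mut sink = FullSink;
    let bad = write!(sink, "{}", Pair(1, 2));
    (buf, good.is_ok(), bad.is_err())
}

fn main() {
    println!("{:?}", outcomes());
}
```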
Radix
/RadixFmt
As with the radix function, this RFC recommends #[unstable] for both of these
pieces of functionality.
Today's macro system necessitates exporting many implementation details of the formatting system, which is unfortunate.
A number of alternatives were laid out in the detailed description for various aspects.