RFC 0640: Improve Debug trait (adds :#?)

libs (debugging | traits)

Summary

The Debug trait is intended to be implemented by every type and to display useful runtime information to help with debugging. This RFC proposes two additions to the fmt API: one that aids implementors of Debug, and one that aids consumers of Debug output. Specifically, the # format specifier modifier will cause Debug output to be "pretty printed", and some utility builder types will be added to the std::fmt module to make it easier to implement Debug manually.

Motivation

Pretty printing

The conventions for Debug format state that output should resemble Rust struct syntax, without added line breaks. This can make output difficult to read in the presence of complex and deeply nested structures:

HashMap { "foo": ComplexType { thing: Some(BufferedReader { reader: FileStream { path: "/home/sfackler/rust/README.md", mode: R }, buffer: 1013/65536 }), other_thing: 100 }, "bar": ComplexType { thing: Some(BufferedReader { reader: FileStream { path: "/tmp/foobar", mode: R }, buffer: 0/65536 }), other_thing: 0 } }

This can be made more readable by adding appropriate indentation:

HashMap {
    "foo": ComplexType {
        thing: Some(
            BufferedReader {
                reader: FileStream {
                    path: "/home/sfackler/rust/README.md",
                    mode: R
                },
                buffer: 1013/65536
            }
        ),
        other_thing: 100
    },
    "bar": ComplexType {
        thing: Some(
            BufferedReader {
                reader: FileStream {
                    path: "/tmp/foobar",
                    mode: R
                },
                buffer: 0/65536
            }
        ),
        other_thing: 0
    }
}

However, we wouldn't want this "pretty printed" version to be used by default, since it's significantly more verbose.

Helper types

For many Rust types, a Debug implementation can be automatically generated by #[derive(Debug)]. However, many encapsulated types cannot use the derived implementation. For example, the types in std::io::buffered all have manual Debug impls. They all maintain a byte buffer that is both extremely large (64k by default) and full of uninitialized memory. Printing it in the Debug impl would be a terrible idea. Instead, the implementation prints the size of the buffer as well as how much data is in it at the moment: https://github.com/rust-lang/rust/blob/0aec4db1c09574da2f30e3844de6d252d79d4939/src/libstd/io/buffered.rs#L48-L60

pub struct BufferedStream<S> {
    inner: BufferedReader<InternalBufferedWriter<S>>
}

impl<S> fmt::Debug for BufferedStream<S> where S: fmt::Debug {
    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
        let reader = &self.inner;
        let writer = &self.inner.inner.0;
        write!(fmt, "BufferedStream {{ stream: {:?}, write_buffer: {}/{}, read_buffer: {}/{} }}",
               writer.inner,
               writer.pos, writer.buf.len(),
               reader.cap - reader.pos, reader.buf.len())
    }
}

A purely manual implementation is tedious to write and error-prone. These difficulties become even more pronounced with the introduction of the "pretty printed" format described above. If Debug is too painful to implement manually, library developers will create poor implementations or omit them entirely. Some simple structures that automatically produce the correct output format can make these implementations significantly easier:

impl<S> fmt::Debug for BufferedStream<S> where S: fmt::Debug {
    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
        let reader = &self.inner;
        let writer = &self.inner.inner.0;
        fmt.debug_struct("BufferedStream")
            .field("stream", &writer.inner)
            .field("write_buffer", &format_args!("{}/{}", writer.pos, writer.buf.len()))
            .field("read_buffer", &format_args!("{}/{}", reader.cap - reader.pos, reader.buf.len()))
            .finish()
    }
}
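
For readers who want something they can compile, here is a minimal, self-contained sketch of the same builder pattern. The Reader type and its fields are hypothetical stand-ins for the buffered types discussed above, and the example assumes the debug_struct builder as it is available in current std, which may differ in detail from the API proposed in this RFC.

```rust
use std::fmt;

// Hypothetical type for illustration: a reader with a large internal buffer
// that should not be dumped byte-by-byte in Debug output.
struct Reader {
    path: String,
    buf: Vec<u8>,
    pos: usize,
    cap: usize,
}

impl fmt::Debug for Reader {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The builder takes care of braces, separators and the # flag.
        f.debug_struct("Reader")
            .field("path", &self.path)
            // Summarize the buffer as "bytes available / total size".
            .field(
                "buffer",
                &format_args!("{}/{}", self.cap - self.pos, self.buf.len()),
            )
            .finish()
    }
}

fn main() {
    let r = Reader {
        path: "/tmp/foobar".to_string(),
        buf: vec![0; 65536],
        pos: 0,
        cap: 1013,
    };
    println!("{:?}", r); // single-line form
    println!("{:#?}", r); // indented, multi-line form
}
```

The implementation never mentions the # flag; the builder picks the layout, so one fmt body serves both formats.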

Detailed design

Pretty printing

The # modifier (e.g. {:#?}) will be interpreted by Debug implementations as a request for "pretty printed" output:

[
    "a",
    "b",
    "c"
]
HashSet {
    "a",
    "b",
    "c"
}
HashMap {
    "a": 1,
    "b": 2,
    "c": 3
}
Foo {
    field1: "hi",
    field2: 10,
    field3: false
}
Foo(
    "hi",
    10,
    false
)

In all cases, pretty printed and non-pretty printed output should differ only in the addition of newlines and whitespace.
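
As a rough illustration of the two modes, the following sketch formats a derived Debug impl with both {:?} and {:#?}. The struct Foo mirrors the example above. The exact pretty-printed text produced by a given compiler release may differ in small details from the listings in this RFC (current releases, for instance, emit a trailing comma after the last field), so the comments do not promise exact output.

```rust
// Fields are only read through the derived Debug impl, so silence the
// dead-code lint for this illustration.
#[allow(dead_code)]
#[derive(Debug)]
struct Foo {
    field1: &'static str,
    field2: i32,
    field3: bool,
}

fn main() {
    let foo = Foo { field1: "hi", field2: 10, field3: false };

    // Default Debug output: everything on a single line.
    let compact = format!("{:?}", foo);
    // The # modifier requests the indented, multi-line form.
    let pretty = format!("{:#?}", foo);

    println!("{}", compact);
    println!("{}", pretty);
}
```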

Helper types

Types will be added to std::fmt corresponding to each of the common Debug output formats. They will provide a builder-like API that produces correctly formatted output, respecting the # flag as needed. A full implementation can be found at https://gist.github.com/sfackler/6d6610c5d9e271146d11. (Note that there's a lot of almost-but-not-quite duplicated code in the various impls; it can probably be cleaned up a bit.)

For convenience, methods that create these builders will be added to Formatter. An example of using the debug_struct method is shown in the Motivation section. In addition, the padded method returns a type implementing fmt::Writer that pads input passed to it. It is used inside the other builders, but is also exposed for Debug implementations that require formats the other helpers do not provide.

impl Formatter {
    pub fn debug_struct<'a>(&'a mut self, name: &str) -> DebugStruct<'a> { ... }
    pub fn debug_tuple<'a>(&'a mut self, name: &str) -> DebugTuple<'a> { ... }
    pub fn debug_set<'a>(&'a mut self, name: &str) -> DebugSet<'a> { ... }
    pub fn debug_map<'a>(&'a mut self, name: &str) -> DebugMap<'a> { ... }

    pub fn padded<'a>(&'a mut self) -> PaddedWriter<'a> { ... }
}
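
The sketch below shows how the tuple and map builders might be used. It is written against the std API as it exists today, which differs from the signatures listed above: debug_set and debug_map currently take no name argument. The Pair and Table types are hypothetical and exist only for this illustration.

```rust
use std::fmt;

// Hypothetical tuple struct.
struct Pair(i32, &'static str);

impl fmt::Debug for Pair {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Fields are added positionally, in order.
        f.debug_tuple("Pair").field(&self.0).field(&self.1).finish()
    }
}

// Hypothetical association list that should print like a map.
struct Table(Vec<(&'static str, i32)>);

impl fmt::Debug for Table {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // entries() accepts any iterator of (key, value) pairs whose
        // items implement Debug.
        f.debug_map()
            .entries(self.0.iter().map(|(k, v)| (k, v)))
            .finish()
    }
}

fn main() {
    println!("{:?}", Pair(1, "x"));
    println!("{:#?}", Table(vec![("a", 1), ("b", 2)]));
}
```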

Drawbacks

The use of the # modifier adds complexity to Debug implementations.

The builder types add extra #[stable] surface area to the standard library that will have to be maintained.

Alternatives

We could take the helper structs alone without the pretty printing format. They're still useful even if a library author doesn't have to worry about the second format.

Unresolved questions

The indentation level is currently hardcoded to 4 spaces. We could allow it to be configured by using the width or precision specifiers. For example, {:#2?} would pretty print with a 2-space indent. It's not totally clear to me that this provides enough value to justify the extra complexity.
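
To make the idea concrete, here is a hypothetical sketch of a hand-written Debug impl that treats the width specifier as the indent size. It is not part of the proposal and is not how the builders behave; it only shows that the information is reachable through Formatter::alternate and Formatter::width. The Point type is invented for the example, and the specifier is written as {:#2?} because the format string grammar places the # flag before the width.

```rust
use std::fmt;

// Hypothetical type used only to illustrate a configurable indent.
struct Point {
    x: i32,
    y: i32,
}

impl fmt::Debug for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if f.alternate() {
            // Assumption for this sketch: the width, if given, is the
            // indent size; otherwise fall back to 4 spaces.
            let pad = " ".repeat(f.width().unwrap_or(4));
            write!(
                f,
                "Point {{\n{pad}x: {},\n{pad}y: {}\n}}",
                self.x,
                self.y,
                pad = pad
            )
        } else {
            write!(f, "Point {{ x: {}, y: {} }}", self.x, self.y)
        }
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:#?}", p); // 4-space indent
    println!("{:#2?}", p); // 2-space indent
}
```

A real design would also have to decide how the chosen indent propagates to nested values, which is part of the extra complexity mentioned above.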