RFC 2504: fix-error

libs (traits | error-handling)

Summary

Change the std::error::Error trait to improve its usability. Introduce a backtrace module to the standard library to provide a standard interface for backtraces.

Motivation

The Error trait has long been known to be flawed in several respects. In particular:

  1. The required description method is limited, usually, to returning static strings, and has little utility that isn't adequately served by the required Display impl for the error type.
  2. The signature of the cause method does not allow the user to downcast the cause type, limiting the utility of that method unnecessarily.
  3. It provides no standard API for errors that contain backtraces (as some users' errors do) to expose their backtraces to end users.

We propose to fix this by deprecating the existing methods of Error and adding two new, provided methods. As a result, the undeprecated portion of the Error trait would look like this:

trait Error: Display + Debug {
    fn backtrace(&self) -> Option<&Backtrace> {
        None
    }

    fn source(&self) -> Option<&dyn Error + 'static> {
        None
    }
}

Guide-level explanation

The new API of the Error trait

The new API provides three main components:

  1. The Display and Debug impls, for printing error messages. Ideally, the Display API would be focused on end user messages, whereas the Debug impl would contain information relevant to the programmer.
  2. The backtrace method. If the Error contains a backtrace, it should be exposed through this method. Errors are not required to contain a backtrace, and are not expected to.
  3. The source method. This returns another Error type, which is the underlying source of this error. If this error has no underlying source (that is, it is the "root source" of the error), this method should return none.

The backtrace API

This RFC adds a new backtrace module to std, with one type, with this API:

pub struct Backtrace {
    // ...
}

impl Backtrace {
    // Capture the backtrace for the current stack if it is supported on this
    // platform.
    //
    // This will respect backtrace controlling environment variables.
    pub fn capture() -> Backtrace {
        // ...
    }

    // Capture the backtrace for the current stack if it is supported on this
    // platform.
    //
    // This will ignore backtrace controlling environment variables.
    pub fn force_capture() -> Backtrace {
        // ...
    }

    pub fn status(&self) -> BacktraceStatus {
        // ...
    }
}

impl Display for Backtrace {
    // ...
}

impl Debug for Backtrace {
    // ...
}

#[non_exhaustive]
pub enum BacktraceStatus {
    Unsupported,
    Disabled,
    Captured
}

This minimal initial API is just intended for printing backtraces for end users. In time, this may grow the ability to visit individual frames of the backtrace.

Backtrace controlling environment variables

Today, the RUST_BACKTRACE controls backtraces generated by panics. After this RFC, it also controls backtraces generated in the standard library: no backtrace will be generated when calling Backtrace::capture unless this variable is set. On the other hand, Backtrace::force_capture will ignore this variable.

Two additional variables will be added: RUST_PANIC_BACKTRACE and RUST_LIB_BACKTRACE: these will independently override the behavior of RUST_BACKTRACE for backtraces generated for panics and from the std API.

The transition plan

Deprecating both cause and description is a backward compatible change, and adding provided methods backtrace and source is also backward compatible. We can make these changes unceremoniously, and the Error trait will be much more functional.

We also change the default definition of cause, even though it is deprecated:

fn cause(&self) -> Option<&dyn Error> {
    self.source()
}

This way, if an error defines source, someone calling the deprecated cause API will still get the correct cause type, even though they can't downcast it.

Stability

The addition of source and the deprecation of cause will be instantly stabilized after implementing this RFC.

The addition of the backtrace method and the entire backtrace API will be left unstable under the backtrace feature for now.

Reference-level explanation

Why cause -> source

The problem with the existing cause API is that the error it returns is not 'static. This means it is not possible to downcast the error trait object, because downcasting can only be done on 'static trait objects (for soundness reasons).

Note on backtraces

The behavior of backtraces is somewhat platform specific, and on certain platforms backtraces may contain strange and inaccurate information. The backtraces provided by the standard library are designed for user display purposes only, and not guaranteed to provide a perfect representation of the program state, depending on the capabilities of the platform.

How this impacts failure

The failure crate defines a Fail trait with an API similar to (but not exactly like) the API proposed here. In a breaking change to failure, we would change that trait to be an extension trait to Error:

// Maybe rename to ErrorExt?
trait Fail: Error + Send + Sync + 'static {
    // various provided helper methods
}

impl<E: Error + Send + Sync + 'static> Fail for E {

}

Instead of providing a derive for Fail, failure would provide a derive for the std library Error trait, e.g.:

#[derive(Debug, Display, Error)]
#[display = "My display message."]
struct MyError {
    #[error(source)]
    underlying: io::Error,
    backtrace: Backtrace,
}

The exact nature of the new failure API would be determined by the maintainers of failure, it would not be proscribed by this RFC. This section is just to demonstrate that failure could still work using the std Error trait.

Drawbacks

This causes some churn, as users who are already using one of the deprecated methods will be encouraged (by warnings) to change their code, and library authors will need to revisit whether they should override one of the new methods.

Rationale and alternatives

Provide a new error trait

The most obvious alternative to this RFC would be to provide an entirely new error trait. This could make deeper changes to the API of the trait than we are making here. For example, we could take an approach like failure has, and impose stricter bounds on all implementations of the trait:

trait Fail: Display + Debug + Send + Sync + 'static {
    fn cause(&self) -> Option<&dyn Fail> {
        None
    }

    fn backtrace(&self) -> Option<&Backtrace> {
        None
    }
}

Doing this would allow us to assemble a more perfect error trait, rather than limiting us to the changes we can make backwards compatibly to the existing trait.

However, it would be much more disruptive to the ecosystem than changing the existing trait. We've already seen some friction with failure and other APIs (like serde's) that expect to receive something that implements Error. Right now, we reason that the churn is not worth slight improvements that wouldn't be compatible with the Error trait as it exists.

In the future, if these changes are not enough to resolve the warts with the Error trait, we could follow this alternative: we would deprecate the Error trait and introduce a new trait then. That is, accepting this RFC now does not foreclose on this alternative later.

Bikeshedding the name of source

The replacement for cause could have another name. The only one the RFC author came up with is origin.

Prior art

This proposal is largely informed by our experience with the existing Error trait API, and from the experience of the failure experiment. The main conclusions we drew:

  1. The current Error trait has serious flaws.
  2. The Fail trait in failure has a better API.
  3. Having multiple error traits in the ecosystem (e.g. both Error and Fail) can be very disruptive.

This RFC threads the needle between the problem with the Error trait and the problems caused by defining a new trait.

Unresolved questions

Backtrace API

This RFC intentionally proposes a most minimal API. There are a number of API extensions we could consider in the future. Prominent examples:

  1. Extending the backtrace API to allow programmatic iteration of backtrace frames and so on.
  2. Providing derives for traits like Display and Error in the standard library.
  3. Providing helper methods on Error that have been experimented with in failure, such as the causes iterator.

None of these are proposed as a part of this RFC, and would have a future RFC discussion.

Additionally, the choice to implement nullability internal to backtrace may prove to be a mistake: during the period when backtrace APIs are only available on nightly, we will gain more experience and possible change backtrace's constructors to return an Option<Backtrace> instead.