22 March 2021

RFC Reading - #2996 - async stream trait

RFCs are how changes are made to Rust (language, libraries, core tools, processes and governance, etc.). If you want to keep abreast of changes to Rust or want to deeply understand a feature of Rust, reading RFCs is a must. However, doing so can be a bit of a chore - RFCs can be pretty dense, the discussion can be long (and unfortunately sometimes heated), and the nature of GitHub PRs doesn't help (ordering discussion chronologically rather than by topic, mixing significant discussion with administrative details, etc.). So, I thought I would try blogging about RFCs in the same way I occasionally blog about research papers. I don't expect this to be regular or frequent, but hopefully it will be more than a one-off.

I'll try to objectively summarise the RFC and discussion. I'll add in some context from my own experience/research; I'll try to be objective here too but there will be some unavoidable bias. Where I express my own opinion, I'll try to always use the first person ("I believe...", "I think...", etc.).

In this post I'm reading about the (async) Stream trait, an RFC that has been de facto accepted after a somewhat long discussion.

Summary

Rendered RFC
Discussion
Author: Nell Shamrell
Date submitted: 2020-09-30
Current status: FCP complete, waiting to be merged

Introduce the Stream trait into the standard library, using the design from futures. Redirect the Stream trait definition in the futures-core crate (which is "pub-used" by the futures crate) to the standard library.

Background

A stream is a common concept in computer science. In the context of this RFC, it refers to an asynchronous iterator. In other words, it is a sequence of values which will become available at some point in the future, rather than right now (as with regular iterators). There is a Stream trait in std (unstable) and a good description of streams in the docs.

Streams are an extension to futures, and the futures libraries situation is a bit complex. There is a futures crate and also some futures modules in the standard library. Futures started off in their own crate, and after much experimentation, parts of that crate are being moved to the standard library. For backward compatibility, the moved items are re-exported ("pub-used") in the futures crate.

Contribution

This RFC proposes moving the Stream trait from the futures crate to the standard library, as has been done for the Future trait in the past. There are no changes to the trait proposed. The move itself has already taken place and you can import the new Stream trait on nightly from std::stream::Stream (docs).

Most of the API of streams is actually in the StreamExt trait, and that is not proposed to be moved. So most users of streams will still need to use the futures crate.

More from the RFC

There are already users of the Stream trait; the RFC has links.

The guide-level explanation is similar to the stream docs in the futures crate. The trait is small and has a single required method, poll_next, which is a cross between Future::poll and Iterator::next. Calling poll_next reveals whether a value from the stream is ready, the stream is finished, or the next value is pending. Similar to futures, the caller must use the poll_next method to drive the stream to completion, and the stream must arrange for its task to be woken when a value is ready.
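To make the three outcomes of poll_next concrete, here is a minimal sketch. It assumes a local Stream trait with the same shape as the one in the RFC (so it compiles without nightly or the futures crate); Counter, NoopWake, and poll_log are hypothetical names invented for this illustration.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local stand-in with the same shape as the trait proposed in the RFC.
pub trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Hypothetical stream: yields next..=limit, reporting Pending once before each value.
pub struct Counter {
    next: u32,
    limit: u32,
    stalled: bool,
}

impl Stream for Counter {
    type Item = u32;

    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.next > self.limit {
            return Poll::Ready(None); // the stream is finished
        }
        if !self.stalled {
            self.stalled = true;
            // Arrange for the task to be woken; here the value is "ready" immediately.
            cx.waker().wake_by_ref();
            return Poll::Pending; // the next value is not ready yet
        }
        self.stalled = false;
        let value = self.next;
        self.next += 1;
        Poll::Ready(Some(value)) // a value is ready
    }
}

// A waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls a Counter by hand and records what each call to poll_next returned.
pub fn poll_log(limit: u32) -> Vec<String> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut stream = Counter { next: 1, limit, stalled: false };
    let mut log = Vec::new();
    loop {
        match Pin::new(&mut stream).poll_next(&mut cx) {
            Poll::Ready(Some(v)) => log.push(format!("ready({})", v)),
            Poll::Ready(None) => {
                log.push("done".to_string());
                return log;
            }
            Poll::Pending => log.push("pending".to_string()),
        }
    }
}
```

Calling poll_log(2) walks through all three cases: pending, a ready value, and the end of the stream. A real executor would wait for the waker rather than polling again straight away.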

The RFC specifies four utility impls for the Stream trait.

An alternative presented in the RFC is to have an async next function, rather than a poll_next. However, since async methods in traits are unsupported, this alternative is not explored in depth.

Discussion

It was pointed out that there is a backward compatibility issue for the futures crate. If a user uses a new standard library and an old futures crate, there will be two incompatible Stream traits in their program.

There is some discussion of where exactly the trait should live. std::stream::Stream was settled on; std::future::Stream and std::future::stream::Stream were also suggested, but didn't get much traction. (I'm simplifying somewhat - for all suggestions the trait actually lives at core::... and is re-exported at std::...).

There is an interesting comment about send/sync streams, but I don't understand why it is true and what the effects would be. It didn't get much discussion.

There is an interesting digression (starting with this comment) about knowing when a stream is ready to be polled. AIUI, this comes down to extending Rust's polling model of async computation with some way to be notified that a stream is ready. An example application is where there are many streams of which only a few will be ready. In this case, the controller could poll just those streams that are likely to be ready, rather than wasting effort polling many pending streams. This led to an interesting (to me) question about whether channels should be abstracted as streams.

There was some debate about whether this RFC should do more. I was glad to see quite a lot of energy pushing to keep the RFC minimal. RFCs can take a really long time to be accepted. That is good because we want to have detailed discussion to avoid making mistakes - due to Rust's stability promise, mistakes are forever. However, that creates pressure to make sure RFCs are complete (or at least extensive) so as not to have to go through the process multiple times, and on the other hand, to keep RFCs small to land in a vaguely reasonable time frame. I believe small and incremental is good for RFCs. But, as others have noted, we must find ways to make the RFC process faster and less stressful. This RFC is a good example - it is tiny and fairly simple (relative to most RFCs), the motivation is strong and aligned with Rust's roadmap, and we have years of experience and polishing of the proposed features. And it still took five months and 164 comments to land it.

Anyway, most of the suggested extensions are included in the future work section of the RFC, and I'll go over that below.

Future work

The RFC describes a lot of future work.

Probably the next thing to work on is the next method. This already exists on the StreamExt trait in the futures crate. It is just like next on an iterator, but it returns a future, rather than a value. Together with .await, next permits ergonomic use of streams, including in while let loops. However, there are some open questions with next, importantly that the current implementation is not object-safe.
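As a rough illustration of how a next method can be layered on top of poll_next, here is a sketch. It is not the futures crate's implementation: the Stream and StreamExt traits here are local stand-ins, and Next, UpTo, NoopWake, block_on, and sum_all are hypothetical names for this example only.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local stand-in with the same shape as the trait proposed in the RFC.
pub trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// The future returned by `next`; it resolves to the stream's next item.
pub struct Next<'a, S> {
    stream: &'a mut S,
}

impl<S: Stream + Unpin> Future for Next<'_, S> {
    type Output = Option<S::Item>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Polling the future simply forwards to the stream's poll_next.
        Pin::new(&mut *self.stream).poll_next(cx)
    }
}

// Extension trait in the style of StreamExt (a sketch, not the real one).
pub trait StreamExt: Stream {
    fn next(&mut self) -> Next<'_, Self>
    where
        Self: Unpin + Sized,
    {
        Next { stream: self }
    }
}

impl<S: Stream> StreamExt for S {}

// Hypothetical stream yielding next..end, always ready.
pub struct UpTo {
    next: u32,
    end: u32,
}

impl Stream for UpTo {
    type Item = u32;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.next < self.end {
            let value = self.next;
            self.next += 1;
            Poll::Ready(Some(value))
        } else {
            Poll::Ready(None)
        }
    }
}

// With `next` and `.await`, a stream can be consumed in a `while let` loop.
pub async fn sum_all() -> u32 {
    let mut stream = UpTo { next: 1, end: 4 };
    let mut total = 0;
    while let Some(n) = stream.next().await {
        total += n;
    }
    total
}

// Tiny busy-polling executor so the sketch runs without any runtime crate.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Running block_on(sum_all()) drives the loop to completion. Note that in this sketch next carries a Self: Sized bound, so it cannot be called directly on a dyn Stream, which hints at why object safety is one of the open questions.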

The most fun future work, and probably the furthest away, is syntax for working with streams. I think the ergonomics gap between futures using async/await and streams is pretty huge at the moment. When I am working with async code, then have to use a stream, I feel like I'm stepping back in time. Making for loops work with streams is the obvious choice. As the RFC points out, there are a couple of wrinkles: first, the for loop must pin the stream for you. A more human consideration is the difference between iterating sequentially (the obvious desugaring of a for loop) or concurrently (probably what the user wants when using async streams).

The other side of the syntax coin is yield syntax for writing generators. Generators are already used internally for implementing async/await, but are not a user-facing feature (yet). Generators could be both sync and async for creating iterators and streams, respectively. The RFC discusses some hurdles: working with lending streams (see below), pinning, and lifetime requirements. (IMO, a lot of the problems with streams come from wanting them to work just like futures and just like iterators. We'll need to decide on which has priority, or decide on a set of principles specifically for streams, otherwise we'll always be in tension between the two analogies).

The Stream trait itself is fairly minimal. In the futures crate, there are a bunch of useful combinator-style methods on the StreamExt trait. These are not migrated as part of this RFC, but left for future work. The RFC states that this is mostly due to async closures being unstable (and having outstanding design issues). The futures crate also includes TryStream and TryStreamExt traits for conveniently working with streams which return Results. These are also not migrated to std and are not mentioned in the RFC.

A lending stream is a stream which returns values which are borrowed from the stream itself (rather than owned values). Expressing a lending stream currently requires generic associated types (GATs), which are unstable. There is also a discussion to be had around converting from lending to non-lending streams and vice-versa.
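To show why GATs come into it, here is a sketch of what a lending stream trait might look like. This is an assumption for illustration, not a design from the RFC: LendingStream, Windows, NoopWake, and window_sums are hypothetical names, and the code needs a compiler where GATs are available. The key point is that the item type takes a lifetime parameter tied to the borrow of the stream.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical trait: the item may borrow from the stream for the duration 'a.
pub trait LendingStream {
    type Item<'a>
    where
        Self: 'a;

    fn poll_next<'a>(
        self: Pin<&'a mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Option<Self::Item<'a>>>;
}

// Hypothetical stream lending overlapping windows into its own buffer.
pub struct Windows {
    buf: Vec<u8>,
    size: usize,
    pos: usize,
}

impl LendingStream for Windows {
    type Item<'a> = &'a [u8]
    where
        Self: 'a;

    fn poll_next<'a>(
        self: Pin<&'a mut Self>,
        _cx: &mut Context<'_>,
    ) -> Poll<Option<&'a [u8]>> {
        // Windows is Unpin, so getting a plain mutable reference is allowed.
        let this = self.get_mut();
        if this.pos + this.size > this.buf.len() {
            return Poll::Ready(None);
        }
        let start = this.pos;
        this.pos += 1;
        // The returned slice borrows from the stream itself.
        Poll::Ready(Some(&this.buf[start..start + this.size]))
    }
}

// A waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Each window must be used up before the stream is polled again,
// because the window keeps the stream borrowed.
pub fn window_sums() -> Vec<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut stream = Windows { buf: vec![1, 2, 3, 4], size: 2, pos: 0 };
    let mut sums: Vec<u32> = Vec::new();
    loop {
        match Pin::new(&mut stream).poll_next(&mut cx) {
            Poll::Ready(Some(window)) => {
                let total: u32 = window.iter().map(|&b| u32::from(b)).sum();
                sums.push(total);
            }
            Poll::Ready(None) => return sums,
            Poll::Pending => continue, // never happens for this stream
        }
    }
}
```

With the non-lending Stream trait, Item has no lifetime parameter, so there is no way to say that the returned value borrows from the stream for only as long as that one call's borrow lasts.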

Talking of converting things to streams, there is also some discussion of IntoStream/FromStream traits (cf. IntoIterator). There is also the specific case of converting an iterator into a stream.
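The iterator case is simple enough to sketch: since an iterator's values are available immediately, the wrapping stream is always ready. The code below is an illustration under assumptions, in the spirit of the iter function in the futures crate but not copied from it; the local Stream trait, Iter, iter, NoopWake, and collect_ready are names made up for this example.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local stand-in with the same shape as the trait proposed in the RFC.
pub trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Hypothetical adaptor turning any iterator into a stream.
pub struct Iter<I> {
    iter: I,
}

impl<I: Iterator + Unpin> Stream for Iter<I> {
    type Item = I::Item;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<I::Item>> {
        // An iterator never has to wait, so the stream is never Pending.
        Poll::Ready(self.iter.next())
    }
}

// Converts anything iterable into a stream.
pub fn iter<I: IntoIterator>(items: I) -> Iter<I::IntoIter> {
    Iter { iter: items.into_iter() }
}

// A waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Drains a stream by polling it in a loop (busy-polls if it is ever Pending).
pub fn collect_ready<S: Stream + Unpin>(mut stream: S) -> Vec<S::Item> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut out = Vec::new();
    loop {
        match Pin::new(&mut stream).poll_next(&mut cx) {
            Poll::Ready(Some(value)) => out.push(value),
            Poll::Ready(None) => return out,
            Poll::Pending => std::hint::spin_loop(),
        }
    }
}
```

For example, collect_ready(iter(vec![1, 2, 3])) yields the same three values back. A general IntoStream trait would presumably cover this conversion as one of its impls, but that design is left open by the RFC.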

Finally, there are a bunch of helper traits for iterators (e.g., DoubleEndedIterator) which could be implemented for streams.