RFC 3123: Rustdoc: scrape code examples

tools (rustdoc)

Summary

This RFC proposes an extension to Rustdoc that automatically scrapes code examples from the project's examples/ directory.

Check out a live demo here: https://willcrichton.net/example-analyzer/warp/trait.Filter.html#method.and

Motivation

Code examples are an important tool for programmers to understand how a library works. Examples are concrete and contextual: they reference actual values rather than abstract parameters, and they show how to use a function in the context of the code around it.

As a parable of the value of examples, I recently did a small user study where I observed two Rust programmers learning to use the warp library for a basic task. Warp is designed around a generic Filter abstraction. Both participants found the documentation for Filter methods to be both imposing and too generic to be useful. For instance, Filter::and:

[Figure: Rustdoc documentation for Filter::and in the warp crate]

The repo authors also included a code example. But neither participant could understand the example because it lacked context.

[Figure: Example code for Filter::and]

The participant who was less familiar with Rust struggled to read the documentation and failed to accomplish the task. By contrast, the participant who was more familiar with Rust knew to look in the examples/ directory, where they found a wealth of examples for each function that complemented the documentation. For instance, rejection.rs shows the usage of and in combination with map:

let math = warp::path!("math" / u16);
let div_with_header = math
    .and(warp::get())
    .and(div_by())
    .map(|num: u16, denom: NonZeroU16| {
        warp::reply::json(&Math {
            op: format!("{} / {}", num, denom),
            output: num / denom.get(),
        })
    });

The goal of this RFC is to bridge the gap between automatically generated documentation and code examples by helping users find relevant examples within Rustdoc.

Guide-level explanation

The scrape-examples feature of Rustdoc finds examples of code where a particular function is called. For example, if we are documenting Filter::and, and a file examples/returning.rs contains a call to and, then the corresponding Rustdoc documentation looks like this:

[Figure: UI for scraped examples shown with Filter::and]

After the user-provided documentation in the doc-comment, scrape-examples inserts a code example (if one exists). The code example shows a window into the source file with the function call highlighted in yellow. The icons in the top-right of the code viewer allow the user to expand the code sample to the full file, or to navigate through other calls in the same file. The link above the example goes to the full listing in Rustdoc's generated src/ directory, similar to other [src] links.
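
To make the "window into the source file" idea concrete, here is a minimal sketch of how a collapsed snippet could be cut out of an example file around a call site. This is not the prototype's code: the `Snippet` type, the `snippet_around` function, and the line-based `context` parameter are all assumptions made for illustration.

```rust
/// A line-based window around a call site.
/// Hypothetical type for illustration; not part of rustdoc.
#[derive(Debug, PartialEq)]
struct Snippet {
    /// 1-based line number of the first line in the window.
    first_line: usize,
    /// The lines shown in the collapsed view.
    lines: Vec<String>,
    /// Index into `lines` of the line that contains the call.
    highlighted: usize,
}

/// Returns the lines surrounding `call_line` (1-based), with up to `context`
/// lines before and after it. Returns `None` if the line is out of range.
fn snippet_around(source: &str, call_line: usize, context: usize) -> Option<Snippet> {
    let all: Vec<&str> = source.lines().collect();
    if call_line == 0 || call_line > all.len() {
        return None;
    }
    let idx = call_line - 1;
    // Clamp the window to the bounds of the file.
    let start = idx.saturating_sub(context);
    let end = (idx + context + 1).min(all.len());
    Some(Snippet {
        first_line: start + 1,
        lines: all[start..end].iter().map(|l| l.to_string()).collect(),
        highlighted: idx - start,
    })
}

fn main() {
    // A made-up example file; the call of interest is on line 3.
    let source = "fn main() {\n    let a = setup();\n    let b = a.and(other());\n    run(b);\n}\n";
    let snippet = snippet_around(source, 3, 1).expect("line 3 exists");
    for (i, line) in snippet.lines.iter().enumerate() {
        // Mark the highlighted line with `>`.
        let marker = if i == snippet.highlighted { ">" } else { " " };
        println!("{} {:>3} | {}", marker, snippet.first_line + i, line);
    }
}
```

Expanding to the full file then amounts to showing every line while keeping the same highlight, which is why the whole source has to be available to the page.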

Additionally, the user can click "More examples" to see every example from the examples/ directory, like this:

[Figure: Additional examples are shown indented under the main example]

To use the scrape-examples feature, simply add the --scrape-examples flag like so:

cargo doc --scrape-examples

Reference-level explanation

I have implemented a prototype of the scrape-examples feature as modifications to rustdoc and cargo, which are available as draft PRs.

The feature uses the following high-level flow, with some added technical details as necessary.

  1. The user gives --scrape-examples as an argument to cargo doc.
  2. Cargo runs the equivalent of cargo rustdoc --examples (source).
    • Specifically, when constructing the BuildContext, Cargo will now recursively invoke rustdoc on all files matching the --examples filter.
    • Each invocation includes a flag --scrape-examples <output path> which directs rustdoc to write its output to a file at the specified location.
  3. An instance of rustdoc runs for each example, finding all call-sites and exporting them to a JSON file (source).
    • A visitor runs over the HIR to find call sites that resolve to a specific linkable function.
    • As a part of this pass, rustdoc also generates source files for the examples, e.g. target/doc/src/example/foo.rs. These are then linked to during rendering.
    • The format of the generated JSON is {function: {file: {locations: [list of spans], other metadata}}}. See the AllCallLocations type.
  4. Rustdoc is then invoked as normal for the package being documented, except with an added flag --with-examples <path/to/json> for each generated JSON file. Rustdoc reads the JSON data from disk and stores it in RenderOptions.
  5. Rustdoc renders the call locations into the HTML (source).
    • This involves reading the source file from disk to embed the example into the page.
  6. Rustdoc's JavaScript adds interactivity to the examples when loaded (source).
    • Most of the logic here is to extend the code viewer with additional features like toggling between snippet / full file, navigating between call sites, and highlighting code in-situ.
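
As a rough illustration of the data exchanged in steps 3 and 4, the nested mapping {function: {file: {locations: [list of spans]}}} could be modeled as below. This is only a sketch: the names `Span`, `CallData`, `CallLocations`, `record_call`, and `to_json` are hypothetical, spans are assumed to be byte-offset pairs, the "other metadata" is omitted, and the hand-written serializer stands in for whatever the prototype actually uses. The real definition is the AllCallLocations type in the rustdoc PR.

```rust
use std::collections::BTreeMap;

/// A source span as a pair of byte offsets (assumed representation).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    lo: usize,
    hi: usize,
}

/// Call sites found in one example file (metadata fields omitted).
#[derive(Debug, Default)]
struct CallData {
    locations: Vec<Span>,
}

/// function path -> example file -> call data.
/// Hypothetical stand-in for the nested format described above.
type CallLocations = BTreeMap<String, BTreeMap<String, CallData>>;

/// Records one call to `function` found at `span` in `file`.
fn record_call(all: &mut CallLocations, function: &str, file: &str, span: Span) {
    all.entry(function.to_string())
        .or_default()
        .entry(file.to_string())
        .or_default()
        .locations
        .push(span);
}

/// Quotes a string as a JSON string, escaping only backslashes and quotes.
fn quote(s: &str) -> String {
    format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
}

/// Serializes the mapping to compact JSON by hand to stay dependency-free.
fn to_json(all: &CallLocations) -> String {
    let functions: Vec<String> = all
        .iter()
        .map(|(function, files)| {
            let files: Vec<String> = files
                .iter()
                .map(|(file, data)| {
                    let spans: Vec<String> = data
                        .locations
                        .iter()
                        .map(|s| format!("[{},{}]", s.lo, s.hi))
                        .collect();
                    format!("{}:{{\"locations\":[{}]}}", quote(file), spans.join(","))
                })
                .collect();
            format!("{}:{{{}}}", quote(function), files.join(","))
        })
        .collect();
    format!("{{{}}}", functions.join(","))
}

fn main() {
    let mut all = CallLocations::new();
    // Offsets are made up for the sake of the example.
    record_call(&mut all, "warp::Filter::and", "examples/rejection.rs", Span { lo: 10, hi: 20 });
    record_call(&mut all, "warp::Filter::and", "examples/returning.rs", Span { lo: 5, hi: 9 });
    println!("{}", to_json(&all));
}
```

Keying by function first means the renderer can look up all examples for the item it is currently documenting with a single map access.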

The primary use case for this will be on docs.rs. My expectation is that docs.rs would use the --scrape-examples flag, and all docs hosted there would have the scraped examples.

Drawbacks

  1. I think the biggest drawback of this feature is that it adds further complexity to the Rustdoc interface. Rustdoc already includes a lot of information, and a concern is that this feature would overload users, especially Rust novices.
  2. This feature requires pre-loading a significant amount of information into the HTML pages. If we want to keep the "view whole file" feature, then the entire source code of every referenced example would be embedded into every page. This will increase the size of the generated files and hence increase page load times.
  3. This feature requires adding more functionality to both Cargo and Rustdoc, increasing the complexity of both tools.

Rationale and alternatives

See "Unresolved questions" for more discussion of the design space.

Prior art

I have never seen a documentation generator with this exact feature before. Some HCI research systems, such as Jadeite and Apatite, use code examples to augment generated documentation, e.g. by sorting methods in order of usage. Other research prototypes, e.g. Examplore, have clustered code examples to show broad usage patterns.

Unresolved questions

The main unresolved questions are about the UI: what is the best UI to show the examples inline? My prototype represents my best effort at a draft, but I'm open to suggestions. For example:

  1. Is showing one example by default the best idea? Or should scraped examples be hidden by default?
  2. Is the ability to see the full file context worth the increase in page size?
  3. How should the examples be ordered? Is there a way to determine the "best" examples to show first?
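
To give the ordering question something concrete to react to, here is one naive heuristic, offered purely as a discussion aid and not as part of the proposal: show examples whose enclosing item is shortest first, on the assumption that short examples are easier to read. The `Candidate` type, its `enclosing_lines` field, and `order_examples` are hypothetical.

```rust
/// A candidate example for one documented function.
/// Hypothetical type for illustration; not part of rustdoc.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    file: String,
    /// Number of source lines in the item that encloses the call (assumed metric).
    enclosing_lines: usize,
}

/// Orders candidates so that shorter enclosing items come first,
/// breaking ties by file name so the result is deterministic.
fn order_examples(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    candidates.sort_by(|a, b| {
        a.enclosing_lines
            .cmp(&b.enclosing_lines)
            .then_with(|| a.file.cmp(&b.file))
    });
    candidates
}

fn main() {
    // File names and line counts are made up.
    let ordered = order_examples(vec![
        Candidate { file: "examples/todos.rs".to_string(), enclosing_lines: 40 },
        Candidate { file: "examples/hello.rs".to_string(), enclosing_lines: 6 },
        Candidate { file: "examples/body.rs".to_string(), enclosing_lines: 6 },
    ]);
    for c in &ordered {
        println!("{} ({} lines)", c.file, c.enclosing_lines);
    }
}
```

A size-based order is cheap to compute during scraping, but it says nothing about how representative an example is, so it does not settle the question.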

Future possibilities

To my mind, the main future extensions of this feature are:

  1. More examples: examples can be scraped from the codebase itself (e.g. this would be very useful for developers on large codebases like rustc), or scraped from the ecosystem at large.
  2. Ordering examples: with more examples comes the question of how to present them all to the user. If there are too many examples, say more than 10, there should be a way to choose which ones to show, for instance by maximizing the diversity of the selected examples.