tools (cargo)
workspace-deduplicate
Deduplicate common dependency and metadata directives amongst a set of workspace
crates in Cargo with extensions to the [workspace] section in Cargo.toml.
Cargo has supported workspaces for quite some time now but when managing a large workspace there is often a good deal of redundancy between member crates in a workspace. Currently this proposal attempts to tackle a few major areas of duplication. Many of these areas of duplication are managed either manually or with scripts, and the goal of this proposal is to largely eliminate the need for scripts and also the need to manually manage so much.
[dependencies] sections
Often when managing a workspace you'll have a lot of crates that all depend on
the same crate. For example many of your crates may depend on log. Today you
must write down the same log directive in all your manifests:
[dependencies]
log = "0.3.1"
Depending on how many crates you're working on, that's a lot of times to
remember 0.3.1! Additionally if you'd like to update this dependency, say if a
1.0.0 release is made, you need to edit every single Cargo.toml to make sure
they all stay in sync. This is a lot of duplicated work!
This duplication gets even worse when you start modifying the features of each crate. For example:
[dependencies]
log = { version = "0.3.1", features = ['release_max_level_warn'] }
If you wanted to consistently write this across many crates it can get quite cumbersome quite quickly.
When managing a workspace you'll often have a lot of workspace members that all depend on each other. The "blessed" way to do this is actually quite verbose:
[dependencies]
other-workspace-member = { path = "../other-member", version = "0.2.3" }
Here you need to specify both path and version. Using path means that
you're depending on exactly that copy on the local filesystem. This also means
that if you depend on any workspace member via a git dependency later on it'll
correctly pull in the other workspace members from the git repo. (Note that some
projects use [patch] to only write down other-workspace-member = "0.2.3" but
this causes issues when crates later use git dependencies.)
If you never publish to crates.io, path is all you need. If crates eventually
get published, though, they also need a version directive to know what version
from crates.io you'll be depending on after the publication.
Naturally, with a highly-interconnected workspace which may be relatively large,
this leads to a lot of duplication very quickly. This is a lot of path and
version directives that you've got to manage.
A frequent pattern in Cargo workspaces which publish to crates.io is to have all the crates at the same semver version. These crates all move in lockstep during publication and get bumped at the same time.
While a minor papercut this basically means that anyone and everyone who has a workspace of a lot of crates makes their own homebrew script for updating versions and managing updates/publications. It'd be quite convenient if we could standardize across the Rust ecosystem how to manage this information!
The last primary area of duplication that this proposal attempts to tackle is in
crate metadata in the [package] section. This includes items such as:
[package]
authors = []
license = "..."
repository = "..."
documentation = "..."
These metadata directives are often duplicated amongst all crates, especially author/license/repository information. This is a pretty poor experience if you've got to keep writing down the information in so many places!
Cargo's manifest parsing will be updated with new features to support deduplicating each of the areas above. While all of these new features are pretty small in their own right, they all add up to greatly reducing the overhead of managing a workspace of many crates. The list of new features in Cargo will look like the following:
The [workspace] section can now have a dependencies section which works the
same way as the [dependencies] section in Cargo.toml:
# in workspace's Cargo.toml
[workspace.dependencies]
log = "0.3.1"
log2 = { version = "2.0.0", package = "log" }
serde = { git = 'https://github.com/serde-rs/serde' }
wasm-bindgen-cli = { path = "crates/cli" }
Each workspace member can then reference this section in the workspace with a new dependency directive:
# in a workspace member's Cargo.toml
[dependencies]
log = { workspace = true }
This directive indicates that the log dependency should be looked up from
workspace.dependencies in the workspace root. You can reference any name
defined in [workspace.dependencies] too:
[dependencies]
log2 = { workspace = true }
version and path to publish to Crates.io
When you have a path dependency, Cargo's current behavior on publication looks
like this:
- If the dependency has a version specifier as well, then the path key is
  deleted and the crate is uploaded with the specified version as a dependency
  requirement.
- If the dependency has no version specifier, then the dependency directive is
  deleted and crates.io will not learn about this dependency. This is only
  really useful for dev-dependencies.
Cargo's behavior will change in this second case, instead following new logic
for a missing version specifier. For dev-dependencies where the referenced
package is publish = false, the dependency will be dropped. Otherwise
Cargo will assume that version = "$dependency_version" was specified, meaning
that it requires at least the current version and otherwise any
semver-compatible version.
This behavior should mean that you no longer need to write version = "..."
with path dependencies if you publish to crates.io. Coupled with the
workspace-level dependencies above this means you never have to write the
version of a path dependency anywhere!
To deduplicate [package] directives in Cargo.toml workspace members, Cargo
will now support declaring that metadata directives should be inherited from the
workspace. For example to version every package the same within a workspace you
can specify:
[package]
name = "foo"
version = { workspace = true }
This directive tells Cargo that the version of foo is the same as the
workspace.version directive found in the workspace manifest. This means that
in addition to a new [workspace.dependencies] section, package metadata keys
can now also be defined inside of a [workspace] section:
[workspace]
version = "0.25.2"
Many other package metadata attributes are supported as well:
[package]
authors = { workspace = true }
license = { workspace = true }
Cargo's [workspace] section will first be extended with a few new attributes.
Like before the [workspace] table can only appear in a workspace root, not in
any other manifests. Additionally the [workspace] table doesn't have to be
associated with a package, it could be part of a virtual manifest.
[workspace]
The first addition to the [workspace] table is a dependencies sub-table,
like so:
[workspace.dependencies]
foo = "0.1"
The dependencies sub-table has the same form as the [dependencies] table in
manifests with a few exceptions:
- Dependencies cannot be declared optional. The optional key must be
  omitted or, if present, must be false.
- The workspace key (defined later in this proposal) is not allowed.
The [workspace] table will not support other kinds of dependencies like
dev-dependencies, build-dependencies, or target."...".dependencies. Only
[workspace.dependencies] will be supported.
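The restrictions above can be sketched as a single workspace manifest. This is only an illustration under the rules stated in this proposal; the crate names also-fine, maybe, and nested are hypothetical, and the rejected forms are left commented out.

```toml
# Cargo.toml (workspace root, hypothetical contents)
[workspace.dependencies]
# Accepted: the same forms as a regular [dependencies] entry.
log = "0.3.1"
serde = { git = 'https://github.com/serde-rs/serde' }
my-crate = { path = "crates/my-crate" }
also-fine = { version = "1.0", optional = false }  # an explicit `false` is allowed

# Rejected under this proposal:
# maybe = { version = "1.0", optional = true }  # `optional` must be omitted or false
# nested = { workspace = true }                 # the `workspace` key is not allowed here

# Not supported: other dependency kinds on the workspace table.
# [workspace.dev-dependencies]
# [workspace.build-dependencies]
```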
To review, the [workspace.dependencies] table will be key/value pairs. Each
key is the name of a dependency while the value is a dependency directive.
This could be a string meaning a crates.io dependency or a table which further
configures the dependency.
Dependencies declared in [workspace.dependencies] have no meaning as-is. They
do not affect the build nor do they force packages to depend on those
dependencies. This part comes later below.
The [workspace] section will also allow the definition of a number of keys
also defined in [package] today, namely:
[workspace]
version = "1.2.3"
authors = ["Nice Folks"]
description = "..."
documentation = "https://example.github.io/example"
readme = "README.md"
homepage = "https://example.com"
repository = "https://github.com/example/example"
license = "MIT"
license-file = "./LICENSE"
keywords = ["cli"]
categories = ["development-tools"]
publish = false
edition = "2018"
[workspace.badges]
# ...
Each of these keys has no meaning in a [workspace] table yet, but will have
meaning when they're assigned to crates internally. That part comes later though
in this design! Note that the format and accepted values for these keys are the
same as the [package] section of Cargo.toml.
For now the metadata key is explicitly left out (due to complications around
merging table values), but it can always be added in the future if necessary.
Cargo.toml
The interpretation of a Cargo.toml manifest within Cargo will now require a
Workspace object to be created. This Workspace will be used to elaborate and
expand each member's Cargo.toml directives. Additionally Cargo.toml will
syntactically accept some more forms.
Previously package metadata values had to be declared explicitly in each
Cargo.toml:
[package]
version = "1.2.3"
Cargo will now accept a table definition of package.$key which defines the
package.$key.workspace key as a boolean. For example you can specify:
[package]
name = "foo"
license = { workspace = true }
This directive indicates that the license of foo is the same as
workspace.license. If workspace.license isn't defined then this generates an
error.
The following keys in [package] can be inherited from [workspace] with the
new workspace = true directive:
[package]
version = { workspace = true }
authors = { workspace = true }
description = { workspace = true }
documentation = { workspace = true }
readme = { workspace = true }
homepage = { workspace = true }
repository = { workspace = true }
license = { workspace = true }
license-file = { workspace = true }
keywords = { workspace = true }
categories = { workspace = true }
publish = { workspace = true }
Note that directives like license-file are resolved relative to their
definition, so license-file is relative to the [workspace] section that
defined it.
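As a small sketch of this path rule, assume a hypothetical layout with a workspace root and a single member at crates/foo. The member's manifest is shown in comments so the fragment stays a single valid file.

```toml
# Cargo.toml (workspace root, hypothetical layout)
[workspace]
members = ["crates/foo"]
license-file = "./LICENSE"   # resolved relative to this file: <root>/LICENSE

# crates/foo/Cargo.toml would then contain:
#   [package]
#   name = "foo"
#   license-file = { workspace = true }
# which refers to <root>/LICENSE, not crates/foo/LICENSE.
```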
Dependencies in the [dependencies], [dev-dependencies],
[build-dependencies], and [target."...".dependencies] sections will support
the ability to reference the [workspace.dependencies] definition of
dependencies. This is done with a new workspace key in the dependency
directive. An example of this looks like:
[dependencies]
log = { workspace = true }
The workspace key cannot be defined with other keys that configure the source
of the dependency. This means you cannot define workspace with keys like
version, registry, registry-index, path, git, branch, tag, rev,
or package. The workspace key can be combined with other keys, however:
- optional: this introduces an optional dependency as usual, as well as a
  feature named after the key (left hand side) of the dependency directive.
  Note that the [workspace.dependencies] table is not allowed to specify
  optional.
- features: this indicates, as usual, that extra features are being enabled
  over the already-enabled features in the directive found in
  [workspace.dependencies]. The resulting set of enabled features is the union
  of the features specified inline with the features specified in the directive
  in the workspace table.
For now if a workspace = true dependency is specified then also specifying the
default-features value is disallowed. The default-features value for a
directive is inherited from the [workspace.dependencies] declaration, which
defaults to true if nothing else is specified.
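To make the combination rules concrete, here is a sketch of a workspace entry and a member that references it. The member path crates/foo and the choice of max_level_debug as the extra feature are assumptions for illustration; the rejected forms are left commented out.

```toml
# Cargo.toml (workspace root)
[workspace.dependencies]
log = { version = "0.3.1", features = ["release_max_level_warn"] }

# crates/foo/Cargo.toml (hypothetical member)
[dependencies]
# Allowed: `optional` and extra `features` alongside `workspace = true`.
# Enabled features are the union: release_max_level_warn + max_level_debug.
# default-features is inherited from the workspace entry (true here, since
# the workspace entry does not set it).
log = { workspace = true, optional = true, features = ["max_level_debug"] }

# Not allowed under this proposal:
# log = { workspace = true, version = "0.4" }           # configures the source
# log = { workspace = true, default-features = false }  # disallowed for now
```

Because the member marks the dependency as optional, foo also gains a feature named log, just as with any other optional dependency.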
version directive
As a final change to Cargo.toml, dependencies using the path directive and
not specifying a version directive will have the version directive inferred.
For example if we have:
# foo/Cargo.toml
[dependencies]
bar = { path = "../bar" }
as well as
# bar/Cargo.toml
[package]
name = "bar"
version = "1.0.1"
then foo/Cargo.toml is interpreted as if it had been written:
# foo/Cargo.toml
[dependencies]
bar = { path = "../bar", version = "1.0.1" }
The version key for path dependencies, if not specified, will be inferred to
be the version of the path dependency itself. Note that this is a version
requirement, not an actual semver version, and the version requirement will be
interpreted as "at least the current version, and anything semver compatible
with it".
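The inferred value follows Cargo's usual default interpretation of a bare version string as a caret requirement. A short sketch of what that accepts, reusing bar at version 1.0.1 from the example above:

```toml
# foo/Cargo.toml after inference
[dependencies]
bar = { path = "../bar", version = "1.0.1" }
# "1.0.1" is a requirement, not an exact pin. Under Cargo's default
# (caret) rules it means >=1.0.1, <2.0.0, so for example:
#   1.0.1, 1.0.9, 1.4.0  are accepted
#   1.0.0, 2.0.0         are rejected
```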
This logic of inferring, however, will also respect the publish key. For
example if we had this instead:
# bar/Cargo.toml
[package]
name = "bar"
version = "1.0.1"
publish = false
then Cargo would not alter this dependency directive:
# foo/Cargo.toml
[dependencies]
bar = { path = "../bar" }
cargo publish
Cargo currently already "elaborates" the manifest during publication. For
example it removes path keys in dependency lists to only have the version
requirement pointing to crates.io. During publication Cargo will also elaborate
any substituted information from the [workspace], because [workspace] is
also removed during publication!
This means that workspace = true will never be present in Cargo.toml files
published to crates.io, and additionally no information about workspace = true
will make its way to the registry index. Furthermore metadata fields like
package.repository will be filled in and will be present on crates.io's UI.
Put another way, Cargo.toml files published to crates.io, or metadata found
through crates.io, won't change from what they are today.
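As a rough sketch of this elaboration, assume a hypothetical workspace and a member foo. The inputs are shown as comments and the elaborated result as the live manifest. The exact formatting Cargo emits may differ; only the substituted values matter here.

```toml
# Workspace root Cargo.toml (hypothetical, not uploaded):
#   [workspace]
#   version = "0.25.2"
#   license = "MIT"
#   [workspace.dependencies]
#   log = "0.3.1"
#
# crates/foo/Cargo.toml as written in the repository:
#   [package]
#   name = "foo"
#   version = { workspace = true }
#   license = { workspace = true }
#   [dependencies]
#   log = { workspace = true }
#   bar = { path = "../bar" }   # bar is at version 1.0.1 and is publishable
#
# Elaborated manifest for foo as it would be uploaded:
[package]
name = "foo"
version = "0.25.2"   # from workspace.version
license = "MIT"      # from workspace.license

[dependencies]
log = "0.3.1"        # from [workspace.dependencies]
bar = "1.0.1"        # path key removed, version inferred from bar's manifest
```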
Cargo.lock
When creating a Cargo.lock file Cargo will perform crate resolution as if all
dependencies in [workspace.dependencies] are depended on by some crate, even
if no crate actually references an entry in [workspace.dependencies]. This
means that if a crate uses an entry in [workspace.dependencies] it's
guaranteed to have an entry in the lock file indicating what its dependencies
should be.
Note that entries in [workspace.dependencies] which aren't used by any
crates in the workspace will likely trigger a warning, however, so users can
continue to prune accidentally unused entries.
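A sketch of the situation described above, assuming a hypothetical workspace with one member. The rand entry is only an illustrative unused dependency.

```toml
# Cargo.toml (workspace root, hypothetical)
[workspace]
members = ["crates/foo"]

[workspace.dependencies]
log = "0.3.1"   # referenced by crates/foo via `log = { workspace = true }`
rand = "0.8"    # referenced by no member: still resolved into Cargo.lock under
                # this proposal, and likely to trigger an unused-entry warning
```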
cargo metadata
Executing cargo metadata to learn about a crate graph will implicitly perform
all substitution defined in this proposal. Consumers of cargo metadata will
continue to get the same output they got before this proposal, meaning that
implicit substitutions, if any, will be invisible to users of cargo metadata.
cargo read-manifest
Similar to cargo metadata, the cargo read-manifest command will perform all
necessary substitutions when presenting the output as JSON.
path dependencies
Like today, path dependencies will be resolved relative to the file that
defines them. This means that for dependencies defined in the workspace,
paths are still relative to the workspace root itself.
For example if you write down a [workspace.dependencies] directive with a
relative path:
# Cargo.toml
[workspace.dependencies]
my-crate = { path = "crates/my-crate" }
And then you reference this in another crate:
# crates/other-crate/Cargo.toml
[dependencies]
my-crate = { workspace = true }
then the my-crate dependency references the crate located at crates/my-crate
relative to the workspace root, not located at
crates/other-crate/crates/my-crate.
This proposal significantly complicates the process of interpreting a
Cargo.toml. One of the major purposes of using TOML to specify a crate
manifest was to make it easy for other tools to parse Cargo manifests and work
with them. This not only includes Rust-based tools but also tools in other
languages if necessary. Previously a TOML parser for your language was all you
really needed, but this proposal is adding a layer of indirection on top of TOML
where you have to interpret multiple manifests to figure out what one means. For
example you can no longer quickly and easily be guaranteed to parse the version
of a package, but you might have to go find the workspace root or other crates
to figure that out. Workspace discovery and membership is pretty nontrivial so
non-Cargo based tools will have a difficult time not using Cargo to figure out
a fully elaborated form of a manifest 100% of the time.
This proposal also extends Cargo.toml with changes that will break any
existing tools which assume a particular format of Cargo.toml. For example if
a tool expects package.version to be a String, it runs the risk of being
broken in the future due to the ability to specify a table there instead.
Additionally this proposal complicates a reader's understanding of Cargo.toml.
While verbose for maintainers, having duplication of information is actually
quite nice for readers of Cargo.toml because you don't have to chase anything
else down to figure out what a dependency is. If this proposal is implemented
then whenever you see foo = { workspace = true } you've got to go consult
something else to figure out what the dependency actually is. This layer of
indirection can cause surprise for readers or otherwise add a speed-bump to
understanding the contents of a manifest.
Cargo's manifests have been a pretty carefully curated part of Cargo's design to ensure that they're consistently readable and concise where possible. For example many of Cargo's manifest idioms gently nudge users towards the same standards across the community by supporting many zero-configuration situations such as where to put and how to name tests.
This proposal is an extension of these design principles to provide a gentle nudge to consistently, across the Rust community, manage workspaces, dependencies, and metadata. A goal here is to increase consistency in how this is all managed across projects in a way that still preserves Cargo's existing flexibility for users.
Note that flexibility is a key part of this proposal where it's possible to intermingle shorthands with longer versions. For example if the workspace declares:
[workspace.dependencies]
log = "0.3"
But you really want to try out a new version of log in one workspace member,
you can easily do so by changing
[dependencies]
log = { workspace = true }
to
[dependencies]
log = "0.4"
Additionally you can always custom-version your packages, you've just got the option to reference another package as well. Overall this proposal should empower more power users of Cargo to manage workspaces easily without taking away any of the existing configurations that Cargo already supports.
This proposal is largely a syntactic proposal for Cargo.toml and changing how
we can specify a few directives. Naturally that lends itself to quite a lot of
possible bikeshedding! Virtually all of the aspects of the proposal that modify
Cargo.toml can be tweaked in various ways such as names used or where they're
placed. In any case discussion about compelling alternatives is always
encouraged!
Some alternative syntaxes:
[dependencies]
# Instead of `foo = { workspace = true }`
foo = {}
foo = "ws"
foo = "workspace"
foo.workspace = true # technically the same, but idiomatically different
This proposal indicates that package metadata is not inherited by default from
the workspace. Inheriting by default may be desired in some scenarios instead of
repeating license = { workspace = true } everywhere, and there are likely two
possible ways this could happen.
Workspace directives could be implicitly and automatically inherited by members. In the future, however, Cargo will want to support nested workspaces, and it's unclear how these features will interact. In order to strike a reasonable middle-ground for now, a simple solution which should address many use cases is proposed, and we can continue to refine this over time as necessary.
Directives could be flagged to be explicitly inherited to workspace members as an optional way of specifying this. For now though to keep this proposal simple this is left out as a possible future extension of Cargo.
One possible extension of this RFC is for metadata to not only be inheritable
from the [workspace] table but also from other packages. For example a
scenario seen in the wild is that some repositories have multiple "cliques" of
crates which are all versioned as a unit. In this scenario one "clique" can have
its version directives deduplicated with this proposal, but not multiple ones.
It's hoped though that an eventual feature of nested workspaces would solve this issue in Cargo. That way each "clique" could correspond to one workspace, and that way we wouldn't need extra support to inherit directives from anywhere.
Duplication throughout workspaces has been a thorn in Cargo's side practically since the inception of workspaces. Naturally there are quite a few bugs filed on Cargo's issue tracker about this which provide some context for why to make a proposal at all as well as how to design this proposal.
[patch] which breaks git dependencies
[patch] tables are used seemingly to make it easier to specify dependencies in
a workspace, but having everything in [workspace.dependencies] makes it
smaller to specify.
One sort of far-out-there alternative we could go for is to be far more ambitious and make our own sort of "templating language" on top of TOML. This would arguably be much more flexible than the limited amount of deduplication proposed here, but you could imagine things like:
[package]
name = "foo"
version = "1.{workspace.vars.minor}.0"
[dependencies]
bar = "{workspace.dependencies.bar}"
baz = { version = "1", features = "{workspace.vars.baz_features}" }
or "insert your own idea for how we can go all out" here. In general though I think there's a lot to be gained from the simplicity of TOML and prioritizing other tools reading Cargo manifests, so we may not want to go full-blown templating language just yet.
One thing we'll want to resolve for sure is nailing down all the syntactical decisions here, which are expected to evolve through consensus.
It's not clear how complex an implementation of this proposal will be in Cargo. It could be prohibitively complex, but it's hoped that it's a relatively simple refactoring to implement this in Cargo.