
concat_idents! and macros in ident position

In the next few blog posts I want to cover some areas for improvement for Rust macros. I'll drill down into one topic at a time. I'll propose solutions, but not at an RFC level of detail. I hope to get lots of early feedback this way, rather than doing all the design myself and having a 'big-bang' RFC. A lot of the macro issues are inter-related, so some of the details in early areas won't get fleshed out until later.

I'm going to start with concatenating idents in macros and using macros as idents. This is a fairly small thing, but it is an irritation for many macro authors. The problem space has been explored already; see the references section at the end for links to issues and a previous RFC.

Motivation

It is often desired to merge two (or more) identifiers to make a new one in a macro. For example, a macro might take an identifier and create two functions - e.g., foo and foo_mut from foo. This is related to the issue of creating new identifiers in macros.

Currently we support this by using the concat_idents macro. There are two main problems with it - it is not very smart with hygiene, and it is not very useful since macros can't be used in ident position; so although it can synthesise idents, these can only be used as expressions. E.g., although we could access variables called foo and foo_mut given foo, we can't generate functions with these names or even call functions with those names.
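To make the limitation concrete, here is a minimal sketch of the usual workaround with macro_rules!: the caller spells out every name, because the macro cannot derive count_mut from count by itself. The Counter type and the def_accessors macro are made up for illustration and are not part of any proposal.

```rust
struct Counter {
    count: u32,
}

// Hypothetical helper: the caller must pass both method names explicitly,
// since the macro has no way to build `count_mut` out of `count`.
macro_rules! def_accessors {
    ($field:ident, $get:ident, $get_mut:ident, $t:ty) => {
        fn $get(&self) -> &$t {
            &self.$field
        }
        fn $get_mut(&mut self) -> &mut $t {
            &mut self.$field
        }
    };
}

impl Counter {
    // The field name is repeated three times in slightly different forms.
    def_accessors!(count, count, count_mut, u32);
}

fn main() {
    let mut c = Counter { count: 1 };
    *c.count_mut() += 1;
    println!("{}", c.count());
}
```

With ident concatenation usable in ident position, the invocation could shrink to something like def_accessors!(count, u32).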

The hygiene issue is that concat_idents creates a fresh ident. Essentially, the ident is only in the scope of the macro which created it. Generally, you want to inherit the hygiene context from one of the source idents.
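The same effect can be seen with local variables in macro_rules!, which are already hygienic. The sketch below is only an analogy (it does not use concat_idents, and the macro names are invented): an ident written inside the macro is invisible to the caller, while an ident passed in by the caller keeps the caller's context.

```rust
macro_rules! shadow_inside {
    () => {
        // This `x` carries the macro's hygiene context, so the caller
        // cannot see it and it does not shadow the caller's `x`.
        let x = 1;
        let _ = x;
    };
}

macro_rules! bind_named {
    ($name:ident, $value:expr) => {
        // `$name` came from the caller, so the binding is visible there.
        let $name = $value;
    };
}

fn demo() -> (i32, i32) {
    let x = 10;
    shadow_inside!(); // the caller's `x` is untouched
    bind_named!(y, 2); // `y` can be used below
    (x, y)
}

fn main() {
    println!("{:?}", demo());
}
```

An ident produced by concat_idents behaves like the x in shadow_inside, whereas macro authors usually want the behaviour of bind_named.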

Proposed solution

  • deprecate concat_idents! - without proper hygiene support, it is not suitable for stable use,

  • allow macros in ident position (see below for details),

  • provide libraries for manipulating hygiene contexts in procedural macros (see later blog posts),

  • provide a library of procedural macros which can concatenate identifiers with various different kinds of hygiene results (see below for details).

Macros in ident position

The concept is pretty simple - anywhere we accept an identifier in current Rust, we would accept a macro. This includes (but is not limited to): variable declarations, paths, field expressions, method and function calls, item names (functions, structs, etc.), field names, type names, type variables (declaration and use), and so forth. I propose that this facility is only available for 'new' macros, partly for technical reasons (see below, hygiene and types) and partly as a carrot to move people to the new macro system.

Examples, if foo is a macro: x.foo!() (field access), x.foo!(bar)() (method call), fn foo!(baz)(x: T) { ... } (function declaration).

If a macro occurs in expression or pattern position, then there is the question of whether it should be parsed as a macro in expression position or a macro in ident position as a path which is an expression (similarly with patterns). I think we can always choose the least specific option (i.e., expression rather than ident position) because we parse the macro body after expansion, so if it is an ident, the parser will wrap it to make an expression (if we assumed it was an ident, but it was a non-ident expression, we would be stuck). This only works for procedural macros if their output is tokens rather than AST (see later blog posts).

I believe parsing should not be too affected - after parsing an ident, we simply need to check for an !. Any AST node that contains an Ident must instead contain a new enum which can either be an Ident or a macro in ident position.
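As a rough illustration, the new enum and the extra check for ! might look like the toy model below. This is not the compiler's AST or parser; Token, Name, and parse_name are hypothetical names chosen for the sketch.

```rust
// A toy token type standing in for the real token stream.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Bang,
    Group(Vec<Token>), // a parenthesised token sequence
}

// What an AST node would hold wherever it holds an ident today.
#[derive(Debug, PartialEq)]
enum Name {
    Ident(String),
    Macro { name: String, args: Vec<Token> },
}

// After seeing an ident, check for `!` followed by a delimited group.
// Returns the parsed name and the number of tokens consumed.
fn parse_name(tokens: &[Token]) -> Option<(Name, usize)> {
    match tokens {
        [Token::Ident(name), Token::Bang, Token::Group(args), ..] => Some((
            Name::Macro {
                name: name.clone(),
                args: args.clone(),
            },
            3,
        )),
        [Token::Ident(name), ..] => Some((Name::Ident(name.clone()), 1)),
        _ => None,
    }
}

fn main() {
    // Tokens for `foo!(bar)`.
    let tokens = vec![
        Token::Ident("foo".to_string()),
        Token::Bang,
        Token::Group(vec![Token::Ident("bar".to_string())]),
    ];
    println!("{:?}", parse_name(&tokens));
}
```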

I don't believe there are issues with expansion or hygiene: expansion should work the same as for macros in other positions, and hygiene should "just work". One might imagine that since idents are the target of the hygiene algorithm, creating new ones with macros would be difficult. But I don't think this is the case - the macro must expand to an ident only (or further macros that expand to an ident), and the hygiene context for that ident will be due to the macro which creates it and the expansion itself (see note on sets of scopes algorithm below). We might need to adjust the hygiene context of the ident in the macro where it is created, but that is a separate issue, see below. Note that in nearly all cases, users of a macro that produces an ident will need to pass some context to the macro to make the produced ident accessible.

There are some undecided questions where macros supply idents in items, types, and other places where we don't currently support hygiene. I propose to only support macros in ident position with new macros, which should be hygienic where current macros are not. This should make things easier. Exactly how macros in ident position interact with item hygiene is an open question until we nail down exactly how item hygiene should work.

Note: sets of scopes

I have been thinking of changing our hygiene algorithm to the sets of scopes algorithm. I won't go into the details here, but I think it will help with a lot of issues. It should mostly be simpler than the mtwt algorithm, but one area where it will add complexity is with use-site scopes. These are added to the sets of scopes in order to handle recursive macros, but when a macro contributes to a new binding (I believe this will mean macros in pattern position and macros in ident position where the ident introduces a binding), then we must be careful not to add use-site scopes. This point needs more consideration, but I think it will be OK - we just have to be careful about these scopes.
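For readers who have not met the model, its core lookup rule can be sketched as a toy: each binding and each reference carries a set of scopes, and a reference resolves to the binding with the same name whose scope set is the largest subset of the reference's. This is a simplification based on my understanding of the published algorithm, not a description of rustc. Use-site scopes and ambiguity errors are not modelled, and all names here are hypothetical.

```rust
use std::collections::BTreeSet;

type Scope = u32;

struct Binding {
    name: &'static str,
    scopes: BTreeSet<Scope>,
    id: u32, // hypothetical handle for the thing being bound
}

fn scopes(items: &[Scope]) -> BTreeSet<Scope> {
    items.iter().copied().collect()
}

// Pick the same-named binding whose scopes are the largest subset of the
// reference's scopes. Ties are not reported as errors in this sketch.
fn resolve(name: &str, reference: &BTreeSet<Scope>, bindings: &[Binding]) -> Option<u32> {
    bindings
        .iter()
        .filter(|b| b.name == name && b.scopes.is_subset(reference))
        .max_by_key(|b| b.scopes.len())
        .map(|b| b.id)
}

// Scope 0 is the enclosing function; scope 1 is added to everything a
// macro expansion introduces.
fn sample_bindings() -> Vec<Binding> {
    vec![
        Binding { name: "x", scopes: scopes(&[0]), id: 1 },    // user's `x`
        Binding { name: "x", scopes: scopes(&[0, 1]), id: 2 }, // macro's `x`
    ]
}

fn main() {
    let bindings = sample_bindings();
    println!("{:?}", resolve("x", &scopes(&[0]), &bindings));
    println!("{:?}", resolve("x", &scopes(&[0, 1]), &bindings));
}
```

A reference written by the user only carries scope 0, so it cannot see the macro's binding; a reference inside the macro body carries both scopes and prefers the macro's own binding.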

Drawbacks and alternatives

I think macros in ident position look ugly - having double sets of parentheses (in function calls), or parentheses where there wouldn't normally be any, is confusing to read. It also makes code harder to read in general, since names are an essential way we link parts of code together. Having names be macro generated makes code harder to make sense of. It's also confusing for tools.

One alternative would be to only allow macros in ident position inside macro definitions - this should address most use cases, without making general Rust code harder to read (macros are already harder to read, so there is less of an impact). I think I favour this alternative, although I am very keen to get others' opinions on it.

Another alternative would be to come up with some new syntax especially for use in ident position - this might be less ugly and give more of a hint of the generated name. However, since we must be able to pass arguments to macros, I'm not optimistic that this is possible. Furthermore, it is more syntax and thus a bigger language which is confusing in its own way.

Library macros for concatenation

This section will be a bit hand-wavy for now; we need to decide on the fundamental APIs for allowing procedural macros to interact with the hygiene system before we can settle the details.

I expect we want a fundamental create_ident! macro (or create_ident function) which takes a string name and a syntax context (probably some token which carries its hygiene information, more on exactly how this works later). E.g., create_ident!(foo, "bar") would create an ident with the name bar and a hygiene context taken from foo.

We would also have fresh_ident! which would create an ident from a string name with a fresh hygiene context (similar to gensym'ing) and new_ident! which does the same thing but with an empty hygiene context (i.e., it gets only the context due to the expansion of the macro where it is created). The difference between the two is that two idents created with fresh_ident! would have different contexts, but two created with new_ident! would have the same context.
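The intended difference between the three constructors can be modelled with a toy in which a hygiene context is just a number. Everything here is an assumption made for illustration (the Ident and Expander types, the u32 context, and the function signatures); the real API depends on the procedural macro design discussed in later posts.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Ident {
    name: String,
    ctxt: u32, // stand-in for a real hygiene context
}

// Context 0 plays the role of the "empty" context in this model.
const EMPTY_CTXT: u32 = 0;

struct Expander {
    next_fresh: u32,
}

impl Expander {
    fn new() -> Self {
        Expander { next_fresh: 1 }
    }

    // Like create_ident!: the new name inherits the context of `from`.
    fn create_ident(&self, from: &Ident, name: &str) -> Ident {
        Ident { name: name.to_string(), ctxt: from.ctxt }
    }

    // Like fresh_ident!: every call gets a context nobody else has.
    fn fresh_ident(&mut self, name: &str) -> Ident {
        let ctxt = self.next_fresh;
        self.next_fresh += 1;
        Ident { name: name.to_string(), ctxt }
    }

    // Like new_ident!: every call shares the same empty context.
    fn new_ident(&self, name: &str) -> Ident {
        Ident { name: name.to_string(), ctxt: EMPTY_CTXT }
    }
}

fn main() {
    let mut ex = Expander::new();
    let foo = Ident { name: "foo".to_string(), ctxt: 42 };
    println!("{:?}", ex.create_ident(&foo, "bar"));
    println!("{:?}", ex.fresh_ident("tmp"));
    println!("{:?}", ex.new_ident("tmp"));
}
```

In this model two idents only refer to the same thing if both the name and the context match, which is why two fresh idents named tmp never collide while two new idents named tmp do.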

We then provide convenience macros which take a list of idents and/or an ident (for its hygiene context) and a list of things which produce strings, and produce a new ident whose hygiene context is taken from the first ident, or from a separately specified object, or is fresh. Obviously, we need to make this a bit more concrete.

Example

struct Foo {
    a: A,
    b: B,
}

macro! def_getters {
    ($f: ident, $t: ident) => {
        fn concat_idents_with!($f, "get_", $f)(&self) -> &$t {
            &self.$f
        }
        fn concat_idents_with!($f, "get_", $f, "_mut")(&mut self) -> &mut $t {
            &mut self.$f
        }
    }
}

impl Foo {
    def_getters!(a, A);
    def_getters!(b, B);
}

fn main() {
    let f = Foo { ... };
    println!("{}", f.create_ident!(f, "get_a")());
}

Here, concat_idents_with!($f, "get_", $f, "_mut") expands to an ident with name get_$f_mut and hygiene context taken from $f. Note that in this case concat_idents_with! is used in a binding context, so the hygiene context (under a sets of scopes model) should not include a use-site scope.

The use of create_ident in main is a bit silly; it's just for demonstration purposes. f.create_ident!(f, "get_a")() has exactly the same meaning as writing f.get_a().
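For comparison, this is roughly what the two def_getters! invocations would amount to if written by hand in today's Rust. The definitions of A and B are placeholders, since the example above leaves them out, and hygiene is of course not visible in hand-written code.

```rust
// Placeholder field types; the original example does not define them.
struct A(u32);
struct B(&'static str);

struct Foo {
    a: A,
    b: B,
}

impl Foo {
    // From def_getters!(a, A)
    fn get_a(&self) -> &A {
        &self.a
    }
    fn get_a_mut(&mut self) -> &mut A {
        &mut self.a
    }

    // From def_getters!(b, B)
    fn get_b(&self) -> &B {
        &self.b
    }
    fn get_b_mut(&mut self) -> &mut B {
        &mut self.b
    }
}

fn main() {
    let mut f = Foo { a: A(1), b: B("b") };
    f.get_a_mut().0 += 1;
    f.get_b_mut().0 = "changed";
    println!("{} {}", f.get_a().0, f.get_b().0);
}
```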

References

concat_idents tracking issue

macros in ident position issue

macros in ident position RFC