27 October 2015

Macros in Rust pt1

In the last blog post I introduced macros and some general issues. In this post I want to describe macros (and other syntactic tools) in Rust. In the next instalment, I'll go into hygiene and implementation issues in more detail, and cover some areas for improvement in Rust.

Rust offers an array of macro-like features: macro_rules macros, procedural macros, and built-in macros and attributes (like println! and asm!, #[cfg], and #[derive()]). I feel like I'm missing something. I was going to cover them all in this post, but it's already huge, so I'll just cover macro_rules here and leave the rest for later.

macro_rules!

macro_rules! lets you write syntactic macros based on pattern matching. They are often described as 'macros by example' since you write pretty much the Rust code you would expect them to expand to.

They are defined using the macro_rules! built-in macro and are used in a function-like style: foo!(a, b, c). The ! distinguishes a macro use from a regular function call. For example, a very simple (and pointless) macro:

macro_rules! hello {
    () => { println!("hello world"); }
}

fn main() {
    hello!();
}

This simple program defines a single macro called hello and calls it once from within the main function. After a round of expansion, this will turn into:

fn main() {
    println!("hello world");
}

This is dead simple; the only interesting thing is that we use a macro inside the macro definition. As the compiler further processes the program, that macro (println!) will be expanded into regular Rust code.

The syntax of macro definitions looks a bit verbose at this stage because we only have a single rule and we don't take any arguments. That is very rare in real life.

As in a match expression, => indicates matching a pattern (in this case ()) to a body (in this case println!(...);). Since our use of the macro matches the single pattern (hello!() has no arguments), we expand into the given body.

The next example has some arguments, but still a single rule for matching:

macro_rules! my_print {
    ($i: ident, $e: expr) => {
        let $i = {
            let a = $e;
            println!("{}", a);
            a
        };
    }
}

fn main() {
    my_print!(x, 30 + 12);
    println!("and again {}", x);
}

This is somewhat contrived, partly to demonstrate a few features at once and partly because some of the details of Rust's macro implementation are a little odd.

The macro this time takes two arguments. Macro arguments must be prefixed with a $; here the formal arguments are $i and $e. The 'types' of macro arguments are different to proper Rust types. An ident is an identifier (we pass in x), and an expr is an expression (we pass 30 + 12). When the macro is expanded, these values are substituted into the macro body. Note that (as opposed to function evaluation in Rust) expressions are not evaluated before substitution, so the compiler actually does substitute 30 + 12, not 42.
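To make the difference between substitution and evaluation concrete, here is a small sketch. The names twice!, next and demo are made up for illustration. Because the expression is pasted into the body wherever $e appears, a body that mentions $e twice runs the expression's side effects twice.

```rust
// Hypothetical macro: mentions its argument twice in the body.
macro_rules! twice {
    ($e:expr) => {
        $e + $e
    };
}

// Hypothetical helper with a visible side effect: bumps the counter
// and returns the new value.
fn next(counter: &mut u32) -> u32 {
    *counter += 1;
    *counter
}

// Returns (value of the macro use, final counter value).
fn demo() -> (u32, u32) {
    let mut counter = 0;
    // Expands to `next(&mut counter) + next(&mut counter)`,
    // so `next` is called twice, not once.
    let sum = twice!(next(&mut counter));
    (sum, counter)
}

fn main() {
    let (sum, counter) = demo();
    println!("sum = {}, counter = {}", sum, counter);
}
```

A function taking a u32 would have seen the value 1 and returned 2. The macro gives 3, and the counter ends up at 2.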

Note what is happening with x in the example: the macro declares a new variable called x and we can refer to it outside the macro. This is hygienic (technically, you could have an interesting argument about hygiene here, but let's not) because we pass x into the macro, so it 'belongs' to the caller's scope. If instead we had written let x = { ... directly in the macro body, referring to x from main would be an error, because that x would belong to the scope of the macro.
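The other side of hygiene is that a variable declared inside the macro body does not capture a variable of the same name in the caller. A minimal sketch, using a made-up macro add_ten! and a made-up function demo:

```rust
// Hypothetical macro: declares its own local `a`.
macro_rules! add_ten {
    ($e:expr) => {{
        let a = 10; // this `a` belongs to the macro
        $e + a
    }};
}

fn demo() -> i32 {
    let a = 1; // this `a` belongs to the caller
    // The `a` we pass in still refers to the caller's variable, even
    // though the macro body declares another `a` before using `$e`.
    add_ten!(a)
}

fn main() {
    println!("{}", demo());
}
```

With purely textual substitution the body would read let a = 10; a + a and give 20. With hygiene the two a's are kept apart and the result is 11.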

The last example introduces proper pattern matching:

macro_rules! my_print {
    (foo <> $e: expr) => { println!("FOOOOOOOOOOOOoooooooooooooooooooo!! {}", $e); };
    ($e: expr) => { println!("{}", $e); };
    ($i: ident, $e: expr) => {
        let $i = {
            let a = $e;
            println!("{}", a);
            a
        };
    };
}

fn main() {
    my_print!(x, 30 + 12);
    
    my_print!("hello!");

    my_print!(foo <> "hello!");
}

OK, the example is getting really silly now, but hopefully it is illustrative. We now have three patterns to match. In reverse order, they are: the same pattern as before (an ident and an expression), a pattern which takes just an expression, and a pattern which requires some literal tokens (foo <>) followed by an expression. Depending on which pattern the arguments match, my_print! will expand into one of the three possible bodies. Note that we separate the arms of the pattern match with ;s.
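One thing worth spelling out is that the rules are tried from top to bottom and the first one that matches wins, so more specific patterns should come first. A small sketch with a hypothetical describe! macro (not part of the example above):

```rust
// Hypothetical macro: each rule expands to a string naming the rule
// that matched.
macro_rules! describe {
    // Most specific first: the literal token `foo`.
    (foo) => { "the literal token foo" };
    // Any other single identifier.
    ($i:ident) => { "some identifier" };
    // Anything else that parses as an expression.
    ($e:expr) => { "some expression" };
}

fn main() {
    println!("{}", describe!(foo));
    println!("{}", describe!(bar));
    println!("{}", describe!(bar + 1));
}
```

If the (foo) rule were moved below ($i:ident), it would never be chosen, since foo is also a perfectly good identifier.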

Where exactly can macros be used? Not quite anywhere. The reason is that the AST needs an entry for every position where a macro use is allowed; for example, there is ExprMac for a macro use in expression position. Macros can be used anywhere that an expression, statement, item, type, pattern, or impl item could be used. That means there are a few places you can't use a macro - as a trait item or a field, for example.
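Here is a rough sketch of macros used in several of those positions, assuming a reasonably recent compiler. All of the names (make_answer_fn!, pair_of!, origin!, double! and the functions) are invented for this example.

```rust
// Item position: expands to a function definition.
macro_rules! make_answer_fn {
    ($name:ident) => {
        fn $name() -> u32 {
            42
        }
    };
}
make_answer_fn!(answer);

// Type position: expands to a tuple type.
macro_rules! pair_of {
    ($t:ty) => { ($t, $t) };
}

// Usable in both pattern and expression position, since `(0, 0)`
// parses as either.
macro_rules! origin {
    () => { (0, 0) };
}

// Expression position.
macro_rules! double {
    ($e:expr) => { $e * 2 };
}

fn is_origin(p: pair_of!(i32)) -> bool {
    match p {
        origin!() => true, // macro use as a pattern
        _ => false,
    }
}

fn main() {
    println!("{}", double!(answer()));
    println!("{}", is_origin(origin!())); // macro use as an expression
    println!("{}", is_origin((1, 0)));
}
```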

Macros can be used with different kinds of bracket too - foo!(...) is function-like and the most common. foo! {...} is item-like, and foo![...] is array-like, used for initialising vectors (vec![...]), for example. What kind of bracket you use doesn't make much difference. The only subtlety is that item-like macros don't need a semi-colon in item position.
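A quick sketch of the three bracket styles and the semi-colon subtlety, using two made-up macros (triple! and make_fn!):

```rust
// Hypothetical macro: builds a three-element array from one expression.
macro_rules! triple {
    ($e:expr) => { [$e, $e, $e] };
}

// Hypothetical macro: expands to a function item that returns 7.
macro_rules! make_fn {
    ($name:ident) => {
        fn $name() -> u8 {
            7
        }
    };
}

// In item position, the function-like form needs a trailing semi-colon...
make_fn!(seven);
// ...but the item-like form with braces does not.
make_fn! { also_seven }

fn demo() -> ([u8; 3], [u8; 3], [u8; 3]) {
    // The same macro works with any of the three kinds of bracket.
    let a = triple!(1);
    let b = triple![2];
    let c = triple! { 3 };
    (a, b, c)
}

fn main() {
    println!("{:?}", demo());
    println!("{} {}", seven(), also_seven());
}
```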

It's also possible to take a variable number of arguments (as is done in println!) by using repetition in macro patterns and bodies. $(...)* will match zero or more instances of .... You can put a token after the closing parenthesis to act as a separator, so $($x:expr),* will match a, b, c, or a, or nothing at all (but not a, b, c,), whereas $($x:expr,)* will match a, b, c, or a, or nothing at all (but not a, b, c). You can also use + rather than * to match at least one instance.
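As a sketch of how repetition looks in practice, here are three hypothetical macros (sum!, list! and product!), one for each of the forms just described. The repetition in the body mirrors the one in the pattern: it is expanded once per matched $x.

```rust
// Separator form: accepts `a, b, c`, a single `a`, or nothing.
macro_rules! sum {
    ($($x:expr),*) => { 0 $(+ $x)* };
}

// Terminator form: every argument must be followed by a comma.
macro_rules! list {
    ($($x:expr,)*) => { vec![$($x),*] };
}

// `+` instead of `*`: at least one argument is required.
macro_rules! product {
    ($($x:expr),+) => { 1 $(* $x)+ };
}

fn main() {
    println!("{}", sum!(1, 2, 3)); // expands to 0 + 1 + 2 + 3
    println!("{}", sum!());        // expands to just 0
    println!("{:?}", list!(1, 2, 3,));
    println!("{}", product!(2, 3, 4));
    // sum!(1, 2, 3,) and product!() would both fail to compile,
    // since no rule matches those inputs.
}
```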

I'll delay discussion of scoping/modularisation until later.

For more details on macros, see the guide or The Little Book of Macros.
