Saturday, April 08, 2006

macros, parameters; binding, and reference

Danny Yoo had an interesting question on the plt-scheme mailing list recently. At first it seemed like your standard non-hygienic, “I want this to mean something in here” macro question. I used to group macros into three levels of “hygienicness”: the hygienic ones, the ones that are morally hygienic in that the names they introduce are based on their input, and the totally non-hygienic ones that have a fixed name that they stick into the program. An example of the middle set is define-struct, and an example of the third set is a loop construct that binds the name yield in its body.

The third class used to offend me from a purist’s (semi-purist?) perspective. But it’s a very reasonable thing to want to do. Consider the class macro and the names it uses to do interesting things: super, public, field, init, and so on. It depends on those particular names.

Danny Yoo was writing a generator library. He had a define-generator form, used like this:

(define-generator (name . args) . body)

and he wanted yield to have a particular meaning inside of the generator body. He had used the usual non-hygienic technique of creating the right yield identifier using datum->syntax, but he was asking for other ideas.
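For readers who have not seen the identifier-forging technique, here is a minimal sketch of it. This is not Danny's actual code: `with-yield` is a hypothetical stand-in for his define-generator, and it assumes a current Racket where `#lang racket` supplies `syntax-case` and `datum->syntax` at expansion time.

```racket
#lang racket
;; Hypothetical macro for illustration: binds the name `yield` in its body.
;; datum->syntax gives the symbol 'yield the lexical context of the macro
;; use (stx), so the binding is visible to code written at the use site.
(define-syntax (with-yield stx)
  (syntax-case stx ()
    [(_ yielder-expr body ...)
     (with-syntax ([yield-id (datum->syntax stx 'yield)])
       #'(let ([yield-id yielder-expr])
           body ...))]))

;; Usage: the body can call `yield` even though the macro introduced it.
(define collected '())
(with-yield (lambda (v) (set! collected (cons v collected)))
  (yield 1)
  (yield 2))
;; collected now holds the yielded values, most recent first.
```

The forged identifier is a binding occurrence, which is what makes this approach fragile once other macros get involved.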

Dave commented that non-hygienic macros typically do not play nicely together; that is, it can be hard to write other macros that expand into them, because you have to think about what version of the code you want to bind the variable in, and you can’t always tell... it’s a mess. Dave recommended creating two versions of the macro: a non-hygienic front-end that forged the yield identifier, and a hygienic back-end that did the actual implementation. People who wanted to further abstract over generator definitions could use the hygienic version.
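Dave's suggestion might look roughly like the sketch below. All three macro names are hypothetical, and a toy "run the body with a yielder" form stands in for the real generator machinery.

```racket
#lang racket
;; Hygienic back end (hypothetical name): the caller supplies the identifier
;; that should be bound to the yielder inside the body.
(define-syntax-rule (with-yield-named yield-id yielder-expr body ...)
  (let ([yield-id yielder-expr])
    body ...))

;; Non-hygienic front end: forges `yield` from the use site and delegates.
(define-syntax (with-yield stx)
  (syntax-case stx ()
    [(_ yielder-expr body ...)
     (with-syntax ([yield-id (datum->syntax stx 'yield)])
       #'(with-yield-named yield-id yielder-expr body ...))]))

;; Another macro can abstract over the hygienic back end and choose the name.
(define-syntax-rule (with-emit-to-list emit body ...)
  (let ([acc '()])
    (with-yield-named emit (lambda (v) (set! acc (cons v acc)))
      body ...)
    (reverse acc)))

;; Direct use of the front end.
(define via-front-end
  (let ([acc '()])
    (with-yield (lambda (v) (set! acc (cons v acc)))
      (yield 'a)
      (yield 'b))
    (reverse acc)))

;; Use through the abstraction built on the back end.
(define via-back-end
  (with-emit-to-list emit (emit 1) (emit 2)))
```

The cost is that anyone abstracting over the form has to know about, and thread through, the extra identifier argument.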

But there’s another way to look at it, and that’s what I replied to the mailing list. It got me thinking about the similarities between macros and normal programming, and the different techniques we use.

My answer begins here:

Another way to do this, one that avoids the problems Dave mentioned, is to use a “syntax parameter.” It’s like a parameter, but for macros.

Consider a related problem: what if we wanted yield to work in the dynamic extent of a call to a generator, rather than just lexically in the generator’s body? Here’s how you’d do it:

(define current-yielder
  (make-parameter
   (lambda (value) (error 'yield "not in a generator"))))
(define (yield value) ((current-yielder) value))

Then define-dynamic-generator wraps the generator function with something that sets the current-yielder parameter.

(define-syntax define-dynamic-generator
  (syntax-rules ()
    [(define-dynamic-generator (name . args) . body)
     (define (name . args)
       (let ([name
              (lambda (yielder)
                ;; yield calls yielder for the dynamic extent of the body
                (parameterize ([current-yielder yielder])
                  . body))])
         (make-generator name)))]))

(As an aside, this version passes your test suite, too.)

This technique turns an issue of binding into an issue of common reference: both sides refer to a shared side channel of communication, such as a parameter or a box.
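To see the dynamic-extent behavior without the generator library, here is a small self-contained sketch. The `yield-range` and `collect-yields` helpers are hypothetical; `collect-yields` just gathers yielded values into a list and stands in for make-generator.

```racket
#lang racket
(define current-yielder
  (make-parameter
   (lambda (value) (error 'yield "not in a generator"))))
(define (yield value) ((current-yielder) value))

;; A helper written outside of any "generator body". It still works,
;; because yield consults the parameter at call time.
(define (yield-range lo hi)
  (for ([i (in-range lo hi)])
    (yield i)))

;; Hypothetical stand-in for a generator: runs thunk with a yielder that
;; accumulates values, then returns them in the order they were yielded.
(define (collect-yields thunk)
  (define acc '())
  (parameterize ([current-yielder (lambda (v) (set! acc (cons v acc)))])
    (thunk))
  (reverse acc))

(define result
  (collect-yields
   (lambda ()
     (yield-range 0 3)
     (yield 'done))))
;; result is the list of yielded values: 0, 1, 2, then done.
```

Nothing here binds a name in the body; `yield` is an ordinary function that everyone refers to.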

Syntax parameters work the same way. You agree on a keyword:

(require racket/stxparam)
(define-syntax-parameter yield
  (lambda (stx)
    (raise-syntax-error #f "used outside of a generator" stx)))

Then, when you want some code to be able to use yield, like the body of your generator function, you use syntax-parameterize:

(define-syntax define-generator
  (syntax-rules ()
    [(define-generator (name . args) . body)
     (define (name . args)
       (let ([name
              (lambda (yielder)
                (syntax-parameterize
                    ([yield
                      (syntax-rules ()
                        [(yield value)
                         (yielder value)])])
                  . body))])
         (make-generator name)))]))

This use of syntax-parameterize changes the meaning of yield in the body to just call the supplied yielder function. If you want to be able to use yield as an expression by itself, rather than requiring it to be called, you can use syntax-id-rules to define an identifier macro instead.
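A sketch of the identifier-macro variant follows. To keep it runnable on its own, a hypothetical `with-yielder` form takes the place of define-generator, and the example assumes `syntax-id-rules` is available at expansion time, as it is under `#lang racket`.

```racket
#lang racket
(require racket/stxparam)

(define-syntax-parameter yield
  (lambda (stx)
    (raise-syntax-error #f "used outside of a generator" stx)))

;; Hypothetical form: evaluates body with yield standing for the procedure
;; produced by yielder-expr.
(define-syntax with-yielder
  (syntax-rules ()
    [(_ yielder-expr body ...)
     (let ([yielder yielder-expr])
       (syntax-parameterize
           ([yield
             (syntax-id-rules ()
               [(_ v) (yielder v)]   ; yield in operator position
               [_ yielder])])        ; yield used as a plain expression
         body ...))]))

(define out '())
(with-yielder (lambda (v) (set! out (cons v out)))
  (yield 1)                  ; called directly
  (for-each yield '(2 3)))   ; passed around as a first-class value
;; out holds the yielded values, most recent first.
```

Binding yield to `(make-rename-transformer #'yielder)` should have much the same effect, if you don't need separate rules for the two kinds of use.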

By using syntax-parameterize, define-generator doesn’t bind yield; it refers to yield. Any macro that expands into a generator definition will still be producing a reference to yield, and references don’t suffer the same interactions with hygiene that bindings do.
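Here is a small sketch of that claim, again with hypothetical names: `collecting` plays the role of define-generator, and `collecting-with-header` is a macro that expands into it. One use of yield is written by the wrapper macro and the others by the wrapper's user; both resolve to the same syntax parameter.

```racket
#lang racket
(require racket/stxparam)

(define-syntax-parameter yield
  (lambda (stx)
    (raise-syntax-error #f "used outside of a generator" stx)))

;; Hypothetical stand-in for a generator body: runs body with yield meaning
;; "record this value" and returns the recorded values in order.
(define-syntax collecting
  (syntax-rules ()
    [(_ body ...)
     (let* ([acc '()]
            [yielder (lambda (v) (set! acc (cons v acc)))])
       (syntax-parameterize
           ([yield (syntax-rules ()
                     [(_ value) (yielder value)])])
         body ...)
       (reverse acc))]))

;; A macro that expands into collecting.
(define-syntax-rule (collecting-with-header header body ...)
  (collecting
    (yield header)   ; this yield comes from the wrapper macro
    body ...))       ; these yields come from the wrapper's user

(define result
  (collecting-with-header 'start
    (yield 1)
    (yield 2)))
;; result is the list: start, 1, 2.
```

With a forged yield binding, the user's yields would typically be unbound here, because the forged identifier would take its context from the wrapper's template rather than from the user's code.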

1 comment:

Anonymous said...

I'm really enjoying your site. Please post more!

Would you be willing to suggest a tutorial or introduction to macros in scheme? I've read some, but, as you seem to be an expert, I was wondering what you might suggest.

Thanks so much,
Jared Nuzzolillo