Tuesday, September 27, 2011

definitions vs enclosing binding forms

There are two kinds of binding forms in Racket: definitions and enclosing binding forms. The scope of a binding introduced by an enclosing binding form is entirely evident: it’s one (or more) of the form’s sub-terms. For example, in

(lambda (var ...) body)

the scope of the var bindings is body. In contrast, the scope of a definition is determined by its context: the enclosing lambda body, for example, or the enclosing module—except that scope is too simple a term for how bindings work in such contexts. Enclosing binding forms are simpler and cleaner but weaker; definition forms are more powerful, but have a more complicated binding structure. Definitions also have the pleasant property of reducing rightward code drift.
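As a small illustration of the difference, here is a sketch that puts an enclosing form next to a group of definitions. The names add-two, parity, my-even?, and my-odd? are made up for this example. The two internal definitions refer to each other, which is possible only because their scope is determined by the enclosing function body rather than by either definition alone.

```racket
#lang racket

;; Enclosing binding form: the scope of x is exactly the lambda body.
(define add-two
  (lambda (x) (+ x 2)))

;; Definitions: the scope of my-even? and my-odd? is determined by the
;; enclosing function body, so each can refer to the other, and the
;; code does not drift rightward the way nested binding forms would.
(define (parity n)
  (define (my-even? k) (if (zero? k) #t (my-odd? (sub1 k))))
  (define (my-odd? k) (if (zero? k) #f (my-even? (sub1 k))))
  (if (my-even? n) 'even 'odd))

(add-two 40)
(parity 10)
(parity 7)
```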

Definitions are more powerful than enclosing binding forms because they can be used to construct more binding structures. Definitions are composable building blocks for environments; a crucial property of definitions is that they are complete entities on their own, absent the expressions (or more generally, forms) that will be used in their bindings’ scope. The expressions in the scope of an enclosing binding form, on the other hand, are fixed; they’re part of the term.

It is easy to construct an enclosing form given a definition form, but it is difficult to construct a definition form given an enclosing form. For example, consider the creation of structure types; there is a struct definition form that defines the constructor, predicate, accessors, etc. Here’s how to turn that into an enclosing binding form:

(let-struct name (field ...) body)
=>
(let ()
  (struct name (field ...))
  (let () body))

We use let to open a new local definition context, use struct to put its bindings in that context, and place body in that context, but within its own let, so that any definitions in body cannot conflict with the bindings already in the definition context.
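The rewrite rule above can be written down as a runnable macro. This is a minimal sketch using define-syntax-rule; the point struct and the result variable are only for illustration.

```racket
#lang racket

;; let-struct as an enclosing binding form, built from the struct
;; definition form. The inner (let () body) gives body its own
;; definition context.
(define-syntax-rule (let-struct name (field ...) body)
  (let ()
    (struct name (field ...))
    (let () body)))

;; The constructor and accessors are visible only inside the body.
(define result
  (let-struct point (x y)
    (let ([p (point 3 4)])
      (+ (point-x p) (point-y p)))))

result
```

Outside the let-struct form, point and point-x are unbound; only the value computed by the body escapes.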

If you’re finicky, perhaps you’ve noticed that this is actually more of a letrec-struct, since the scope of the introduced names includes any sub-expressions of the struct form—or would, if our macro supported the same options that struct does. We could give let-struct true let-scoping instead of letrec-scoping by lifting any sub-expressions out of the definition context. For example:

(let-struct name (field ...) #:inspector insp-expr body)
=>
(let ([insp insp-expr])
  (struct name (field ...) #:inspector insp)
  (let () body))

More work, yes, but still feasible. (Alternatively, I conjecture that we could use marks and rename-transformers in a clever way to hide the struct names from insp-expr but make them available to body. See if you can work it out. You might find Syntactic Abstraction in Component Interfaces helpful. Alternatively, you could try using internal-definition-contexts; see the implementation of racket/splicing, for example.)
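Here is a sketch of the lifted version as a runnable macro, assuming only the #:inspector option needs to be supported. The outer point variable and the same? test are invented for the example: they check that insp-expr sees the outer binding of point rather than the struct being defined.

```racket
#lang racket

;; insp-expr is evaluated in the right-hand side of the outer let,
;; outside the definition context that holds the struct bindings.
;; The temporary insp is macro-introduced, so body cannot see it.
(define-syntax-rule (let-struct name (field ...) #:inspector insp-expr body)
  (let ([insp insp-expr])
    (struct name (field ...) #:inspector insp)
    (let () body)))

;; An outer binding with the same name as the struct type.
(define point 'outer)

;; Here point inside insp-expr refers to the outer symbol, so the
;; inspector is #f and the struct type is transparent. Transparent
;; structs with equal fields are equal?.
(define same?
  (let-struct point (x y)
    #:inspector (if (symbol? point) #f (current-inspector))
    (equal? (point 1 2) (point 1 2))))

same?
```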

What would it take to go the other direction? Here’s a first stab at it:

(struct name (field ...))
=>
(define-values (name name? name-field ...)
  (let-struct name (field ...)
    (values name name? name-field ...)))

Bzzzt, wrong: struct is supposed to bind name as a macro that not only acts as the constructor but also records compile-time information about the struct type that can be used by other macros like match, the struct-out provide form, etc.
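To see what the compile-time information buys, here is a small sketch with match. The point struct and the norm1 function are illustrative names.

```racket
#lang racket

(struct point (x y))

;; The pattern (point x y) works because struct bound point as a macro
;; carrying compile-time information about the struct type. match
;; consults that information to find the predicate and accessors.
(define (norm1 p)
  (match p
    [(point x y) (+ (abs x) (abs y))]))

(norm1 (point 3 -4))
```

If point were instead an ordinary variable bound by define-values, as in the first stab, match would reject the pattern at compile time, since it would find no struct information attached to the name.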

It’s easy to extrude a value from the scope it was created in... at least, many modern languages have mostly figured it out, although some still manage to bungle it. But there’s no way to extrude a macro from a local scope to an outer scope. And if there were, we’d have to rethink what references to local names no longer in scope meant—that’s why I called it extrusion: because it reminds me of the type extrusion problem.
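For contrast, extruding values is routine. In this sketch (all names are made up), two closures leave the local scope they were created in and keep the hidden variable alive.

```racket
#lang racket

;; count is local to the let, but the closures that escape through
;; values keep it alive and can still use it.
(define-values (next! current)
  (let ()
    (define count 0)
    (define (bump!) (set! count (add1 count)) count)
    (define (peek) count)
    (values bump! peek)))

(next!)
(next!)
(current)
```

A macro defined with define-syntax inside that let has no such escape route: it is not a run-time value, so it cannot be passed out through values.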

We could, of course, just re-compute the macro part of the struct expansion. But now we’re up to 1.5 implementations of struct-related macros. Better to do the work once in the definition form and naively reuse it to create let-struct (if we even care about let-struct). Advantage: definitions.

This power comes at a cost, though, namely the two-pass expansion of definition contexts.
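A small sketch of what the two passes make possible, assuming "two-pass" means that the expander first discovers all the definitions in a context and only then expands the right-hand sides. The names demo, f, and double are illustrative.

```racket
#lang racket

(define (demo)
  ;; The body of f uses double before the definition of double has
  ;; been seen. The first pass only records that f is a variable and
  ;; double is a macro; the body of f is expanded in the second pass,
  ;; when double is already known to be a macro.
  (define (f x) (double x))
  (define-syntax-rule (double e) (* 2 e))
  (f 21))

(demo)
```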

(to be continued...)
