Continuations, Coroutines, And All That

An informal introduction to creating advanced control structures in Parrot.

Version 1 (draft), Bob Rogers, 7-Jul-06.

1. Introduction

Parrot provides a wealth of tools for creating control structures, from the humble goto opcode to the enormously power of continuations. By ``advanced control structures'' we mean high-level language (HLL) constructs that transfer control between subs. So we will assume that goto needs no explanation, and take it from there.

Most of these tools are described in various specialized Perl documents, mostly PDDs, to which we will be referring heavily. Please bear in mind that the PDDs are definitive (when up-to-date), and have more detail, so this document will focus more on how to use the various tools in combination than in presenting a complete description of any tool in isolation.

[finish.]

2. Overview

[between-sub transfer. -- rgr, 7-Jul-06.]

2.1. Call and return

To introduce some terminology, let's consider a simple case of subroutine calling in Parrot. Consider the following Perl 5 code:

        sub inner {
            print "In inner.\n";
        }
        sub outer {
            print "In outer.\n";
            inner();
        }
        # main code
        print "Starting main.\n";
        outer();
        print "Done.\n";

When run, this scriptlet generates the following output:

        Starting main.
        In outer.
        In inner.
        Done.

If translated to PIR, the result might look something like this:

        .sub inner
                print "In inner.\n"
        .end
        .sub outer
                print "In outer.\n"
                inner()
        .end
        .sub main :main
                ## main code
                print "Starting main.\n"
                outer()
                print "Done.\n"
        .end

When executing, there are three subroutine calls made. The first one is when Parrot calls main (which it does because it is marked with :main). A Parrot_Context structure is created to record which sub is executing, and what the sub's local variables are (though we haven't any in this case). This is sometimes called an ``activation record'', and it corresponds to a stack frame in a stack-based implementation. In this simple case, Parrot behaves no differently from a stack-based interpreter (but that will change before long).

While inner is executing, Parrot will have constructed the following chain of Parrot_Context structures to keep track of the evolving computation:

    --> Context: inner
           |
     RetContinuation (pc 16)
           |
           +----> Context: outer
                     |
               RetContinuation (pc 30)
                     |
                     +----> Context: main
                               |
                         RetContinuation (no pc)
                               |
                               +----> [end]

The interpreter identifies the context for inner as the one that is currently executing, and keeps track of the fact that the ``current instruction'' is 2, relative to the instruction stream generated for the example1.pir file. Each executing context points to its ``current continuation'', which defines where to go when the sub returns; in this case, these are all RetContinuation PMCs. When execution falls off the end of the inner sub, it actually encounters a returncc instruction, which is always automatically added to the end of a sub that appears to need it. returncc invokes the continuation, which re-establishes the context to which it points as the one that is currently executing, resuming at the saved program counter (PC) value. The final instruction of main is actually end, which causes the interpreter to exit normally, so the outermost continuation is never invoked.

This linked-list structure is used to print a backtrace when an unhandled error is signalled. If inner were to get an error before it returned, the output might look something like this:

        rogers@rgrjr> parrot example1.pir 
        Starting main.
        In outer.
        In inner.
        Null PMC access in get_string()
        current instr.: 'inner' pc 2 (example1.pir:3)
        called from Sub 'outer' pc 16 (example1.pir:7)
        called from Sub 'main' pc 30 (example1.pir:12)
        rogers@rgrjr>

No information is printed for the last continuation, since that represents the exit from the main program.

Note that the call chain only points backwards, in the ``return'' direction; there are no pointers in the ``call'' direction. [need to say more about this. -- rgr, 7-Jul-06.]

2.2. Stack unwinding and rewinding

[need some better terminology. -- rgr, 7-Jul-06.]

[mention that continuations must also capture ``dynamic context,'' which includes dynamic bindings, rewinding actions, and error handlers. this might be a little too much foreshadowing, though. -- rgr, 7-Jul-06.]

3. Tools

3.1. Closures

Closures are not strictly control structures in themselves; for the most part, they behave like ordinary subs. However, closures permit state in the form of lexical variables to be shared between closely-related subs. This sharing allows what is conceptually a single sub to be broken up into two or more pieces, each of which can be called semi-independently. We will have occasion to use this technique frequently.

[bank-account example. -- rgr, 7-Jul-06.]

[should mention newclosure transformation of RetContinuation to Continuation somewhere -- possibly here? -- rgr, 7-Jul-06.]

[Could mention CP transformation for completeness. On the other hand, this is likely to be tedious for those who already know it, and a distraction for everybody else. -- rgr, 7-Jul-06.]

3.2. Dynamic binding

A variable is said to be bound dynamically if the the binding is visible outside of the lexical extent of the during the lifetime of the binding. Perl 5 calls this ``localizing'', after the local operator; Perl 6 calls it ``temporizing'' after its temp operator. The difference between dynamic binding and merely setting the value of a global is that in dynamic binding, the language undertakes to restore the old binding at the end of its scope. This combination of visibility outside of the lexical scope and guaranteed restoration means that dynamic binding must be part of the dynamic environment so that stack-rewinding operations can do the right thing. We will also see how this can be used to establish dynamically-scoped nonlocal exits.

[perl5 ``local $/'' example, or some such. -- rgr, 7-Jul-06.]

3.3. Error handling

Throwing and handling an error represents a common case of nonlocal exit. The place where the error is detected may be nested many levels deep; in order to recover, the handler may decide to ``unwind'' past some or all of these levels of call.

[need a good example. -- rgr, 7-Jul-06.]

[Interestingly, error handling is a negotiation between three snippets of code: The code that signals the error, the code that provides the recovery path, and the code that decides to handle this particular error by invoking that particular recovery path. All three may be written at different times, and by different parties. (KMP said something of the sort once, but I can no longer find the quote.) -- rgr, 7-Jul-06.]

3.3. Stack rewinding and cleanup

Some operations need to clean up after themselves, whether or not they complete successfully. If a caller throws an error, the stack could be unwound at any time, so any such operation must leave a snippet of code in the dynamic environment so that it can be executed at the right time to perform the cleanup. [more?]

An obvious solution would be to define an error handler that catches all errors, performs the necessary cleanup, and then rethrows the error. There are two places to implement this, in the handler and in the sub, but both of these are suboptimal. This is partly because they interfere with the normal error-handling mechanism, and partly because they only enforce cleanup when errors are signalled; an exit continuation would pass them by.

[unwind-protect example. -- rgr, 7-Jul-06.]

[example that shows how dynamic binding could be implemented in terms of unwind-protect. (or is that redundant?) -- rgr, 7-Jul-06.]

3.4. Continuations and nonlocal goto

[nonlocal goto example (no values) using a continuation stored in a lexical variable. -- rgr, 7-Jul-06.]

[return-from example (passing values). -- rgr, 7-Jul-06.]

[dynamically-scoped goto (i.e. Common Lisp CATCH and THROW). -- rgr, 7-Jul-06.]

3.5. Coroutines and context re-entrancy

[hoo boy. -- rgr, 7-Jul-06.]

4. References

5. Notes