(Message perl:6596)
showproc perl/6596
/home/mjd/bin/mailpager /home/mjd/MH-Mail/perl/6596
Replied: Wed, 01 Aug 2001 00:45:48 -0400
Return-Path: mjd@plover.com
Return-Path: <mjd@plover.com>
Delivered-To: mjd-filter-deliver2@plover.com
Received: (qmail 25545 invoked by uid 119); 31 Jul 2001 20:48:00 -0000
Delivered-To: mjd-filter@plover.com
Received: (qmail 25534 invoked by uid 119); 31 Jul 2001 20:47:57 -0000
Delivered-To: mjd@plover.com
Received: (qmail 25527 invoked by uid 119); 31 Jul 2001 20:47:57 -0000
Message-ID: <20010731204757.25526.qmail@plover.com>
To: Dan Sugalski <dan@sidhe.org>
cc: Mark-Jason Dominus <mjd@plover.com>
Subject: Re: On general continuations 
In-reply-to: Your message of "Tue, 31 Jul 2001 15:02:45 EDT."
             <5.1.0.14.0.20010731145539.02899c90@24.8.96.48> 
Date: Tue, 31 Jul 2001 16:47:57 -0400
From: Mark-Jason Dominus <mjd@plover.com>


A closure is a subroutine which has captured the environment that was
in scope at the moment it was defined.  It can be used to implement
the 'yield' semantics you mentioned.  It's easy to implement:  When
the closure is created, just capture a pointer to the environment that
is in force.  In Perlspeak, we say that to construct an anonymous
function (the closure) we build a new, fresh CV which contains a
pointer to the current pad (the environment).  

Coroutines are like subroutines, except symmetric.  With subroutines
there's a caller and a callee.  The caller suspends execution and
transfers control to the callee; when the callee is complete, control
continues in the caller where it left off.  Coroutines are symmetric:
A might suspend execution and invoke B, which then continues where it
left off.  At any point B might suspend its own execution and continue
with A, which then continues where *it* left off.  Or there might be
seventeen separate coroutines, each working on some piece of the
problem and then invoking the others as necessary.  

An example of a case where this might be useful is when you have one
part of a program whose job is to produce data and another part whose
job is to consume it again.  When the data buffer is empty, the
consumer can invoke the producer to generate more; or when the buffer
is full the producer can invoke the consumer to empty it out again.
Either strategy can be implemented with regular subroutines, either
with the producer as a subroutine of the consumer, or vice versa, but
the problem is inherently symmetric and can sometimes be expressed
more cleanly with coroutines.  For some problems subroutines are
insufficiently general, and coroutines are required; see Knuth _Art of
Computer Programming_ volume I for examples.  One way to implement
coroutines is with continuations.

A continuation is like an extension of a closure.  A closure has
captured its environment, which means the variable bindings that were
in scope at the time it was created.  A continuation captures its
calling context as well.  It's like a function, but instead of
returning to its caller, it returns backwards in time to the
*expression* from which it was originally created.  

Perhaps the easiest way to think about this is to think in terms of an
implementation.  Whenever a continuation is created, you put a mark in
the current stack frame, as if you were expecting to catch an
exception that might be thrown to there.  The continuation itself
behaves just like an anonymous function of one argument, except that
instead of returning normally, it throws an exception which is caught
back at the place in the stack where the continuation was first created.

Continuations are usually presented with a function called 'call_cc'
which takes a coderef as its argument and invokes the coderef with the
current continuation as its argument.  Here's a simple example:

        sub foo { 
          my $arg = shift;
          $arg->(3);
          $arg->(2);
          return 17;
        }

        print call_cc(\&foo);

call_cc(\&foo) creates the current continuation and invokes &foo with
the continuation as the argument.  Inside &foo, the continuation is
placed into $arg.  $arg->(3) invokes the continuation, which, as I've
said, behaves like a function of one argument.  But when the
continuation is invoked, it behaves as though it threw an exception
back to the place where it was created.  This unwinds the stack back
to the 'print', and it appears to the program that call_cc() has
simply returned 3.  The $arg->(2) never gets a chance to execute, and
foo() never has a chance to return normally.  The example above simply
prints '3'.

OK, maybe that's no too weird---I think it might be the clearest
explanation of call-cc that I've ever seen.  So far it's just a weird
way of writing exceptions.  But continuations are more general than
that.  Exceptions can only be thrown *up* the stack.  Continuations
allow the execution state to be captured as an object and thrown
down the stack, or crossways, or any other way you want.  For example:

        sub foo {
          my $arg = shift;
          my $cc = call_cc { my $continuation = shift;
                             return $continuation; 
                           };
          if ($cc) { return $cc } 
          else { print $arg }
        }

        my $c1 = foo(1);   # prints nothing
        my $c2 = foo(2);   # prints nothing
        $c1->(undef);      # prints 2  
        $c1->(undef);      # prints 2 again


What happens here?  We call foo(1), which binds $arg to 1 and then
uses call_cc to capture the current continuation into $cc.  Since $cc
is an object, foo() then returns the continuation into $c1.  Similarly
$c2 holds a continuation in which $arg is bound to 2.

When we invoke $c1->(undef), the stack is restored to its state at the
moment of $c1's original creation.  In this previous version of
history, $arg was bound to 1 and call_cc was about to return a value
which would be bound to $cc.  But instead, the 'undef' is returned to
$cc.  The subsequent if-else notices that $cc is false and prints
$arg, which is 1.  The final call to $c1->(undef) does the same thing
all over again.

Note that this ruins my suggested implementation from above.  You
can't simply put a mark in the stack, because you mustn't unwind the
stack for as long as the continuation is alive; at any moment, someone
might invoke the continuation and jump back into the part of the stack
that you thought you were finished with.  One possible implementation
is to copy the entire stack and stick a pointer to it in the
continuation object.  Others may occur to you.

Here are two more examples of uses of continuations:

        call_cc { my $exit = shift;
                  for (54 0 37 -3 245 19) {     
                    $exit->($x) if $x < 0;
                  }
                  return;
                }
        # this returns -3

        sub listlength {
          my $obj = shift;
          call_cc { my $return = shift;    # I'm the continuation
                    my $r;
                    $r = sub { my $obj = shift;
                              if (ref $obj ne 'ARRAY') { $return->('wrong') }
                              elsif (@$obj == 0) { return 0 }
                              else { return 1 + $r->($obj->[1]) }
                             };
                    $r->($obj);
                  };
        }

        # print listlength [1,[2,[3,[4,[]]]]]    prints 4
        # print listlength [1,[2,[3,[4]]]]       prints 'wrong'


I got these last two out of the R4RS, the fourth standard for the
Scheme language, and translated them to Perl.  The note in the
standard that accompanies these says "The following examples show only
the most common uses of call-with-current-continuation.  If all real
programs were as simple as these examples, there would be no need for
a procedure with the power of call-with-current-continuation."

R5RS, the newest standard, may have better examples than these, which
are from R4RS; let me know if you want me to send you a copy.  You may
also want to take a look at:

        http://www-pu.informatik.uni-tuebingen.de/users/sperber/xemacs/next-generation/call-cc-for-elisp-hackers.html

If you're not scared yet, consider the following exercises: 

        Implement 'try'-'catch'-'throw' using call_cc.  
        Implement 'while' using call_cc.
        Implement coroutines using call_cc.
        Implement 'goto' using call_cc.
        Implement 'setjmp/longjmp' using call_cc.

It's really useful, but it's also pretty nearly incomprehensible.  I
put in the longjmp exercise to frighten you, because I know you're a C
programmer.

My own position, in short, is that closures are a great idea, and we
should keep them.  (They're easy to implement, because you just grab a
pointer to the current environment, and in fact Perl 5 already has
them and this is exactly what it does do.)  Continuations, on the other
hand, are hard to implement (you might have to copy the entire stack
and save it somewhere) and hard to use, and the prior art (from Common
Lisp) says that they're not worth doing, because the major uses (such
as exception handling) are either infrequent or better accomplished
using ad-hoc mechanisms such as what Perl 5 already has.

I have no position on coroutines except that if they're really needed,
the CL folks can tell you how to implement them efficiently *without*
full continuations.