© Copyright 1999 M-J Dominus.
Hi. Today we're going to see a line of Perl code that looks like it should work, but doesn't. Here it is:
print "From: guitar@plover.com\n";
What could possibly be wrong with this?
Of course, since something is wrong, Perl will tell you:
In string, @plover now must be written as \@plover at ...
Or, in older versions of Perl (5.003 and earlier)
Literal @plover now requires backslash at ...
This can be a frustrating error message. It's so clear, and the clarity is annoying: Since Perl knows that the backslash is missing, why doesn't it just put it in for you? And why is it required anyway? And how come Perl doesn't deliver that message consistently? Sometimes you get it, sometimes you don't.
The basic problem here is easy to understand: Perl is trying to decide whether @plover should be taken literally or whether it should try to insert the value of the array @plover. But on top of this simple problem is piled layer upon layer of historical complication. To unravel the history, let's take a trip in the Wayback Machine, back to the very Dawn of Perl Itself... Sherman, set the dials for 1987!
In Perl 1 (and later, Perl 2) the situation was simple: Arrays didn't interpolate into double-quoted strings. The ambiguity that gives Perl problems in 1999 didn't exist. When Perl saw
print "From: guitar@plover.com\n";
it knew you meant to print out @plover literally, and not to look for an array @plover. If you really wanted to print out the elements of some array, say @array, you'd have had to do something like this:
print "@array contains: [", (join " ", @array), "].\n";
Perl 1 was surprisingly limited by modern standards: $a[1] wouldn't interpolate either; if @a contained (4, 5, 6) and $a contained ouch, then
print "$a[1]"
would print ouch[1].
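For contrast, here is a minimal sketch of how the same data behaves in a present-day Perl 5, where both array elements and whole arrays interpolate. The variable names follow the example above; the names $element, $whole, $joined, and $old are mine, chosen only for illustration.

```perl
use strict;
use warnings;

# Same data as the example above.
my @a = (4, 5, 6);
my $a = 'ouch';

my $element = "$a[1]";        # array element interpolates: second element of @a
my $whole   = "@a";           # whole array, elements joined by a space by default
my $joined  = join " ", @a;   # the explicit join that Perl 1 required
my $old     = $a . "[1]";     # what Perl 1 printed: $a followed by a literal [1]

print "$element\n";   # 5
print "$whole\n";     # 4 5 6
print "$joined\n";    # 4 5 6
print "$old\n";       # ouch[1]
```

Concatenation is used for the last case simply because it leaves no doubt about where the scalar ends and the literal text begins.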
Now let's jump forward to 1989, and Perl 3.
Perl 1 was recognizably Perl; you couldn't mistake it for anything else. But Perl 3 is the first version that really feels like Perl. Perl 3 introduced packages, sockets, tied hashes, and a lot of other stuff, including the ability to interpolate arrays and array values into double-quoted strings.
For the first time, Perl had to deal with the possible ambiguity of
print "From: guitar@plover.com\n";
and decide whether you wanted to interpolate the @plover array or not. The example here isn't so good any more, because it's obvious to a human that the @plover should not be interpolated. But suppose it looked like this instead:
print "Three kinds of plovers are [@plover]\n";
To Perl, this looks just like the first example, so we can't expect Perl to read our minds and know when we want the arrays interpolated. From here on we'll stick with the first example, but please remember that the decision about whether or not to interpolate can go either way.
With the addition of the new array interpolation feature to Perl 3, of course there was a compatibility problem: There were now pre-existing programs written for Perl 1 and Perl 2 that used @ signs in strings, never dreaming that Perl would be trying to decide whether or not to do array interpolation. Perl couldn't simply interpolate every possible array, because that would have changed the meaning of a Perl 2 program that contained the `print' line above. Instead Perl 3 needed to use a rule to decide when to interpolate and when not, and the rule had to be convenient for Perl 3 programmers while still respecting code written for Perl 1 and Perl 2.
The rule that was chosen was this: By the time the string in
print "From: guitar@plover.com\n";
is evaluated, Perl knows whether or not you actually used the array @plover in your program. If you did, Perl will interpolate it into the string. But otherwise, if you never used @plover anywhere, Perl will assume that the @ should be taken literally.
What if you wanted to have an array named @plover and still use @plover in a string without interpolation? The solution is familiar: Put a \ before the @:
print "From: guitar\@plover.com\n";
This never interpolates, whether or not you have an @plover array.
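The effect of the backslash is easy to check in a current Perl. The following is a small sketch, not anything from the original programs: the contents of @plover are made up so that interpolation has something visible to insert, and the single-quoted variant is an extra I've added because single quotes also suppress interpolation.

```perl
use strict;
use warnings;

# An illustrative array, so that interpolation has something to insert.
my @plover = ('Semipalmated');

my $interpolated = "From: guitar@plover.com";    # @plover is interpolated
my $escaped      = "From: guitar\@plover.com";   # backslash keeps the @ literal
my $single       = 'From: guitar@plover.com';    # single quotes never interpolate

print "$interpolated\n";   # From: guitarSemipalmated.com
print "$escaped\n";        # From: guitar@plover.com
print "$single\n";         # From: guitar@plover.com
```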
These rules persisted for a long time, through Perl 3 and Perl 4, until Perl 5 came out in 1994.
In Perl 5 there was a subtle change in the way strings worked. As you probably know, Perl is a `demicompiler', which means that it runs in two phases. The first phase is a compilation phase, in which it reads and parses your program and translates it into internal data structures that explain how to execute it. When the program is completely compiled, Perl enters the second phase, the `run' phase, in which it executes your program. Because of this, it makes sense to distinguish between the things that Perl does at compile time and the things it does at run time.
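One way to watch the two phases in a current Perl 5 is with a BEGIN block, which runs as soon as the compiler has finished parsing it, before any ordinary statement executes. This is only a sketch to make the distinction concrete; the array name @order is hypothetical.

```perl
use strict;
use warnings;

our @order;   # hypothetical name; records when each piece of code runs

push @order, 'run time';                 # executed during the run phase

BEGIN { push @order, 'compile time' }    # executed as soon as it is compiled

print join(', ', @order), "\n";          # compile time, run time
```

Even though the BEGIN block appears later in the file than the first push, its entry lands in @order first, because the whole file is compiled before the run phase starts.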
Perl 3 figured out how to interpolate strings at run time: It would get to the part of your program that constructed the string, and then it would look through the string for things to interpolate. This means that in
for (1 .. 1_000_000) { print "guitar@plover.com\n" }
Perl 3 would look through the string for things that look like arrays, see @plover, and decide (not) to interpolate, over and over, one million times. Clearly this is horribly inefficient because the string never changes; it's better to figure out how to parse the string all at once, at compile time, and figure out how to do the interpolation then, and not to worry about it again.
But there's a problem with that. Perl 3 would decide whether or not to interpolate @plover based on whether or not you had used the @plover array somewhere else in the program. It could do this because it was making the decision at run time, after it had already seen and compiled the entire program. But if you want to make the decision earlier, at compile time, you run into difficulties: If the Perl 5 compiler sees
print "From: guitar@plover.com\n";
on line 3, how can it know in advance whether or not the array @plover will be mentioned on line 997 when it hasn't read that far yet? Obviously it can't. You might think it could solve the problem by reading ahead, but it can't do that either, because @plover might not be mentioned in the file at all, but rather in some other file that is loaded in by `do' or `require' much later on, when the program is actually running. So the old rule is unworkable; the information about whether or not @plover is used somewhere simply isn't available at compile time.
Since the rule had to change anyway, the authors of Perl decided to make it simpler: @plover would always be interpolated in a double-quoted string, unless there was a backslash before it.
Unfortunately, this simple rule was a substantial change from Perl 3 and Perl 4, and a complete change from Perl 1 and Perl 2. It couldn't be implemented right away, because doing so would break thousands of old programs. For example, the line
print "From: guitar@plover.com\n";
which would print everything literally in every version of Perl up through Perl 4, would instead print
From: guitar.com
with the new rule, because the nonexistent (and therefore empty) array @plover would be interpolated. Any program that depended on the old behavior would change its output, and worse, the change would be silent, which means that it would be undiagnosed; you wouldn't know anything was wrong until you suddenly started getting mysteriously broken output from a program that used to work.
Obviously, silently breaking thousands of old programs that had worked for years and years was not a viable option. But they had to be broken, because the old rule just wouldn't work any more. So the only option was to break them audibly instead of silently.
Perl 5 will always interpolate arrays in double-quoted strings, unless they're preceded by a backslash. But it has to worry that perhaps it's compiling an old program from Perl 4 or earlier that didn't want the interpolation to occur. When it sees something like
print "From: guitar@plover.com\n";
it has to make a decision about whether interpolation is safe or not. If it has seen you use @plover already in the part of the program that it has already compiled, it can safely assume that @plover should be interpolated, because that's what Perl 3 would have done; Perl 5 and Perl 3 behaviors are the same in that case. But if it hasn't seen you mention @plover earlier, it can't be sure, because Perl 3 would have chosen to interpolate or not based on whether @plover appeared later, and it doesn't know yet what Perl 3 would have seen later on. So rather than finish the compilation and leave you with a program that might produce the wrong output, it gives up and says
In string, @plover now must be written as \@plover at ...
and refuses to run your program until you've cleared up the ambiguity. If you don't want any interpolation, you have to insert the backslash. If you do want interpolation, you have to find some way to warn the compiler that you're planning to use the array @plover before you actually do use it. All you have to do is mention it somehow. Saying this:
$plover[17] = 'Semipalmated';
will do, and so will this:
@plover = ();
If you don't have a real reason to mention the array in advance, you can `declare' it at the top of your program by saying this:
use vars '@plover';
Any of these will warn the compiler that @plover is a real array, so it knows that interpolating it into strings is safe, because any version of Perl back to Perl 3 would have done the same thing.
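Here is a short sketch of the `use vars' approach in a complete program that runs under a current Perl 5. Only 'Semipalmated' comes from the article; the other two list entries and the variable names $line and $commas are mine, added for illustration. The second string shows that the separator placed between interpolated elements is taken from the special variable $", which defaults to a single space.

```perl
use strict;
use warnings;
use vars '@plover';   # pre-declare the package array before any string mentions it

@plover = ('Semipalmated', 'Piping', 'Snowy');   # illustrative contents

my $line = "Three kinds of plovers are [@plover]\n";
print $line;      # Three kinds of plovers are [Semipalmated Piping Snowy]

my $commas;
{
    local $" = ', ';   # change the element separator for this block only
    $commas = "Three kinds of plovers are [@plover]\n";
}
print $commas;    # Three kinds of plovers are [Semipalmated, Piping, Snowy]
```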
Someday, when all Perl 4 programs are buried safely in the La Brea tar pits, this error message might be removed, and Perl will just assume that all strings that contain @ signs will interpolate arrays, and that everyone who wants a literal @ sign in a string will always precede it with a backslash. But for now, the world is still full of good Perl 4 code that works and works well, and it can't be silently ruined.
Year 2000 Update: This article was written in May, 1999. In May, 2000 I submitted a patch to Perl to give it the behavior described in the previous paragraph, so Perl 5.6.1 and later will always interpolate arrays in double-quoted strings, and if you write this:
print "From: guitar@plover.com\n";
you will get
From: guitar.com
with no fatal error message. However, if you have optional warnings enabled, Perl will issue this warning message:
Array @plover will be interpolated in string
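If you'd like to see the post-5.6.1 behavior for yourself, the following sketch is one way to do it. It makes a few assumptions of its own: it compiles the string with a string eval so that a warning handler can be installed first, it switches off `strict vars' inside the eval so the undeclared @plover is accepted, and the names @warnings and $s are hypothetical. The wording of the warning may differ between Perl releases, so the sketch just prints whatever it catches.

```perl
use strict;
use warnings;

my @warnings;   # hypothetical name; collects any warnings Perl issues
$SIG{__WARN__} = sub { push @warnings, @_ };

# Compile the string at run time, after the handler above is in place.
# 'no strict "vars"' lets the undeclared package array @plover through.
my $s = eval q{ no strict 'vars'; "From: guitar@plover.com\n" };

print $s;                            # From: guitar.com
print "warning: $_" for @warnings;   # whatever Perl said about @plover
```

Under `use strict' without the `no strict' line, a current Perl would reject the undeclared @plover outright, which is a separate check from the one this article is about.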
Now let's return to the questions I posed at the beginning of the article.
``Since Perl knows that the backslash is missing, why doesn't it just put it in for you?''
It doesn't really know that it's missing; it's not sure if you wanted the array to be interpolated or not.
``And why is it required anyway?''
Because starting with Perl 5, @array in a double-quoted string always interpolates the array. To prevent this, you must use the backslash.
``And how come Perl doesn't deliver that message consistently?''
If you happened to mention the array before you mentioned the string, Perl can be sure that nothing will break if it interpolates the array into the string, so it goes ahead and does that. But if not, it can't be sure, and it has to take the safe route and warn you that something might be wrong.
Finally:
``What can I do to fix it?''
If you wanted the array to be interpolated, then declare it at the top of your program with `use vars'.
If not, put a backslash before the @ sign.