/home/mjd/bin/mailpager 4628
Return-Path: perl5-porters-return-86993-mjd-p5p2=plover.com@perl.org
Delivery-Date: Wed Jan 07 23:36:54 2004
Return-Path: <perl5-porters-return-86993-mjd-p5p2=plover.com@perl.org>
Delivered-To: mjd-p5p2@plover.com
Mailing-List: contact perl5-porters-help@perl.org; run by ezmlm
Precedence: bulk
list-help: <mailto:perl5-porters-help@perl.org>
list-unsubscribe: <mailto:perl5-porters-unsubscribe@perl.org>
list-post: <mailto:perl5-porters@perl.org>
X-List-Archive: <http://nntp.perl.org/group/perl.perl5.porters/86993>
Delivered-To: mailing list perl5-porters@perl.org
Delivered-To: perl5-porters@perl.org
Date: Thu, 8 Jan 2004 00:43:15 +0100
From: Rafael Garcia-Suarez <rgarciasuarez@free.fr>
To: perl5-porters@perl.org
Subject: [PATCH] my $_
Message-Id: <20040108004315.104f51ff.rgarciasuarez@free.fr>
X-Mailer: Sylpheed version 0.9.4claws (GTK+ 1.2.10; i686-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Spam-Check-By: la.mx.develooper.com
X-Spam-Status: No, hits=1.1 required=7.0 tests=CARRIAGE_RETURNS,SPAM_PHRASE_00_01 version=2.44
X-SMTPD: qpsmtpd/0.26, http://develooper.com/code/qpsmtpd/

The following patch proposes an extension to Perl 5's syntax: the
ability to declare $_ (the well-known default scalar) as a lexical
variable.

In short, all operations that default to $_ will default to the lexical
version of $_ if there is one in scope; else they will use (as usual)
the global $::_. For example, the following snippet will print "1221":

    sub f { print };
    $_ = 1;
    { my $_ = 2; f; /(.)/; print; print $1; }
    print;

This is backwards-compatible, since the new behaviour is only triggered
by a C<my $_> declaration (which is currently a compile-time error).

In a block where $_ is lexicalized, you can "restore" access to the global
$_ simply by declaring C<our $_>.

The benefits are twofold, from a language point of view:
1. You can now use a lexical $_ without any action at a distance and
   without losing any conciseness.
2. Using C<my $_> gives you a pristine, undefined, unmagical value in $_.
   This is not the case with C<local $_>, which retains any magic that
   $_ could have been bound to previously (causing obscure bugs).

There is one unresolved issue:
Inside a map or a grep the global $_ is still used as an iterator (this
is not modified by this patch). In practice, this means that map and grep
are unusable when a lexical $_ is in scope, because the code in the block
reads the lexical $_ while the iterator is stored in the global one.
There are two ways to fix this: (a) use the lexical $_ (if available) as
an iterator, or (b) compile the code inside the map/grep block to always
use $::_. I haven't made up my mind about the best solution yet.
Opinions welcome.

I also need to silence the warning produced by C<our $_; my $_;>
("my" variable $_ masks earlier declaration in same scope) and similar
ones. They are pointless.

Please find my patch and a minimal test file below.
This is implemented by modifying the optree at compile-time, so there
is no run-time impact for code that doesn't use this feature. (Code that
does use it is likely to run faster, because access to pads is faster than
access to globals.)


Index: pp.c
===================================================================
--- pp.c	(revision 3049)
+++ pp.c	(working copy)
@@ -680,6 +680,8 @@ PP(pp_trans)
 
     if (PL_op->op_flags & OPf_STACKED)
 	sv = POPs;
+    else if (PL_op->op_private & OPpTARGET_MY)
+	sv = GETTARGET;
     else {
 	sv = DEFSV;
 	EXTEND(SP,1);
Index: toke.c
===================================================================
--- toke.c	(revision 3047)
+++ toke.c	(working copy)
@@ -6518,7 +6518,8 @@ S_scan_trans(pTHX_ char *start)
 
     New(803, tbl, complement&&!del?258:256, short);
     o = newPVOP(OP_TRANS, 0, (char*)tbl);
-    o->op_private = del|squash|complement|
+    o->op_private &= ~OPpTRANS_ALL;
+    o->op_private |= del|squash|complement|
       (DO_UTF8(PL_lex_stuff)? OPpTRANS_FROM_UTF : 0)|
       (DO_UTF8(PL_lex_repl) ? OPpTRANS_TO_UTF   : 0);
 
Index: opcode.pl
===================================================================
--- opcode.pl	(revision 3047)
+++ opcode.pl	(working copy)
@@ -493,9 +493,9 @@ regcreset	regexp internal reset	ck_fun		
 regcomp		regexp compilation	ck_null		s|	S
 match		pattern match (m//)	ck_match	d/
 qr		pattern quote (qr//)	ck_match	s/
-subst		substitution (s///)	ck_null		dis/	S
+subst		substitution (s///)	ck_match	dis/	S
 substcont	substitution iterator	ck_null		dis|	
-trans		transliteration (tr///)	ck_null		is"	S
+trans		transliteration (tr///)	ck_match	is"	S
 
 # Lvalue operators.
 # sassign is special-cased for op class
Index: op.c
===================================================================
--- op.c	(revision 3047)
+++ op.c	(working copy)
@@ -155,11 +155,11 @@ Perl_allocmy(pTHX_ char *name)
 {
     PADOFFSET off;
 
-    /* complain about "my $_" etc etc */
+    /* complain about "my $<special_var>" etc etc */
     if (!(PL_in_my == KEY_our ||
 	  isALPHA(name[1]) ||
 	  (USE_UTF8_IN_NAMES && UTF8_IS_START(name[1])) ||
-	  (name[1] == '_' && (int)strlen(name) > 2)))
+	  (name[1] == '_' && (*name == '$' || (int)strlen(name) > 2))))
     {
 	if (!isPRINT(name[1]) || strchr("\t\n\r\f", name[1])) {
 	    /* 1999-02-27 mjd@plover.com */
@@ -1673,6 +1673,7 @@ OP *
 Perl_bind_match(pTHX_ I32 type, OP *left, OP *right)
 {
     OP *o;
+    bool ismatchop = 0;
 
     if (ckWARN(WARN_MISC) &&
       (left->op_type == OP_RV2AV ||
@@ -1697,10 +1698,14 @@ Perl_bind_match(pTHX_ I32 type, OP *left
 	no_bareword_allowed(right);
     }
 
-    if (!(right->op_flags & OPf_STACKED) &&
-       (right->op_type == OP_MATCH ||
-	right->op_type == OP_SUBST ||
-	right->op_type == OP_TRANS)) {
+    ismatchop = right->op_type == OP_MATCH ||
+		right->op_type == OP_SUBST ||
+		right->op_type == OP_TRANS;
+    if (ismatchop && right->op_private & OPpTARGET_MY) {
+	right->op_targ = 0;
+	right->op_private &= ~OPpTARGET_MY;
+    }
+    if (!(right->op_flags & OPf_STACKED) && ismatchop) {
 	right->op_flags |= OPf_STACKED;
 	if (right->op_type != OP_MATCH &&
             ! (right->op_type == OP_TRANS &&
@@ -1801,7 +1806,15 @@ Perl_block_end(pTHX_ I32 floor, OP *seq)
 STATIC OP *
 S_newDEFSVOP(pTHX)
 {
-    return newSVREF(newGVOP(OP_GV, 0, PL_defgv));
+    I32 offset = pad_findmy("$_");
+    if (offset == NOT_IN_PAD || PAD_COMPNAME_FLAGS(offset) & SVpad_OUR) {
+	return newSVREF(newGVOP(OP_GV, 0, PL_defgv));
+    }
+    else {
+	OP *o = newOP(OP_PADSV, 0);
+	o->op_targ = offset;
+	return o;
+    }
 }
 
 void
@@ -5533,7 +5546,15 @@ Perl_ck_sassign(pTHX_ OP *o)
 OP *
 Perl_ck_match(pTHX_ OP *o)
 {
-    o->op_private |= OPpRUNTIME;
+    if (o->op_type != OP_QR) {
+	I32 offset = pad_findmy("$_");
+	if (offset != NOT_IN_PAD && !(PAD_COMPNAME_FLAGS(offset) & SVpad_OUR)) {
+	    o->op_targ = offset;
+	    o->op_private |= OPpTARGET_MY;
+	}
+    }
+    if (o->op_type == OP_MATCH || o->op_type == OP_QR)
+	o->op_private |= OPpRUNTIME;
     return o;
 }
 
Index: op.h
===================================================================
--- op.h	(revision 3047)
+++ op.h	(working copy)
@@ -135,9 +135,11 @@ Deprecated.  Use C<GIMME_V> instead.
 #define OPpTRANS_TO_UTF		2
 #define OPpTRANS_IDENTICAL	4	/* right side is same as left */
 #define OPpTRANS_SQUASH		8
-#define OPpTRANS_DELETE		16
+    /* 16 is used for OPpTARGET_MY */
 #define OPpTRANS_COMPLEMENT	32
 #define OPpTRANS_GROWS		64
+#define OPpTRANS_DELETE		128
+#define OPpTRANS_ALL	(OPpTRANS_FROM_UTF|OPpTRANS_TO_UTF|OPpTRANS_IDENTICAL|OPpTRANS_SQUASH|OPpTRANS_COMPLEMENT|OPpTRANS_GROWS|OPpTRANS_DELETE)
 
 /* Private for OP_REPEAT */
 #define OPpREPEAT_DOLIST	64	/* List replication. */
Index: pp_hot.c
===================================================================
--- pp_hot.c	(revision 3047)
+++ pp_hot.c	(working copy)
@@ -1195,6 +1195,8 @@ PP(pp_match)
 
     if (PL_op->op_flags & OPf_STACKED)
 	TARG = POPs;
+    else if (PL_op->op_private & OPpTARGET_MY)
+	GETTARGET;
     else {
 	TARG = DEFSV;
 	EXTEND(SP,1);
@@ -1958,6 +1960,8 @@ PP(pp_subst)
     dstr = (pm->op_pmflags & PMf_CONST) ? POPs : Nullsv;
     if (PL_op->op_flags & OPf_STACKED)
 	TARG = POPs;
+    else if (PL_op->op_private & OPpTARGET_MY)
+	GETTARGET;
     else {
 	TARG = DEFSV;
 	EXTEND(SP,1);
End of Patch.


#!./perl
# tests the C<my $_> feature

BEGIN {
    chdir 't' if -d 't';
    @INC = '../lib';
}

print "1..16\n";
my $test = 0;
sub ok ($$) {
    my ($ok, $name) = @_;
    ++$test;
    print $ok ? "ok $test - $name\n" : "not ok $test - $name\n";
}

$_ = 'global';
ok( $_ eq 'global', '$_ initial value' );
s/oba/abo/;
ok( $_ eq 'glabol', 's/// on global $_' );
{
    my $_ = 'local';
    ok( $_ eq 'local', 'local $_ initial value' );
    s/oca/aco/;
    ok( $_ eq 'lacol', 's/// on local $_' );
    /(..)/;
    ok( $1 eq 'la', '// on local $_' );
    ok( tr/c/d/ == 1, 'tr/// on local $_ counts correctly' );
    ok( $_ eq 'ladol', 'tr/// on local $_' );
    {
	my $_ = 'nested';
	ok( $_ eq 'nested', 'local $_ nested' );
	chop;
	ok( $_ eq 'neste', 'chop on local $_' );
    }
    {
	our $_;
	ok( $_ eq 'glabol', 'gains access to our global $_' );
    }
    ok( $_ eq 'ladol', 'local $_ restored' );
}
ok( $_ eq 'glabol', 'global $_ restored' );
s/abo/oba/;
ok( $_ eq 'global', 's/// on global $_ again' );
{
    my $_ = 11;
    our $_ = 22;
    ok( $_ eq 22, 'our $_ is seen explicitly' );
    chop;
    ok( $_ eq 2, 'default chop chops our $_' );
    /(.)/;
    ok( $1 eq 2, 'default match sees our $_' );
}
/home/mjd/bin/mailpager 4671
Return-Path: perl5-porters-return-87036-mjd-p5p2=plover.com@perl.org
Delivery-Date: Thu Jan 08 17:23:17 2004
Return-Path: <perl5-porters-return-87036-mjd-p5p2=plover.com@perl.org>
X-List-Archive: <http://nntp.perl.org/group/perl.perl5.porters/87036>
Date: Thu, 8 Jan 2004 17:21:00 +0000
From: Nicholas Clark <nick@ccl4.org>
To: perl5-porters@perl.org
Subject: Re: [PATCH] my $_
Message-ID: <20040108172100.GY94211@plum.flirble.org>
Mail-Followup-To: perl5-porters@perl.org
References: <20040108004315.104f51ff.rgarciasuarez@free.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20040108004315.104f51ff.rgarciasuarez@free.fr>
User-Agent: Mutt/1.3.25i
X-Organisation: Tetrachloromethane
Sender: Nicholas Clark <nick@flirble.org>

On Thu, Jan 08, 2004 at 12:43:15AM +0100, Rafael Garcia-Suarez wrote:
> Please find my patch and a minimal test file below.
> This is implemented by modifying the optree at compile-time, so there
> is no run-time impact for code that doesn't use this feature. (Code that
> does use it is likely to run faster, because access to pads is faster than
> access to globals.)

If access to pads really is faster, would it be possible to make an entry
in every pad that points to the (real global) $_, so that all access to $_
were via the pad?

This is a very arm wavy question, as I've not looked at how the relevant
ops work. I'm just assuming that each pad could have a pointer to the real
$_, and up the refcount by one.

As I understand it local should still work, because it modifies the value
of $_, rather than replacing it. However, would there be a problem with
foreach [and likewise] that IIRC create a new $_ ?

Nicholas Clark
/home/mjd/bin/mailpager 4673
Return-Path: perl5-porters-return-87038-mjd-p5p2=plover.com@perl.org
Delivery-Date: Thu Jan 08 17:34:49 2004
Return-Path: <perl5-porters-return-87038-mjd-p5p2=plover.com@perl.org>
X-List-Archive: <http://nntp.perl.org/group/perl.perl5.porters/87038>
Date: Thu, 8 Jan 2004 18:27:48 +0100
From: Rafael Garcia-Suarez <rgarciasuarez@free.fr>
To: perl5-porters@perl.org
Subject: Re: [PATCH] my $_
Message-Id: <20040108182748.7fca90b1.rgarciasuarez@free.fr>
In-Reply-To: <20040108172100.GY94211@plum.flirble.org>
References: <20040108004315.104f51ff.rgarciasuarez@free.fr>	<20040108172100.GY94211@plum.flirble.org>
X-Mailer: Sylpheed version 0.8.10claws (GTK+ 1.2.6; i686-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Nicholas Clark wrote:
> If access to pads really is faster, would it be possible to make an entry
> in every pad that points to the (real global) $_, so that all access to $_
> were via the pad?

I think that it's possible to have a global or top-level our-pad that references
$_: the equivalent of saying "our $_" at the top of each file. And then use this.

(Hmm, I have to check whether saying "our $_" with my patch always correctly
references $main::_.)

> As I understand it local should still work, because it modifies the value
> of $_, rather than replacing it. However, would there be a problem with
> foreach [and likewise] that IIRC create a new $_ ?

In other words, should the $_ created by for/map/grep always be our $_,
or should it reuse the current lexical $_ if there's one in scope?
Open issue, as I mentioned.
/home/mjd/bin/mailpager 4674
Return-Path: perl5-porters-return-87039-mjd-p5p2=plover.com@perl.org
Delivery-Date: Thu Jan 08 18:09:21 2004
Return-Path: <perl5-porters-return-87039-mjd-p5p2=plover.com@perl.org>
X-List-Archive: <http://nntp.perl.org/group/perl.perl5.porters/87039>
Date: Thu, 08 Jan 2004 19:06:33 +0100
From: "H.Merijn Brand" <h.m.brand@hccnet.nl>
To: Rafael Garcia-Suarez <rgarciasuarez@free.fr>
Subject: Re: [PATCH] my $_
Cc: Perl 5 Porters <perl5-porters@perl.org>
In-Reply-To: <20040108182748.7fca90b1.rgarciasuarez@free.fr>
References: <20040108172100.GY94211@plum.flirble.org> <20040108182748.7fca90b1.rgarciasuarez@free.fr>
Message-Id: <20040108190620.77FA.H.M.BRAND@hccnet.nl>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.07.04 [en]

On Thu 08 Jan 2004 18:27, Rafael Garcia-Suarez <rgarciasuarez@free.fr> wrote:
> Nicholas Clark wrote:
> > If access to pads really is faster, would it be possible to make an entry
> > in every pad that points to the (real global) $_, so that all access to $_
> > were via the pad?
> 
> I think that it's possible to have a global or top-level our-pad that references
> $_: the equivalent of saying "our $_" at the top of each file. And then use this.
> 
> (Hmm, I have to check whether saying "our $_" with my patch always correctly
> references $main::_.)
> 
> > As I understand it local should still work, because it modifies the value
> > of $_, rather than replacing it. However, would there be a problem with
> > foreach [and likewise] that IIRC create a new $_ ?
> 
> In other words, should the $_ created by for/map/grep always be our $_,
> or should it reuse the current lexical $_ if there's one in scope?
> Open issue, as I mentioned.

I opt for the latter

-- 
H.Merijn Brand        Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using perl-5.6.1, 5.8.0, & 5.9.x, and 806 on  HP-UX 10.20 & 11.00, 11i,
   AIX 4.3, SuSE 8.2, and Win2k.           http://www.cmve.net/~merijn/
http://archives.develooper.com/daily-build@perl.org/   perl-qa@perl.org
send smoke reports to: smokers-reports@perl.org, QA: http://qa.perl.org

