[PATCH English.pm] Removing the regex nastiness (finally)
Laptops and long commutes mean you actually get to do some of the
things on your TODO list. Being bored, I sat down and devised a way
to write English.pm such that it's no longer a global performance drag
on regexes.
$MATCH et al. are exported as trick tied variables. On first use,
each one replaces itself with an alias to the real punctuation
variable ($&, $`, $' or $+). The patch passes all of the
t/lib/english.t tests.
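To illustrate the self-replacing trick outside of English.pm, here is a minimal sketch, not the patch itself. The package name SelfReplacing, the variables $real and $ALIAS, and the fetch counter are all made up for illustration. It aliases an ordinary variable rather than $& so that the behaviour is easy to observe, and it assumes a reasonably modern perl.

```perl
use strict;
use warnings;

{
    package SelfReplacing;    # hypothetical name, not part of English.pm

    our $fetch_count = 0;     # counts FETCH calls, for demonstration only

    sub TIESCALAR {
        my ($class, $name, $target) = @_;
        # Remember which glob to overwrite and which scalar to alias it to.
        return bless { name => $name, target => $target }, $class;
    }

    # Overwrite the scalar slot of the named glob with the real variable,
    # which discards the tied scalar that used to live there.
    sub _replace {
        my ($self) = @_;
        no strict 'refs';
        *{ $self->{name} } = $self->{target};
        return $self->{target};
    }

    sub FETCH {
        my ($self) = @_;
        $fetch_count++;
        return ${ $self->_replace };
    }

    sub STORE {
        my ($self, $value) = @_;
        ${ $self->_replace } = $value;
        return;
    }
}

our $real = 'hello';
our $ALIAS;
tie $ALIAS, 'SelfReplacing', 'main::ALIAS', \$real;

my $first = $ALIAS;     # first read goes through FETCH and swaps in the alias
$real = 'world';
my $second = $ALIAS;    # by now an ordinary read of $real

print "$first $second (FETCH calls: $SelfReplacing::fetch_count)\n";
```

The point of the design is that the tie only has to survive one access. After that the name is a plain alias, so later reads and writes pay no tie overhead.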
I also added a $VERSION number (1.00, for lack of a better one), moved
the POD after __END__ for slightly faster loading, and altered the
docs to remove the "this module is evil" warnings.
Unfortunately, t/lib/english.t segfaults on my copy of 5.005_63 (works
fine on 5.005_03). I suspect I've uncovered a bug in tying. That'll
be my next post.
--- English.pm 2000/01/21 16:16:57
+++ English.pm 2000/01/21 16:48:35
@@ -1,42 +1,10 @@
package English;
+$VERSION = 1.00;
+
require Exporter;
@ISA = (Exporter);
-=head1 NAME
-
-English - use nice English (or awk) names for ugly punctuation variables
-
-=head1 SYNOPSIS
-
- use English;
- ...
- if ($ERRNO =~ /denied/) { ... }
-
-=head1 DESCRIPTION
-
-You should I<not> use this module in programs intended to be portable
-among Perl versions, programs that must perform regular expression
-matching operations efficiently, or libraries intended for use with
-such programs. In a sense, this module is deprecated. The reasons
-for this have to do with implementation details of the Perl
-interpreter which are too thorny to go into here. Perhaps someday
-they will be fixed to make "C<use English>" more practical.
-
-This module provides aliases for the built-in variables whose
-names no one seems to like to read. Variables with side-effects
-which get triggered just by accessing them (like $0) will still
-be affected.
-
-For those variables that have an B<awk> version, both long
-and short English alternatives are provided. For example,
-the C<$/> variable can be referred to either $RS or
-$INPUT_RECORD_SEPARATOR if you are using the English module.
-
-See L<perlvar> for a complete list of these.
-
-=cut
-
local $^W = 0;
# Grandfather $NAME import
@@ -106,10 +74,10 @@
# Matching.
- *MATCH = *& ;
- *PREMATCH = *` ;
- *POSTMATCH = *' ;
- *LAST_PAREN_MATCH = *+ ;
+ tie $MATCH, 'English::Evil', 'MATCH';
+ tie $PREMATCH, 'English::Evil', 'PREMATCH';
+ tie $POSTMATCH, 'English::Evil', 'POSTMATCH';
+ tie $LAST_PAREN_MATCH, 'English::Evil', 'LAST_PAREN_MATCH';
# Input.
@@ -184,4 +152,81 @@
# *OFMT = *# ;
# *MULTILINE_MATCHING = ** ;
+
+# Here we set up suicidal variables which self-destruct on their first
+# use. This protects against the use of English causing regex
+# inefficiencies.
+package English::Evil;
+
+%Evil_Vars = (
+ MATCH => '&',
+ PREMATCH => '`',
+ POSTMATCH => "'",
+ LAST_PAREN_MATCH => '+',
+);
+
+sub TIESCALAR {
+ my($proto) = shift;
+ my($self) = shift;
+ bless \$self;
+}
+
+sub FETCH {
+ my($self) = shift;
+ my($caller) = caller;
+
+ # Replace myself with the evil in question.
+ *{$caller.'::'.$$self} = *{$Evil_Vars{$$self}};
+
+ return ${$caller.'::'.$$self};
+}
+
+sub STORE {
+ my($self, $val) = @_;
+ my($caller) = caller;
+
+ # Replace myself with the evil in question.
+ *{$caller.'::'.$$self} = *{$Evil_Vars{$$self}};
+
+ ${$caller.'::'.$$self} = $val;
+
+ # XXX Is this correct behavior?
+ return ${$caller.'::'.$$self};
+}
+
+
1;
+__END__
+=pod
+
+=head1 NAME
+
+English - use nice English (or awk) names for ugly punctuation variables
+
+=head1 SYNOPSIS
+
+ use English;
+ ...
+ if ($ERRNO =~ /denied/) { ... }
+
+=head1 DESCRIPTION
+
+This module provides aliases for the built-in variables whose
+names no one seems to like to read. Variables with side-effects
+which get triggered just by accessing them (like $0) will still
+be affected.
+
+For those variables that have an B<awk> version, both long
+and short English alternatives are provided. For example,
+the C<$/> variable can be referred to either $RS or
+$INPUT_RECORD_SEPARATOR if you are using the English module.
+
+See L<perlvar> for a complete list of these.
+
+=head1 CAVEATS
+
+You should I<not> use this module in programs intended to be portable
+among Perl versions. The problem of English causing regex inefficiencies
+has been solved.
+
+=cut
--
Michael G Schwern schwern@pobox.com
http://www.pobox.com/~schwern
/(?:(?:(1)[.-]?)?\(?(\d{3})\)?[.-]?)?(\d{3})[.-]?(\d{4})(x\d+)?/i