simple word wrap
I wanted a simple word wrap in a single regexp, something I have tried
before and found frustratingly difficult.
This time I eventually achieved the goal:
    s{
        \G                  # begin where previous match left off
        ([\d\D]*?)          # consume short lines
        (?:(?<=^) | \G)     # and pick up at the beginning of a line,
                            # or just after the previous replaced space
        (.{1,79})           # match as many characters on this line as fit
        \ +                 # followed by spaces
        (?=(\S+))           # followed by (unconsumed) nonspace
    }{ (length($2) + length($3) >= 79) ? "$1$2\n" : "$1$2 " }mexg;
In the process, I went a long way down the garden path (reaching
(?p{ my $p = 80 - length $2; "(?=\\S{$p})" }) before turning back) until
I realised that while '((?=\S+))' captures nothing, since the lookahead
is zero-width, '(?=(\S+))' does capture the following word without
consuming it. This might be worth a mention in the docs or amongst the
examples.
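To illustrate the difference between the two forms, here is a minimal sketch. The sample string and the helper name capture_and_end are made up for illustration. A group wrapped around a lookahead captures the empty string, while a group inside the lookahead captures the text that was looked at; in both cases the match itself ends before that text.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what $1 holds and the offset at which
# the match ended when $pattern is applied to a fixed sample string.
sub capture_and_end {
    my ($pattern) = @_;
    my $sample = "foo   bar";
    return unless $sample =~ $pattern;
    return ($1, $+[0]);
}

# Group around the lookahead: the group itself is zero-width.
my ($outside, $end_outside) = capture_and_end(qr/foo\ +((?=\S+))/);

# Group inside the lookahead: the word is captured but not consumed.
my ($inside, $end_inside) = capture_and_end(qr/foo\ +(?=(\S+))/);

print "group around lookahead: '$outside', match ends at $end_outside\n";
print "group inside lookahead: '$inside', match ends at $end_inside\n";
# prints:
#   group around lookahead: '', match ends at 6
#   group inside lookahead: 'bar', match ends at 6
```

Both matches end at offset 6, just after the spaces, so "bar" is still available to the next match; only the second form makes it visible as $1.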
I also wished for a couple of other extensions in the process:
1) pos() already set and modifiable at the point of replacement; I got
weird results (not to my great surprise) when I tried this:
    s{
        \G                  # begin where previous match left off
        ([\d\D]*?)          # consume short lines
        (?:(?<=^) | \G)     # and pick up at the beginning of a line,
                            # or just after the previous replaced space
        (.{1,79})           # match as many characters on this line as fit
        \ +                 # followed by spaces
        (\S+)               # followed by nonspace, to be unconsumed
    }{
        warn "pos: ", pos(), "\n";
        pos() -= length($3);
        $1 . $2 . ((length($2) + length($3) > 79) ? "\n" : "")
    }mxge;
2) a facility to step through s///g one at a time, as with m//g in a
scalar context: this would have let me do something like
    pos() -= length($3) while s///g;
3) a facility to examine the modified rather than the original string
in later matches of a s///g: this might have allowed me to make the
'(?:(?<=^) | \G)' clause clearer.
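Wishes 2 and 3 can be approximated with existing features, and the sketch below shows one possible way. It is not the code from this post: the sub names join_words and wrap_by_rescanning, the sample patterns and the width parameter are all assumptions chosen for illustration, and it assumes a Perl recent enough to provide @- and @+. The first sub steps through matches with m//g in scalar context, splices each replacement in with substr(), and then sets pos() by hand. The second drops /g and loops, so that every pass sees the modified string and a plain ^ is enough to find the start of each new line.

```perl
use strict;
use warnings;

# Wish 2, emulated: one substitution per loop iteration, with pos()
# rewound so the second word can start the next match.
sub join_words {
    my ($text) = @_;
    while ($text =~ /(\w+)\ +(\w+)/g) {
        # Save offsets and captures before the string is modified.
        my ($start, $end, $left, $right) = ($-[0], $+[0], $1, $2);
        my $replacement = "$left-$right";
        substr($text, $start, $end - $start) = $replacement;
        # Resume just before the second word, leaving it "unconsumed".
        pos($text) = $start + length($replacement) - length($right);
    }
    return $text;
}

# Wish 3, emulated: each pass breaks one over-long line at the last
# run of spaces that fits, then matching restarts on the new string.
# Rescanning from the start is simple but slow on large inputs.
sub wrap_by_rescanning {
    my ($text, $width) = @_;
    my $too_long = $width + 1;
    1 while $text =~ s/^(?=.{$too_long,})(.{1,$width})\ +(?=\S)/$1\n/m;
    return $text;
}

# For comparison, plain s///g consumes the second word of each pair.
(my $plain = "a b c") =~ s/(\w+)\ +(\w+)/$1-$2/g;

print "$plain\n";                  # prints "a-b c"
print join_words("a b c"), "\n";   # prints "a-b-c"
print wrap_by_rescanning("aaaa bbbb cccc dddd", 9), "\n";
# prints "aaaa bbbb" and "cccc dddd" on separate lines
```

A line with no usable run of spaces (for example a single word longer than the width) is left as it is, which also guarantees that the loop in wrap_by_rescanning terminates.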
Happy new year and all that,
Hugo